PolMine / RcppCWB

'Rcpp' Bindings for the 'Corpus Workbench' (CWB)

gcc11.2/mingw stringop-overflow warning #45

Closed: ablaette closed this issue 2 years ago

ablaette commented 2 years ago
cdaccess.c: In function 'cl_read_stream':
cdaccess.c:982:5: warning: 'memcpy' specified bound between 18446744065119617024 and 18446744073709551612 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
  982 |     memcpy(buffer, ps->base + ps->nr_items, items_to_read * sizeof(int));
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
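The two bounds in the warning can be reproduced by hand. One plausible reading, offered here as an assumption rather than a statement about GCC internals, is that the compiler considers the case where items_to_read is negative: converting a negative int to a 64-bit unsigned size and multiplying by sizeof(int) wraps around to a huge value. The sketch below uses a hypothetical helper, byte_count(), and assumes a 4-byte int; uint64_t stands in for the 64-bit size_t of the mingw target.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not part of CWB): the byte count that
 * `items_to_read * sizeof(int)` yields when size_t is 64 bits wide.
 * Assumes sizeof(int) == 4, as on the platform in the report. */
static uint64_t byte_count(int items_to_read)
{
  assert(sizeof(int) == 4);
  /* A negative int converts to uint64_t modulo 2^64; the product wraps too. */
  return (uint64_t)items_to_read * (uint64_t)sizeof(int);
}
```

With these assumptions, byte_count(INT_MIN) is 2^64 - 2^33 = 18446744065119617024 and byte_count(-1) is 2^64 - 4 = 18446744073709551612, which match the lower and upper bounds in the warning. Both exceed the maximum object size 2^63 - 1 = 9223372036854775807.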
ablaette commented 2 years ago

The solution is to check the size (in bytes) of the block to be copied before calling memcpy(); see this slight modification of cl_read_stream():

int
cl_read_stream(PositionStream ps, int *buffer, int buffer_size)
{
  int items_to_read, i;

  assert(ps);
  assert(buffer);

  /* return 0 if we have already read >= freq items */
  if (ps->nr_items >= ps->id_freq)
    return 0;

  if (ps->nr_items + buffer_size > ps->id_freq)
    items_to_read = ps->id_freq - ps->nr_items;
  else
    items_to_read = buffer_size;

  assert(items_to_read >= 0);

  if (items_to_read == 0)
    return 0;

  if (ps->is_compressed) {
    for (i = 0; i < items_to_read; i++, ps->nr_items++) {
      ps->last_pos += read_golomb_code_bs(ps->b, &(ps->bs));
      *buffer = ps->last_pos;
      buffer++;
    }
  }
  else {
    /* number of bytes to copy; skip memcpy() if it exceeds the maximum object size */
    size_t k = items_to_read * sizeof(int);
    if (k < PTRDIFF_MAX) {
      memcpy(buffer, ps->base + ps->nr_items, k);
      ps->nr_items += items_to_read;

      /* convert network byte order to native integers */
      for (i = 0; i < items_to_read; i++)
        buffer[i] = ntohl(buffer[i]);
    }
  }

  return items_to_read;
}

This discussion was very helpful in working out this potential solution: https://stackoverflow.com/questions/47450718/gcc7-2-argument-range-exceeds-maximum-object-size-9-7-werror-alloc-size-larg

Still needs to be implemented in the PatchCWB class!
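The guard idea can also be looked at in isolation. The following is a minimal sketch, not the CWB patch itself: copy_items() is a hypothetical helper that validates the item count before multiplying, so the byte count passed to memcpy() is known to be positive and below PTRDIFF_MAX. Unlike cl_read_stream() as posted above, which still returns items_to_read when the size check fails, this sketch returns 0 whenever nothing is copied; that difference is a design choice of the sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of CWB): copy n_items ints from src to dst.
 * Returns the number of items copied, or 0 if the request is rejected. */
static int copy_items(int *dst, const int *src, int n_items)
{
  assert(dst);
  assert(src);

  /* Reject zero and negative counts before any conversion to size_t. */
  if (n_items <= 0)
    return 0;

  /* Reject counts whose byte size would exceed the maximum object size;
   * dividing instead of multiplying avoids overflow in the check itself. */
  if ((size_t)n_items > (size_t)PTRDIFF_MAX / sizeof(int))
    return 0;

  memcpy(dst, src, (size_t)n_items * sizeof(int));
  return n_items;
}
```

Whether a given GCC version accepts this form without a -Wstringop-overflow warning depends on its range analysis, so the sketch should be compiled with the same toolchain before relying on it.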

ablaette commented 2 years ago

This solution works and has been incorporated as a CWB patch.