isarandi / rlemasklib

Manipulate run-length encoded image masks
Other
6 stars 4 forks source link

Fix occasional out of bounds write in rleToString #5

Closed qrmt closed 1 month ago

qrmt commented 1 month ago

Fixes out-of-bounds write when encoding run lengths to string.

On line 1229 in rleToString, the string is null-terminated and reallocated:

    s[p++] = 0; // null-terminate the string
    return realloc(s, sizeof(char) * p);

However, sometimes p might be equal to allocation_size, as it's increased like this in the loop:

      if (p >= alloc_size) {
          alloc_size *= 2;
          s = realloc(s, alloc_size);
      }
      s[p++] = c + 48;  // ascii 48 is '0'. 48-111 is the range of ascii chars we use

If e.g. p = 119 and alloc_size = 120 at final loop, then p will be 120 as well and s[p++] = 0; will write out of bounds. Valgrind output:

==3487627== Invalid write of size 1
==3487627==    at 0x43EDA615: rleToString (rlemasklib.c:1229)

The fix is to reallocate memory first to p+1 and then null-terminating:

    s = realloc(s, sizeof(char) * (p+1)); // Move the realloc call here
    s[p] = 0; // null-terminate the string
    return s;

I ran a few thousand random numpy arrays through both pycocotools.mask.encode() and rlemasklib.encode() that both utilize this function to check that the results are correct.

isarandi commented 1 month ago

Thanks a lot for tracking this bug down!