seung-lab / fpzip

Cython bindings for fpzip, a floating point image compression algorithm.
BSD 3-Clause "New" or "Revised" License

fix: handle cases where compressed representation is larger than input array #21

Open bjude opened 4 weeks ago

bjude commented 4 weeks ago

fpzip's compressed output can occasionally be larger than the input array: by the pigeonhole principle, no lossless compressor can shrink every possible input. The extra 4 bytes in header_bytes were enough to hide this issue most of the time, but we ran into a buffer overflow exception in production. Instead of always sizing the compression buffer as input array size + header size, this change tries that size first and incrementally expands the buffer if it is not enough.
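To illustrate the pigeonhole point without depending on fpzip itself, here is a minimal sketch that uses Python's standard `zlib` as a stand-in compressor (an assumption for illustration only; fpzip's format and overhead differ). Incompressible input such as pseudo-random bytes comes out slightly larger than it went in, which is the situation an "input size + header" buffer cannot accommodate.

```python
import random
import zlib

# Deterministic pseudo-random bytes: effectively incompressible input.
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(4096))

# zlib is only a stand-in here; it is NOT the fpzip codec.
compressed = zlib.compress(data, 9)

# Highly regular input, for contrast: this one shrinks a lot.
zeros = bytes(4096)
compressed_zeros = zlib.compress(zeros, 9)

print(f"random: {len(data)} -> {len(compressed)} bytes")
print(f"zeros:  {len(zeros)} -> {len(compressed_zeros)} bytes")
```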

In the overwhelmingly common case (compressed size < input size), no extra work is done; the retry path runs only when the initial buffer is too small. The values for MAX_ATTEMPTS and BUFFER_GROWTH_FACTOR are somewhat arbitrary, but 5 and 1.5 seem good enough. Only truly pathological inputs would require more than this.
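The retry strategy described above could look roughly like the following sketch. It is not the code in this PR: `compress_into`, `BufferTooSmall`, `compress_with_retry`, and the `header_bytes` default of 0 are hypothetical names and values chosen for illustration, and `zlib` stands in for the native fpzip call. Only MAX_ATTEMPTS = 5 and BUFFER_GROWTH_FACTOR = 1.5 are taken from the description.

```python
import random
import zlib

MAX_ATTEMPTS = 5            # value quoted in the PR description
BUFFER_GROWTH_FACTOR = 1.5  # value quoted in the PR description


class BufferTooSmall(Exception):
    """Hypothetical error: the compressed output did not fit the buffer."""


def compress_into(data: bytes, capacity: int) -> bytes:
    # Hypothetical stand-in for the native call (zlib, not fpzip).
    # It fails when the output would exceed the given capacity.
    out = zlib.compress(data)
    if len(out) > capacity:
        raise BufferTooSmall(f"need {len(out)} bytes, have {capacity}")
    return out


def compress_with_retry(data: bytes, header_bytes: int = 0,
                        max_attempts: int = MAX_ATTEMPTS) -> bytes:
    # First try the usual size: input size plus header size.
    capacity = len(data) + header_bytes
    for _ in range(max_attempts):
        try:
            return compress_into(data, capacity)
        except BufferTooSmall:
            # Grow the buffer and try again; +1 guarantees progress
            # even when the capacity is very small.
            capacity = int(capacity * BUFFER_GROWTH_FACTOR) + 1
    raise RuntimeError(f"compression failed after {max_attempts} attempts")


# Usage: incompressible input forces at least one retry in this sketch.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(4096))
result = compress_with_retry(noisy)
print(len(noisy), "->", len(result))
```

Under this sketch's growth rule, five attempts reach roughly 1.5^4 ≈ 5 times the initial capacity before giving up, and inputs that fit on the first attempt never enter the `except` branch.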