LLNL / fpzip

Lossless compressor of multidimensional floating-point arrays
http://fpzip.llnl.gov
BSD 3-Clause "New" or "Revised" License

FR: support for __fp16 #2

Closed (stolk closed this issue 1 year ago)

stolk commented 4 years ago

Are there any plans to support half precision floats, using the __fp16 type?

https://en.wikipedia.org/wiki/Half-precision_floating-point_format

lindstro commented 4 years ago

No immediate plans, but it would be straightforward to extend fpzip to support half precision. The one gotcha is the lack of language support, though __fp16 could be conditionally enabled for gcc. As a potential alternative, fpzip already supports bfloat16 by using 16-bit precision with 32-bit floats, but it currently requires conversion between 16- and 32-bit storage.
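
For reference, a minimal sketch of the bfloat16 round trip described above, relying only on the bit-level relationship between bfloat16 and IEEE-754 single precision; the helper names are illustrative and not part of fpzip:

```c
#include <stdint.h>
#include <string.h>

/* bfloat16 is the top 16 bits of an IEEE-754 single, so widening and
 * narrowing are pure bit shifts. Compress the widened floats with
 * fpz->type = FPZIP_TYPE_FLOAT and fpz->prec = 16 so that fpzip stores
 * only the 16 bits that bfloat16 actually carries. */
static float bf16_to_float(uint16_t h)
{
  uint32_t bits = (uint32_t)h << 16;
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

static uint16_t float_to_bf16(float f)
{
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return (uint16_t)(bits >> 16); /* truncate the low half */
}
```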

If there's enough demand, I would consider adding half precision support.

stolk commented 4 years ago

Thanks. Any thoughts on 16-bit integers? Would that be in any way compatible with the approaches taken in fpzip, or is that fundamentally different?

lindstro commented 4 years ago

This would be rather easy to support since one of the first steps in fpzip is to convert floating-point values to integers. But there already exist numerous lossless compressors for 16-bit integers (PNG image compression, for instance). It's unclear that an fpzip variant would do significantly better.

stolk commented 4 years ago

Thank you.

I was under the impression that fpzip was able to cope with the 3D nature of my data, unlike PNG, LZ4, ZSTD, etc.?

In my volume, there is smoothness between neighbours in the -x, +x, -y, +y, -z, +z directions, which I doubt a conventional 2D image compressor would be able to take advantage of.

I did investigate JPEG 2000's JP3D, but I am seeing only modest compression ratios from it when doing lossless encoding. My volumes contain multi-octave, domain-warped OpenSimplex noise; an isosurface through one such volume can be seen here: https://www.reddit.com/r/proceduralgeneration/comments/hbrm3c/menagerie_of_noise/

lindstro commented 4 years ago

Sorry for the late response; somehow I didn't see your follow-up.

One simple idea you might try is to cast your unsigned integer data to floats and compress them with fpzip using 1 + 8 + 16 + 2 = 27 bits of precision. Why this precision? You need one bit for the sign, 8 bits for the floating-point exponent, 16 bits for the significand (your original 16 bits), and 2 guard bits to protect against range expansion in the predictor (in general, you need d - 1 guard bits for d-dimensional data). In practice, however, if the data is unsigned and reasonably smooth, the sign and guard bits aren't needed, and 24 bits of precision should suffice. This assumes that lossless compression is essential; if you can tolerate some loss, you can reduce the precision further.
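
For concreteness, a sketch of this suggestion using fpzip's buffer API (fpzip_write_to_buffer / fpzip_write_header / fpzip_write); the function name, array handling, and lack of error checking are illustrative only:

```c
#include <stdint.h>
#include <stdlib.h>
#include "fpzip.h"

/* Compress a 16-bit unsigned integer volume by casting to float and
 * limiting fpzip's precision. prec = 27 covers sign + exponent + 16-bit
 * significand + 2 guard bits; prec = 24 often suffices for unsigned,
 * reasonably smooth data. Returns the compressed size (0 on failure). */
size_t compress_u16_volume(const uint16_t* data, int nx, int ny, int nz,
                           void* out, size_t outsize, int prec /* 27 or 24 */)
{
  size_t n = (size_t)nx * ny * nz;
  float* tmp = malloc(n * sizeof *tmp);
  for (size_t i = 0; i < n; i++)
    tmp[i] = (float)data[i]; /* exact: 16-bit values fit in a float */

  FPZ* fpz = fpzip_write_to_buffer(out, outsize);
  fpz->type = FPZIP_TYPE_FLOAT;
  fpz->prec = prec;
  fpz->nx = nx;
  fpz->ny = ny;
  fpz->nz = nz;
  fpz->nf = 1;
  fpzip_write_header(fpz);
  size_t bytes = fpzip_write(fpz, tmp); /* 0 indicates failure */
  fpzip_write_close(fpz);
  free(tmp);
  return bytes;
}
```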

You could also use the reversible mode of the zfp compressor, which supports lossless compression of 31-bit integers (and of floats). Use its utility functions for converting between 16- and 31-bit integers.
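
And a corresponding sketch with zfp's reversible mode, assuming zfp's C API; the per-element 16-to-31-bit promotion written out here stands in for zfp's block-based zfp_promote_uint16_to_int32 helper:

```c
#include <stdint.h>
#include <stdlib.h>
#include "zfp.h"

/* Compress a uint16 volume losslessly with zfp's reversible mode by first
 * promoting the samples to int32 values that use at most 31 bits, as zfp's
 * integer support expects. Returns the compressed size in bytes (0 on
 * failure) and hands back the compressed buffer through *out. */
size_t compress_u16_zfp(const uint16_t* data, size_t nx, size_t ny, size_t nz,
                        void** out)
{
  size_t n = nx * ny * nz;
  int32_t* promoted = malloc(n * sizeof *promoted);
  for (size_t i = 0; i < n; i++)
    promoted[i] = ((int32_t)data[i] - 0x8000) << 15; /* center and widen to 31 bits */

  zfp_field* field = zfp_field_3d(promoted, zfp_type_int32, nx, ny, nz);
  zfp_stream* zfp = zfp_stream_open(NULL);
  zfp_stream_set_reversible(zfp); /* lossless mode */

  size_t bufsize = zfp_stream_maximum_size(zfp, field);
  void* buffer = malloc(bufsize);
  bitstream* stream = stream_open(buffer, bufsize);
  zfp_stream_set_bit_stream(zfp, stream);
  zfp_stream_rewind(zfp);

  size_t bytes = zfp_compress(zfp, field); /* 0 indicates failure */

  zfp_field_free(field);
  zfp_stream_close(zfp);
  stream_close(stream);
  free(promoted);
  *out = buffer;
  return bytes;
}
```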

lindstro commented 1 year ago

@stolk This issue has been open for some time, with no immediate plans for a resolution. If it's OK with you, I'd like to go ahead and close it.

stolk commented 1 year ago

Out of scope.