NCAR / VAPOR

VAPOR is the Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers
https://www.vapor.ucar.edu/
BSD 3-Clause "New" or "Revised" License

Improve storage of masks for missing values in VDC #945

Open clyne opened 5 years ago

clyne commented 5 years ago

The boolean masks used to represent missing data values in the VDC are stored as one byte per grid node, which adds significant storage and I/O overhead. Encoding the arrays as bit masks (one bit per node) would cut this by 8x. Lossless compression might reduce it further.
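
To make the 8x figure concrete, here is a minimal sketch of the bit-packing idea. It assumes the mask is held in memory as a `std::vector<std::uint8_t>` with nonzero meaning "missing"; `pack_mask` and `unpack_mask` are hypothetical helper names for illustration, not functions in the VDC code, and the bit order within each byte (least significant bit first) is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack a one-byte-per-node mask (nonzero = missing)
// into one bit per node. Node i lands in bit (i % 8) of byte (i / 8).
std::vector<std::uint8_t> pack_mask(const std::vector<std::uint8_t>& mask)
{
    std::vector<std::uint8_t> packed((mask.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i] != 0) {
            packed[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
    }
    return packed;
}

// Hypothetical helper: expand a packed mask back to one byte per node.
// The node count n must be stored separately, because the last byte
// may contain padding bits.
std::vector<std::uint8_t> unpack_mask(const std::vector<std::uint8_t>& packed,
                                      std::size_t n)
{
    assert(packed.size() == (n + 7) / 8);
    std::vector<std::uint8_t> mask(n);
    for (std::size_t i = 0; i < n; ++i) {
        mask[i] = static_cast<std::uint8_t>((packed[i / 8] >> (i % 8)) & 1u);
    }
    return mask;
}
```

A mask of N nodes then occupies ceil(N / 8) bytes instead of N, at the cost of having to record N alongside the packed data.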

shaomeng commented 5 years ago

I have implemented a bit-manipulation class for future use by the SPECK encoder. I can see it being used here as well!

shaomeng commented 5 years ago

As for lossless compression, run-length encoding fits here very well.
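
A rough sketch of what run-length encoding a mask could look like, under the assumption that missing values tend to form long contiguous runs (for example, land cells in an ocean grid). The function names and the choice of `std::uint32_t` run lengths are illustrative assumptions, not anything in the VDC code; a run longer than 2^32 - 1 nodes would need a wider type or splitting.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode a mask (nonzero = missing) as alternating
// run lengths. The first run always counts valid (0) nodes and may be
// zero if the mask starts with a missing node.
std::vector<std::uint32_t> rle_encode(const std::vector<std::uint8_t>& mask)
{
    std::vector<std::uint32_t> runs;
    bool current = false;   // value of the run being counted
    std::uint32_t len = 0;  // length of that run so far
    for (std::uint8_t v : mask) {
        const bool bit = (v != 0);
        if (bit != current) {
            runs.push_back(len);  // close the run and start the opposite one
            current = bit;
            len = 0;
        }
        ++len;
    }
    runs.push_back(len);  // close the final run
    return runs;
}

// Hypothetical helper: rebuild the one-byte-per-node mask from the runs.
std::vector<std::uint8_t> rle_decode(const std::vector<std::uint32_t>& runs)
{
    std::vector<std::uint8_t> mask;
    std::uint8_t value = 0;  // runs alternate 0, 1, 0, 1, ...
    for (std::uint32_t len : runs) {
        mask.insert(mask.end(), static_cast<std::size_t>(len), value);
        value ^= 1;
    }
    return mask;
}
```

This only pays off when runs are long. A mask that flips on almost every node would grow under this scheme, so falling back to the plain bit mask in that case seems sensible.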

shaomeng commented 7 months ago

Update: the Bitmask class in SPERR is well optimized for performance and can be pulled in here to replace whatever is currently in use.