kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.

Single byte dtypes #22

Closed by alimanfoo 9 years ago

alimanfoo commented 9 years ago

Apologies for the naive question. I have large arrays with int8 dtype where most values are 0, 1 or 2. Can bitshuffle improve compression of arrays with a single-byte dtype?

kiyo-masui commented 9 years ago

Yup. Expect that data to compress to something like 25% of its original size or better.
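To see where a figure like 25% can come from: values 0, 1 and 2 use only the two lowest bits of each byte, so once the bits are regrouped by position, six of the eight bit planes are all zeros. The sketch below illustrates that idea with plain NumPy and zlib. It is a simplified whole-array bit transpose written for illustration, not bitshuffle's actual implementation, and the names `bit_transpose` and `bit_untranspose` are hypothetical helpers, as is the random test data.

```python
import zlib
import numpy as np


def bit_transpose(arr):
    """Regroup a 1-D single-byte array into 8 bit planes (hypothetical helper).

    Row k of the result holds bit k (most significant bit first) of every element.
    """
    bits = np.unpackbits(arr.view(np.uint8)[:, None], axis=1)  # shape (n, 8)
    return np.packbits(bits.T, axis=1)                         # shape (8, ceil(n / 8))


def bit_untranspose(planes, n):
    """Invert bit_transpose for an array of n int8 elements (hypothetical helper)."""
    bits = np.unpackbits(planes, axis=1)[:, :n]                # shape (8, n)
    return np.packbits(bits.T, axis=1).ravel().view(np.int8)


# Assumed test data: one million int8 values drawn uniformly from {0, 1, 2}.
rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=1_000_000, dtype=np.int8)

planes = bit_transpose(data)

# Count the bit planes that contain at least one set bit.
nonzero_planes = int(np.count_nonzero(planes.any(axis=1)))

# Compress the transposed bytes with a general-purpose compressor.
ratio = len(zlib.compress(planes.tobytes())) / data.nbytes

print("non-zero bit planes:", nonzero_planes, "of 8")
print("compressed size / original size:", round(ratio, 3))
```

Only the two lowest bit planes carry information here, and the all-zero planes compress to almost nothing, which is why roughly a quarter of the original size is a reasonable expectation. Real data where most values are 0 should do better still, since the two remaining planes then also contain long runs of zeros.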

alimanfoo commented 9 years ago

Thanks. I'm using blosc, and it turns out the version I was using had a bug where bitshuffle wasn't applied to single-byte dtypes. With a more recent version of blosc and bitshuffle enabled, I see a significant improvement in compression on some real-world data. Benchmarks are in this notebook if you're interested.

kiyo-masui commented 9 years ago

Thanks for passing this along. Happy to see bitshuffle stacks up well.