kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.

Zstd + bitshuffle #84

Closed fleon-psi closed 2 years ago

fleon-psi commented 4 years ago

To write data from high-performance X-ray detectors, I recently tested replacing LZ4 with Zstd in the Bitshuffle code. At least for my test data, compression factors were 20% better, but performance suffers. Interestingly, Zstd "likes" larger block sizes. Results are available here:

https://aca.scitation.org/doi/full/10.1063/1.5143480
https://aca.scitation.org/doi/suppl/10.1063/1.5143480/suppl_file/20200126_supporting+material.pdf (see Table S2 for the block size comparison)

The changes were simple: I added functions that call Zstd routines instead of LZ4, plus one constant for the HDF5 plugin. See: https://github.com/fleon-psi/bitshuffle

If you think this is worth including in the mainline Bitshuffle code, I'd be happy to open a pull request.
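The block-size effect mentioned above can be explored with a small experiment. The following is a minimal sketch, not code from bitshuffle: it compresses a buffer in independent blocks and reports the ratio for several block sizes. It uses `zlib` from the Python standard library as a stand-in for LZ4 or Zstd, and the function names (`compress_blocks`, `decompress_blocks`) and the length-prefixed layout are hypothetical, chosen only for illustration.

```python
import struct
import zlib


def compress_blocks(data: bytes, block_size: int, level: int = 6) -> bytes:
    """Compress `data` in independent blocks of `block_size` bytes.

    Each block is stored as a 4-byte little-endian length followed by the
    compressed payload. This layout is an assumption for the sketch and is
    not bitshuffle's on-disk format.
    """
    out = bytearray()
    for start in range(0, len(data), block_size):
        packed = zlib.compress(data[start:start + block_size], level)
        out += struct.pack("<I", len(packed)) + packed
    return bytes(out)


def decompress_blocks(blob: bytes) -> bytes:
    """Inverse of compress_blocks: walk the length-prefixed blocks in order."""
    out = bytearray()
    pos = 0
    while pos < len(blob):
        (size,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        out += zlib.decompress(blob[pos:pos + size])
        pos += size
    return bytes(out)


if __name__ == "__main__":
    # Synthetic data; real detector frames will behave differently.
    sample = bytes(i % 251 for i in range(1 << 16))
    for block_size in (1024, 8192, 65536):
        blob = compress_blocks(sample, block_size)
        assert decompress_blocks(blob) == sample
        print(block_size, round(len(sample) / len(blob), 2))
```

Smaller blocks give the compressor less history to work with and add per-block overhead, which is one plausible reason a codec might favor larger blocks. The measurements that matter are the ones in Table S2 of the linked supporting material.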

kif commented 3 years ago

Hi Filip, have you seen my work on combining hardware gzip compression with bitshuffle? See slide 19 of https://indico.esrf.fr/indico/event/33/session/3/contribution/23/material/slides/0.pdf Cheers, Jerome

fleon-psi commented 3 years ago

Hi Jerome,

Yes, I saw it, very nice results. I wonder whether you used the compressor units from one or two P9 chips. IBM claims 7 GB/s as the limit for a single unit: https://conferences.computer.org/isca/pdfs/ISCA2020-4QlDegUf3fKiwUXfV0KdCm/466100a001/466100a001.pdf I wonder if you were able to do way better than them with the 12 GB/s number 😊.

But again, this is a different combination, this time bitshuffle with gzip, and I wonder whether there is a plan for bitshuffle to support more combinations than just bshuf/LZ4, such as bshuf/Zstd and bshuf/gzip. This would only require adding more parameters like BSHUF_H5_COMPRESS_LZ4 and more conditions in the bshuf_h5_filter function. Otherwise, even if a better compressor than LZ4 brings a benefit, it will be difficult to deploy if there is no simple CPU decompressor. It would be equally interesting to allow different block sizes for bitshuffle and the add-on compression (LZ4, Zstd, gzip), which would help with a hardware (FPGA) streaming implementation.

Best, Filip
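The idea of selecting the add-on compressor by a parameter can be sketched as follows. This is an illustration only, not bitshuffle's implementation: the bit transpose is a slow pure-Python version, the codec table `CODECS` and the functions `encode`/`decode` are hypothetical names, and `zlib` and `lzma` from the standard library stand in for LZ4, Zstd, or gzip.

```python
import lzma
import zlib


def bitshuffle(data: bytes, elem_size: int) -> bytes:
    """Bit-transpose: gather bit k of every element into one contiguous run."""
    n_elems = len(data) // elem_size
    if n_elems * elem_size != len(data) or n_elems % 8:
        raise ValueError("need a whole number of elements, a multiple of 8")
    out = bytearray(len(data))
    for bit in range(elem_size * 8):
        byte_in_elem, bit_in_byte = divmod(bit, 8)
        for i in range(n_elems):
            b = (data[i * elem_size + byte_in_elem] >> bit_in_byte) & 1
            pos = bit * n_elems + i  # position in the transposed bit stream
            out[pos >> 3] |= b << (pos & 7)
    return bytes(out)


def bitunshuffle(data: bytes, elem_size: int) -> bytes:
    """Inverse of bitshuffle."""
    n_elems = len(data) // elem_size
    out = bytearray(len(data))
    for bit in range(elem_size * 8):
        byte_in_elem, bit_in_byte = divmod(bit, 8)
        for i in range(n_elems):
            pos = bit * n_elems + i
            b = (data[pos >> 3] >> (pos & 7)) & 1
            out[i * elem_size + byte_in_elem] |= b << bit_in_byte
    return bytes(out)


# Hypothetical codec table: the key plays the role of a compression constant.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def encode(data: bytes, elem_size: int, codec: str) -> bytes:
    """Shuffle first, then hand the result to the selected compressor."""
    compress, _ = CODECS[codec]
    return compress(bitshuffle(data, elem_size))


def decode(blob: bytes, elem_size: int, codec: str) -> bytes:
    """Decompress with the selected codec, then undo the shuffle."""
    _, decompress = CODECS[codec]
    return bitunshuffle(decompress(blob), elem_size)
```

Because the shuffle and the compressor are separate stages, adding another codec in this sketch means adding one entry to the table. A reader still needs a decompressor for whichever codec was chosen, which is the CPU-decompressor concern raised above.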

kif commented 3 years ago

I used 2 processors of the AC922 for those tests... At that time, the firmware we were using was a beta, but apparently it has been released as it was. Flashing the firmware was the tricky part. You also need a patched version of the Linux kernel (or a very recent one, as this was all integrated into 5.9 if I remember correctly).

Besides this, Blosc implements a whole zoo of prefilters and compressors that can be mixed together. LZ4, gzip and Zstd are some of the options. This is all packaged in hdf5plugin if needed.

jrs65 commented 2 years ago

I believe this is now all merged into bitshuffle. Let me know if you have any trouble with it.