aldanor / hdf5-rust

HDF5 for Rust
https://docs.rs/hdf5
Apache License 2.0

Compression is likely not working. #288

Open CiaranWelsh opened 1 month ago

CiaranWelsh commented 1 month ago

I've been playing around with the compression algorithms in hdf5 and observed that the Blosc compression algorithms do basically nothing. I suspect this is a bug on your end (though it could also be on mine).

Here's my evaluation, hope it helps =]

[Figure: compression_performance]
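For anyone who wants to run the same kind of sanity check, here is a minimal sketch of the comparison behind an evaluation like this: divide the raw data size by the size actually stored and flag ratios close to 1.0. The function names, the tolerance, and the byte counts are illustrative assumptions, not part of hdf5-rust and not taken from the plots; in practice the stored size would come from the file or dataset on disk.

```rust
/// Ratio of raw (uncompressed) bytes to bytes actually stored.
/// Returns None when nothing was stored, since the ratio is undefined.
fn compression_ratio(raw_bytes: u64, stored_bytes: u64) -> Option<f64> {
    if stored_bytes == 0 {
        None
    } else {
        Some(raw_bytes as f64 / stored_bytes as f64)
    }
}

/// Heuristic (assumption): a ratio within `tolerance` of 1.0 suggests
/// the filter was not applied or had no effect.
fn looks_uncompressed(raw_bytes: u64, stored_bytes: u64, tolerance: f64) -> bool {
    match compression_ratio(raw_bytes, stored_bytes) {
        Some(ratio) => (ratio - 1.0).abs() <= tolerance,
        None => false,
    }
}

fn main() {
    // Hypothetical sizes for a 1 MB raw input; not measured values.
    let raw = 1_000_000u64;
    let cases = [
        ("no filter", 1_000_000u64),
        ("filter with no effect", 998_000),
        ("filter working", 250_000),
    ];
    for (label, stored) in cases {
        let ratio = compression_ratio(raw, stored).unwrap_or(f64::NAN);
        let suspect = looks_uncompressed(raw, stored, 0.05);
        println!("{label}: ratio {ratio:.3}, suspect = {suspect}");
    }
}
```

A ratio that stays near 1.0 across every algorithm and compression level, on data that is known to be compressible, points at the filter not being applied rather than at the data.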

mulimoen commented 1 month ago

See #273 for a potential duplicate

CiaranWelsh commented 1 month ago

Yes, it looks like a duplicate, thanks for pointing me to the other issue. After building hdf5-rust like this:

# Cargo.toml
[dependencies]
hdf5 = { git = "https://github.com/aldanor/hdf5-rust.git", rev = "33c7a18155bf0ccf11dfe8412f59376619e292bc", features = ["blosc", "lzf"] }
hdf5-types = { git = "https://github.com/aldanor/hdf5-rust.git", rev = "33c7a18155bf0ccf11dfe8412f59376619e292bc" }
blosc-src = { version = "0.3.0", features = ["lz4", "zlib", "zstd"] }

the new data looks like this (please ignore the titles in the plots as they are wrong (input data

[Figures: compression_algorithm, compression_level, compression_nthreads, compression_shuffle, hdf5_chunk_size (each labelled space-check-output-1MB-raw-data)]

As a follow-up question, it seems that the number-of-threads argument isn't doing much. I'm setting it with

        hdf5::filters::blosc_set_nthreads(nthreads);

Is there something I'm missing in order to get this working? Thanks

mulimoen commented 1 month ago

The blosc threads issue was mentioned in #231

CiaranWelsh commented 1 month ago

Great, I'll not worry about it for now then and hope a fix turns up in time. Thanks for the help!

mulimoen commented 1 month ago

You are very welcome to make a PR exposing blosc_sys::blosc_set_blocksize as hdf5::filters::blosc_set_blocksize, and another adding features such as blosc-lz4, which enables blosc-sys/lz4, etc.!
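As a rough idea of the shape the first of those PRs might take, here is a sketch of a thin safe wrapper. Everything in it is an assumption: `blosc_sys_stub` is a stand-in module so the snippet compiles on its own, the real binding is assumed to mirror c-blosc's `void blosc_set_blocksize(size_t)`, and the actual hdf5-rust code would likely also need to take the crate's global lock around the call.

```rust
use std::sync::atomic::Ordering;

/// Stand-in for the blosc-sys FFI binding (hypothetical, for illustration only).
/// ASSUMPTION: the real binding mirrors c-blosc's `void blosc_set_blocksize(size_t)`.
mod blosc_sys_stub {
    use std::sync::atomic::{AtomicUsize, Ordering};

    /// Records the last requested block size so the sketch can be checked.
    pub static BLOCKSIZE: AtomicUsize = AtomicUsize::new(0);

    /// Marked unsafe to match how a raw FFI function would be called.
    pub unsafe fn blosc_set_blocksize(blocksize: usize) {
        BLOCKSIZE.store(blocksize, Ordering::SeqCst);
    }
}

/// Hypothetical safe wrapper, in the spirit of hdf5::filters::blosc_set_nthreads.
/// In c-blosc, a block size of 0 means the library picks one automatically.
pub fn blosc_set_blocksize(blocksize: usize) {
    // The real implementation would call into blosc_sys here instead of the stub.
    unsafe { blosc_sys_stub::blosc_set_blocksize(blocksize) }
}

fn main() {
    // Request a 64 KiB block size through the wrapper.
    blosc_set_blocksize(64 * 1024);
    let current = blosc_sys_stub::BLOCKSIZE.load(Ordering::SeqCst);
    println!("requested block size: {current} bytes");
}
```

Keeping the wrapper this thin means the `unsafe` call stays in one place and callers get the same free-function style as the existing thread-count setter.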