aldanor / hdf5-rust

HDF5 for Rust
https://docs.rs/hdf5
Apache License 2.0

Blosc Compression-Performance Tips #231

Open dwerner95 opened 1 year ago

dwerner95 commented 1 year ago

Hey All,

I've been playing around with the blosc compression implemented in 0.8.0 and I have some questions regarding its performance.

For my 400 MB CSV file, which I convert to an HDF5 file, I see compression of ~84%, which is amazing. However, the performance doesn't seem to be affected at all. Looking at this graph on the official HDF5 website (image: graph of performance over size):

I should see an enormous boost in throughput. Looking at my CPU usage, the program is only using a single thread, despite me setting the number of blosc threads with blosc_set_nthreads. Additionally, blosc_get_nthreads only reports a single thread, which makes me wonder whether there is an additional flag that needs to be set.

Overall, I wish there was some kind of performance guide on this topic. Is that something that could be included in the documentation?

Best wishes, Dominik

mulimoen commented 1 year ago

Performance tuning is difficult and what works for one dataset might not work for another. Is your data similar to the one used for the graph?

I'll have a look at the blosc bug when I am back at a computer.

dwerner95 commented 1 year ago

All my datasets are identical. Each HDF5 file consists of at least 3 datasets: one 1D ndarray and two 2D ndarrays, with additional 1D ndarrays. All of them have the same size in the first dimension. I'm not entirely sure what data they plot in the image, but I would imagine that arrays are the easiest to compress (?).

I found an issue in the h5py GitHub repository about this. It seems that even if the number of threads is set, the program chooses serial compression if the chunk size is not large enough. However, even if I set the chunk size to the size of the array, I still don't see any improvement.

mulimoen commented 4 months ago

I can trace it back to https://github.com/Blosc/c-blosc/blob/d306135aaf378ade04cd4d149058c29036335758/blosc/blosc.c#L913. One can force a block size by calling e.g. blosc_sys::blosc_set_blocksize(256), which enables parallel compression. (No idea if such a small block size makes sense; it should likely be much, much larger.)
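
To make the chunk-size versus block-size interplay concrete, here is a minimal sketch in plain Rust. It does not call blosc or hdf5-rust at all. Both helpers, `uses_parallel_path` and `chunk_bytes`, are hypothetical names made up for illustration. The rule they model is an assumption based on my reading of the c-blosc source near the line linked above: blosc seems to fall back to the serial path when only one thread is configured, or when the buffer it is handed (one HDF5 chunk) holds at most one block. Please check this against the c-blosc version you actually link.

```rust
/// Hypothetical helper, not part of hdf5-rust or c-blosc.
/// Size in bytes of one HDF5 chunk, i.e. the buffer the filter
/// receives per call.
fn chunk_bytes(chunk_shape: &[usize], elem_size: usize) -> usize {
    chunk_shape.iter().product::<usize>() * elem_size
}

/// Hypothetical helper, not part of hdf5-rust or c-blosc.
/// Models the ASSUMED serial-vs-parallel decision: the parallel path
/// is taken only with more than one thread AND more than one whole
/// block per buffer (integer division, as in C).
/// `block_bytes` is the effective, nonzero block size; the automatic
/// block size selection (a value of 0 in blosc) is not modelled here.
fn uses_parallel_path(nthreads: usize, source_bytes: usize, block_bytes: usize) -> bool {
    assert!(block_bytes > 0, "automatic block size is not modelled");
    nthreads > 1 && source_bytes / block_bytes > 1
}
```

For example, a chunk of shape (1000, 8) holding 8-byte floats is 64000 bytes. Under the assumed rule, four threads with a forced block size of 256 bytes would take the parallel path, while a block size equal to the whole chunk would not, regardless of the thread count. That would be consistent with the observation that enlarging the chunk alone does not help if the block size grows with it.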