georust / netcdf

High-level netCDF bindings for Rust
Apache License 2.0
81 stars 28 forks source link

Enable shuffle filter #100

Closed richli closed 1 year ago

richli commented 1 year ago

The nc_def_var_deflate() function has an argument to enable the HDF5 shuffle filter, which sometimes help improve compression ratios.

This function is called by the compression() method for a VariableMut and currently, the shuffle value is hard-coded to false. Could this setting be exposed? For example:

pub fn compression(&mut self, deflate_level: nc_type, shuffle: bool) -> Result<()> {}
magnusuMET commented 1 year ago

That sounds like a great idea. Would you like to open a PR implementing this?

lnicola commented 1 year ago

If we're (well, you're) breaking the API anyway, should this be something like .compression(Compression::deflate().level(6).shuffle(false))??

It looks like other compression methods are available (like blosc), but not supported in the bindings. My proposed API is not the easiest to implement, but please ignore that. My point is that shuffling seems (?) to be compression-dependent, so a future compression method might not offer shuffling.

magnusuMET commented 1 year ago

That could be a good way to do it. I would like to revamp the API as netcdf-c has gotten multiple filters now, maybe we should have a VariableBuilder?

richli commented 1 year ago

Yes, I'd be open to submitting a PR for the simple change, but I don't have time right now for something more involved like the API for a VariableBuilder. But I agree that's the way to go!

magnusuMET commented 1 year ago

I am OK with breaking changes, there is quite a lot of unreleased code since v0.7.0 which requires more substantial updates for the library users. A quick and dirty fix by making compression take the shuffle is more than enough for now