uzh-rpg / DSEC

HDF5 packager #18

Closed by fedepare 3 years ago

fedepare commented 3 years ago

Hi @magehrig,

I am opening this issue to ask whether you plan to share the HDF5 packager (or details of it) that you used for this dataset. I have been experimenting with it, and when I encode the event data from your sequences using int16 for location, bool for polarity, and float64 for timestamps, the files become 4x bigger than yours. I also tried other data types, but never got close to your file size. This makes me think that there is a substantial difference between our packagers, as yours is significantly more efficient in terms of file size.

Thanks for making the dataset publicly available, and congrats on your recent work!

Best, Federico Paredes-Valles

magehrig commented 3 years ago

Hi Federico

The difference is probably that I am using zstd compression (lz4hc is also great). I am not planning to share that code openly, as it is not in a state where that makes sense. But I can share it privately with you when I am back in the office next week. Just drop me an email if you are interested.

Cheers

fedepare commented 3 years ago

Hi Mathias,

As you suggested, zstd compression for HDF5 makes a big difference in terms of the resulting file size. Thanks for letting me know about this!

Cheers, Fede
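
For anyone landing here later: below is a minimal sketch of what writing events with a compression filter might look like. It is not the packager used for DSEC, which was not shared in this thread. It assumes h5py plus the third-party hdf5plugin package for the zstd filter; the `events/x`, `events/y`, `events/p`, `events/t` layout, the `write_events` helper, the placeholder sensor size, and the synthetic data are all made up for illustration. If hdf5plugin is not installed, the sketch falls back to gzip, which is not zstd and will give different sizes.

```python
import os
import tempfile

import h5py
import numpy as np

try:
    # Importing hdf5plugin registers the zstd filter with the HDF5 library.
    import hdf5plugin
    FILTER_KWARGS = dict(hdf5plugin.Zstd())
except ImportError:
    # Fallback, NOT zstd: gzip is built into h5py.
    FILTER_KWARGS = {"compression": "gzip", "compression_opts": 4}


def write_events(path, x, y, p, t, filter_kwargs=None):
    """Write event arrays to `path` and return the file size in bytes.

    Hypothetical helper: the group and dataset names are assumptions.
    """
    filter_kwargs = filter_kwargs or {}
    with h5py.File(path, "w") as f:
        grp = f.create_group("events")
        for name, arr in (("x", x), ("y", y), ("p", p), ("t", t)):
            # Compression filters only apply to chunked datasets.
            grp.create_dataset(name, data=arr, chunks=True, **filter_kwargs)
    return os.path.getsize(path)


# Synthetic events using the dtypes from the first comment.
rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 640, n).astype(np.int16)  # placeholder sensor width
y = rng.integers(0, 480, n).astype(np.int16)  # placeholder sensor height
p = rng.integers(0, 2, n).astype(bool)
t = np.cumsum(rng.integers(0, 5, n)).astype(np.float64)  # non-decreasing

with tempfile.TemporaryDirectory() as tmp:
    raw_size = write_events(os.path.join(tmp, "raw.h5"), x, y, p, t)
    packed_path = os.path.join(tmp, "packed.h5")
    packed_size = write_events(packed_path, x, y, p, t, FILTER_KWARGS)
    # Read one dataset back to confirm the compression is lossless.
    with h5py.File(packed_path, "r") as f:
        roundtrip_ok = np.array_equal(f["events/t"][:], t)

print(f"uncompressed: {raw_size} bytes, compressed: {packed_size} bytes")
```

The ratio on real recordings will differ from this synthetic data, and it also depends on the chosen dtypes, chunk shape, and compression level. Files written with the zstd filter need hdf5plugin (or another build of the filter) available when they are read back.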