Open eschnett opened 8 months ago
Thanks for opening an issue and for sharing your work. It's exciting to see another asdf implementation!
There has been some recent discussion about adding zstd support to asdf (see PR: https://github.com/asdf-format/asdf/pull/1570). As the python asdf now supports adding compression algorithms via extensions (see this example adding zstd support: https://github.com/braingram/asdf-zstd) we'd like to soon create a new asdf-compressors package that adds a number of compression algorithms (see the roadmap for a mention of this plan: https://github.com/asdf-format/asdf/wiki/Roadmap#changes-not-tied-to-a-particular-version). It would be great to coordinate this with asdf-cxx to make sure the labels match and features are compatible.
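For readers who have not seen a compression extension, here is a rough sketch of the shape such a plugin takes: a short label plus a compress and a decompress method. It deliberately does not import asdf. The class name, the `lzma` label, and the exact method signatures are assumptions made for illustration and should be checked against the asdf documentation. The standard library's lzma stands in for zstd so the sketch runs without extra packages.

```python
import lzma


class LzmaCompressor:
    """Hypothetical stand-in shaped like an asdf compressor plugin."""

    @property
    def label(self):
        # Four-byte token that identifies the codec (hypothetical value).
        return b"lzma"

    def compress(self, data, **kwargs):
        # Yield compressed bytes; codec options are ignored in this sketch.
        yield lzma.compress(bytes(data))

    def decompress(self, blocks, out, **kwargs):
        # Join the incoming chunks, decompress them, and write the result
        # into the caller's buffer. Return the number of bytes written.
        raw = lzma.decompress(b"".join(bytes(b) for b in blocks))
        out[: len(raw)] = raw
        return len(raw)


compressor = LzmaCompressor()
payload = b"asdf block payload " * 100
packed = b"".join(compressor.compress(payload))
out = bytearray(len(payload))
written = compressor.decompress([packed], memoryview(out))
```

For two implementations to read each other's files, they would need to agree on the label and on the byte stream the codec produces.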
I will give asdf-cxx a closer look. Have you done much testing with files written by asdf-cxx and read by the python (or IDL) implementation of asdf (and vice versa)? It would be great to hear more about asdf-cxx and your impressions of asdf.
FYI: I ran your demo-compression example (thanks for providing that with your code!). I had to slightly modify it to not attempt to save using blosc2 (I didn't immediately find it on homebrew). The file it generated was readable in python with the new modifications to the asdf-compression package (this is a work-in-progress and I hope to move it to the asdf-format organization soon). I opened an issue to track some compatibility tests (there is one other package that has already added some form of blosc support via an extension): https://github.com/braingram/asdf-compression/issues/3
I think blosc2 is not available from Homebrew, Debian, etc. The main difference between blosc2 and blosc is that the former supports uncompressed data sizes larger than 2 GByte. For the time being just using blosc would be good enough.
I am now adding support for liblz4 as a compressor to follow suit. I think you're using lz4f as the token.
CHORD is a radio telescope in Canada https://www.chord-observatory.ca that's currently being constructed. We are considering / experimenting with file formats for various data products, and ASDF looks interesting because it is (a) simple and (b) can be efficiently streamed.
In the past, compression algorithms very similar to Blosc's https://www.blosc.org/pages/ "bitshuffle" have proven very useful. I wonder whether these could be added to the standard.
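To illustrate why shuffling helps, here is a minimal sketch of a byte-level shuffle, the simpler relative of bitshuffle (which transposes individual bits rather than whole bytes). This is not Blosc's implementation. The function names are made up for this example, and zlib stands in for the actual codec.

```python
import struct
import zlib


def byte_shuffle(data: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    return b"".join(data[k::itemsize] for k in range(itemsize))


def byte_unshuffle(data: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element layout."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    count = len(data) // itemsize
    out = bytearray(len(data))
    for k in range(itemsize):
        # Plane k holds byte k of every element; scatter it back.
        out[k::itemsize] = data[k * count:(k + 1) * count]
    return bytes(out)


# Slowly varying 32-bit integers, similar in spirit to smooth sensor data.
values = list(range(10_000))
raw = struct.pack(f"<{len(values)}I", *values)

shuffled = byte_shuffle(raw, 4)
plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffled))
```

For data like this, the high-order bytes of neighbouring values are nearly identical, so grouping them gives the compressor long repetitive runs and `shuffled_size` comes out smaller than `plain_size`.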
I have, as an experiment, added support for c-blosc, c-blosc2, and zstd to https://github.com/eschnett/asdf-cxx . I wonder whether you are in principle interested in augmenting the standard, using blsc, bls2, and zstd as compression strings.
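As a small sketch of the constraint behind these names: the compression field in an ASDF block header is four bytes wide, so each string has to fit in four ASCII characters. The helper names below are hypothetical, and padding shorter names with NUL bytes is an assumption about how implementations fill the field.

```python
LABEL_SIZE = 4  # width in bytes of the compression field in a block header

# Compression strings proposed in this issue.
PROPOSED_LABELS = ("blsc", "bls2", "zstd")


def encode_label(name: str) -> bytes:
    """Encode a compression string into the fixed-width header field."""
    raw = name.encode("ascii")
    if len(raw) > LABEL_SIZE:
        raise ValueError(f"label {name!r} does not fit in {LABEL_SIZE} bytes")
    # Assumption: shorter names are padded with NUL bytes.
    return raw.ljust(LABEL_SIZE, b"\0")


def decode_label(field: bytes) -> str:
    """Decode the header field back into a compression string."""
    if len(field) != LABEL_SIZE:
        raise ValueError(f"field must be exactly {LABEL_SIZE} bytes")
    return field.rstrip(b"\0").decode("ascii")


encoded = {name: encode_label(name) for name in PROPOSED_LABELS}
```

The empty string encodes to four NUL bytes, which is how an uncompressed block is marked. A name such as blosc2 would be rejected as too long, which is presumably why the shorter bls2 is proposed.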