NCAR / SPERR

SPERR is a lossy scientific (floating-point) data compressor that produces one of the best rate-distortion curves.
Apache License 2.0

A few comments from Robert #140

Closed robertu94 closed 2 years ago

robertu94 commented 2 years ago

Is there any documentation for the SPERR C++ API? Is it intended to be public or private?

I develop LibPressio, a lossy compressor adapter library. For now, I've made an adapter for the C API for SPERR using the examples and documentation, but looking through some of the other documentation for your CLI leaves me thinking that the C++ API does a lot more. I want comparisons with your library to be as fair as possible.

I've also forked your project to make a few improvements so that it is easier to build via Spack, a package manager that has gained popularity in the high performance computing community (and, relevant here, the recommended way to install LibPressio). I'm open to better defaults if you have suggestions. I can also contribute this back if desired.

One specific aspect that wasn't clear was what a reasonable default for qlevel is. I saw https://github.com/shaomeng/SPERR/wiki/CLI:-Exploring-Compression-Parameters-Using-probe_3d and https://github.com/shaomeng/SPERR/wiki/CLI:-Understanding-Quantization-Level-q but was looking for a programmatic way to get this information from C/C++.

Lastly, does this code only support square and cube data volumes? This isn't a huge issue for me, but I couldn't use it with the hurricane dataset (500x500x100) from SDRBench with any value of qlev that I tried.

shaomeng commented 2 years ago

Hi Robert, thank you very much for your thoughtful comment! I'll clarify some of them in this thread and will open separate issues for specific requests! (Btw I changed the title of this issue; hope you're OK with it!)

Is there any documentation for the SPERR C++ API? Is it intended to be public or private?

Yes, there are C++ classes that are public and supposed to be used in C++ projects. I don't have them documented yet, but it's a good suggestion and I've noted it in issue #141.

I develop LibPressio, a lossy compressor adapter library. For now, I've made an adapter for the C API for SPERR using the examples and documentation, but looking through some of the other documentation for your CLI leaves me thinking that the C++ API does a lot more. I want comparisons with your library to be as fair as possible.

Good deal! Glad to have it in! I'm definitely happy to look at it together (probably after the C++ documentation is done) to make sure the integration is done in the best way!

I've also forked your project to make a few improvements so that it is easier to build via Spack, a package manager that has gained popularity in the high performance computing community (and, relevant here, the recommended way to install LibPressio). I'm open to better defaults if you have suggestions. I can also contribute this back if desired.

That's also an awesome improvement! If you don't mind, could you submit a PR to this repo?

One specific aspect that wasn't clear was what a reasonable default for qlevel is. I saw https://github.com/shaomeng/SPERR/wiki/CLI:-Exploring-Compression-Parameters-Using-probe_3d and https://github.com/shaomeng/SPERR/wiki/CLI:-Understanding-Quantization-Level-q but was looking for a programmatic way to get this information from C/C++.

You're right that the selection of q is kinda awkward right now. The good news is that I'm also experimenting with some programmatic methods, and I'm fairly confident that some of these approaches will work. So for now it really does require trial and error from the user, but hopefully that will change in about 2-3 months.

Lastly, does this code only support square and cube data volumes? This isn't a huge issue for me, but I couldn't use it with the hurricane dataset (500x500x100) from SDRBench with any value of qlev that I tried.

It should work. If you could let me know the particular variable that you tested, then I can look into it!

shaomeng commented 2 years ago

Hi @robertu94, I've completed my work on "automatic quantization level selection," so now you only need to give the program a desired PSNR or PWE (point-wise error) value. Do you want to give it a try?
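For readers unfamiliar with the two targets mentioned above, the following is a minimal sketch of how PSNR and PWE are commonly computed when evaluating lossy compression of scientific data. It is not SPERR code. The function names `psnr_db` and `max_pointwise_error` are hypothetical, the peak value is assumed to be the value range of the original data (a common convention for floating-point fields), and the rounding step is only a stand-in for a real compressor.

```python
import math


def psnr_db(original, reconstructed):
    """PSNR in dB, assuming the peak is the value range of the original data."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0.0:
        return math.inf  # lossless reconstruction
    if value_range == 0.0:
        raise ValueError("PSNR is undefined for a constant field")
    return 20.0 * math.log10(value_range) - 10.0 * math.log10(mse)


def max_pointwise_error(original, reconstructed):
    """Largest absolute difference between any original and reconstructed value."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


# Synthetic stand-in for a flattened data field (not a real dataset).
original = [math.sin(0.01 * i) for i in range(1000)]

# Stand-in for compression + decompression: snap values to a grid of width `step`.
step = 1e-3
reconstructed = [round(v / step) * step for v in original]

print(f"PSNR: {psnr_db(original, reconstructed):.1f} dB")
print(f"PWE:  {max_pointwise_error(original, reconstructed):.2e}")
```

A PWE target bounds the worst single value, while a PSNR target bounds the average error relative to the data range, so the two can lead to different parameter choices for the same data.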

Also, I'm wondering if there's any update on your work to integrate SPERR into LibPressio and Spack? Are there any roadblocks I can help with or things I can clarify?

Thanks!

robertu94 commented 2 years ago

I just tried this new version, and it is substantially easier to use; thanks!

W.R.T. LibPressio integration: Because SPERR is GPL licensed, and I need to maintain a BSD license for LibPressio, the integration lives in a separate repository, https://github.com/robertu94/libpressio-sperr, which is then loadable via the 3rd party loader in the "libpressio-meta" library. That way, people who need to avoid the GPL and don't use SPERR can do so.

As for Spack integration, SPERR can be used via the LibPressio CLI distributed via Spack, like so:

git clone https://github.com/robertu94/spack_packages robertu94_packages
spack repo add ./robertu94_packages
spack install libpressio-tools ^ sperr
spack load libpressio-tools ^ sperr
pressio -i ~/git/datasets/hurricane/100x500x500/CLOUDf48.bin.f32 -d 500 -d 500 -d 100 -t float -b compressor=sperr -o abs=1e-6  -o sperr:chunks=100 -o sperr:chunks=100 -o sperr:chunks=100 -m time -m size -M all

If chunks are not specified, I default to 256x256x256, which is what your example did. I added a note to the built-in documentation that this value should be tuned to match some factor of the dataset and that it affects runtime and compression ratio.
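One way to read "some factor of the dataset" is to pick, for each axis, a chunk length that evenly divides that axis. The sketch below follows that reading. It is an assumption about the intent, not part of LibPressio or SPERR, and the helper names `largest_divisor_at_most` and `suggest_chunks` are hypothetical. It does not claim that the suggested chunks give the best runtime or compression ratio; it only avoids partial chunks at the volume boundary.

```python
def largest_divisor_at_most(n, limit):
    """Return the largest divisor of n that does not exceed limit."""
    if n < 1 or limit < 1:
        raise ValueError("n and limit must be positive integers")
    for candidate in range(min(n, limit), 0, -1):
        if n % candidate == 0:
            return candidate


def suggest_chunks(dims, preferred=256):
    """Per-axis chunk lengths that divide each dimension, capped at `preferred`."""
    return tuple(largest_divisor_at_most(d, preferred) for d in dims)


# The hurricane volume discussed in this thread is 500x500x100.
print(suggest_chunks((500, 500, 100)))  # (250, 250, 100)
```

For a volume whose axes are multiples of 256, such as 512x512x512, this returns the 256x256x256 default unchanged.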

clyne commented 2 years ago

@shaomeng are you sure you want a GPL license for SPERR? You might want to consult NCAR's recommendations for open source licenses here.