Open jodavies opened 1 week ago
Perhaps (lib)zstd(-dev)
can be installed easily by the distribution's package manager (or possibly via tools like vcpkg). So, we can handle it similarly to gmp
/mpfr
/zlib
.
For zlibWrapper, we could use a submodule, which would then be copied into the tarball distribution.
Example: https://github.com/tueda/form/tree/feat-zstd-1 (but I don't have a suitable benchmark test for this).
Yes, that looks good. At least for me, using the #define
before the header include to enable zstd doesn't work properly. But we can call ZWRAP_useZSTDcompression(1);
(or ZWRAP_useZSTDcompression(0);
) in the code somewhere. Using this we can just add a sub-key like On compress, zstd;
.
I would say that if compiled with zstd support, use of zstd should be the default. But then one can specify On compress, gzip;
for easy benchmarking.
Do you want to implement the On/Off statement also, or I can do it and make a PR into your branch?
Please go ahead. On compress, zstd
sounds good. My implementation is experimental, so feel free to use any part of it in your branch if you find it useful.
Looking through the code for where I need to add things, I noticed that there is https://www.nikhef.nl/~form/maindir/documentation/reference/online/online.html#SECTION008220000000000000000 . Either this needs to also have the zstd option, or we can remove this form FORM 5.
Presumably "4.3.2" will not have zstd support, right? That one really should be bug-fixes only on top of 4.3.1.
Either this needs to also have the zstd option, or we can remove this form FORM 5.
The manual doesn't mention Compress gzip
, though it works while Compress zstd
gives
test.frm Line 1 --> Unknown option: zstd, on, off or gzip expected
Maybe we don't need to touch the Compress
statement, which is only for backward compatibility.
If we really want to remove such obsolete commands, perhaps it is good to first introduce deprecated warnings for them.
Presumably "4.3.2" will not have zstd support, right? That one really should be bug-fixes only on top of 4.3.1.
Right. I think it's good to keep the 4.3 branch limited to bug fixes (or at most "behaviour changes" for defect resolution), so the additional support for zstd should be excluded.
Either this needs to also have the zstd option, or we can remove this form FORM 5.
The manual doesn't mention
Compress gzip
, though it works whileCompress zstd
givestest.frm Line 1 --> Unknown option: zstd, on, off or gzip expected
Maybe we don't need to touch the
Compress
statement, which is only for backward compatibility.If we really want to remove such obsolete commands, perhaps it is good to first introduce deprecated warnings for them.
Presumably "4.3.2" will not have zstd support, right? That one really should be bug-fixes only on top of 4.3.1.
Right. I think it's good to keep the 4.3 branch limited to bug fixes (or at most "behaviour changes" for defect resolution), so the additional support for zstd should be excluded.
OK, we can add the zstd support to the Compress
statement here also. The error message is a little confusing actually: on first glance I wondered how CoCompress already knew about the possibility of a "zstd" option...
I ran a few tests, using @magv 's suggestion of using https://github.com/facebook/zstd/tree/dev/zlibWrapper to use Zstd while making almost no changes to FORM's code. You need to include the wrapper's header instead of
zlib.h
, and call a function to enable the use of zstd.For the test I link against a library as compiled by https://github.com/magv/hepware . We could do this, or alternatively compile zstd in with FORM (by setting it up as a git submodule or by directly including the code in this repository -- it is 3-clause BSD, I think that is allowed).
My first test was a simple program which generates lots of terms in a single module which all cancel in the final sort; no scratch files are created, only sort files. This runs a few percent faster with zstd, and uses a few percent less disk space.
I also ran a mincer test (an N=11 moment of a single diagram for "graviton-exchange DIS" -- this is an old benchmark I have lying around from Andreas). This uses ~12GB of disk (mostly the uncompressed scratch files) but runs 8% faster.
All these tests ran in tmpfs, so disk performance was never a bottleneck. On a slow disk the better compression ratio (say, 5%?) will help as well.
This seems like an easy improvement -- we just need to decide what the best way to include it in the build is. Any thoughts?