Open ax3l opened 8 years ago
Note: I am working on LZ4 support for ADIOS.
@psychocoderHPC also has a https://github.com/Blosc/c-blosc implementation in the works that is even multi-threaded, which suits applications like ours that have many host CPU cores available but only use one to control the GPU.
Just to document our offline communication: based on a hint by @JulianKunkel at SC16, we should investigate adding LZ4 as a transform for compression.
Single-thread performance for LZ4 fast 8 (v1.7.3) looks impressive, and for GPU-only applications we could even try a threaded version on the host side (e.g., 7/15 threads on Titan or 15/30 threads on Piz Daint).
Compared to the last benchmarks @psychocoderHPC and I did on Titan for particle-in-cell data, this >10x increase in compression speed could already lead to a break-even between compression overhead and saved I/O time.
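
To make the break-even argument concrete, here is a rough sketch of the estimate. It assumes compression and the write happen back to back with no overlap, and that one number each describes the compressor throughput and the file-system bandwidth. The function names and the numbers in the usage note are hypothetical, not measurements from Titan or Piz Daint.

```python
def write_time(size, io_bandwidth, compress_throughput=None, ratio=1.0):
    """Estimated time to get `size` bytes to disk.

    Assumption: compression and I/O are serialized (no overlap).
    Without a compressor, only the raw write time counts.
    """
    if compress_throughput is None:
        return size / io_bandwidth
    # Time to compress the full input plus time to write the smaller output.
    return size / compress_throughput + size / (ratio * io_bandwidth)


def break_even_throughput(io_bandwidth, ratio):
    """Compression throughput at which compressing costs as much as it saves.

    Derived from size/C + size/(ratio*B) = size/B, which gives
    C = B * ratio / (ratio - 1). Data that does not shrink (ratio <= 1)
    can never break even, so infinity is returned.
    """
    if ratio <= 1.0:
        return float("inf")
    return io_bandwidth * ratio / (ratio - 1.0)
```

For example, with a hypothetical 1 GB/s of I/O bandwidth per node and a compression ratio of 2, `break_even_throughput(1.0, 2.0)` gives 2 GB/s: the compressor (single-threaded or summed over host threads) has to exceed that rate before compression saves wall-clock time. If compression can be overlapped with I/O or with GPU compute, the requirement is lower than this serialized estimate.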