DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

Feature request: Add support for xz and zstd #412

Open outpaddling opened 1 year ago

outpaddling commented 1 year ago

xz compresses far better than gzip (up to 40% better for many sequence files), and decompression is about as fast. Compression is slow, though, so I only recommend it for files that are kept long term. E.g., I use xz for raw data and final results.

zstd offers compression similar to gzip while using far less CPU time. I recommend this for temporary files.

FWIW, lz4 is even faster than zstd, but does not compress as well. I'm not sure it's worth supporting, but it's something to consider. zstd's CPU time is already pretty low, so a faster codec offers little additional advantage in general.
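
To illustrate what supporting these formats could involve on the input side, here is a minimal Python sketch that picks a decompressor from a file's leading magic bytes rather than its extension. It is not hisat2 code (hisat2 is written in C++), and the names `detect_compression` and `open_reads` are hypothetical. It decodes gzip and xz using only the Python standard library; zstd and lz4 are recognized but not decoded, since that would need a library or external tool this sketch does not assume.

```python
import gzip
import lzma

# Leading magic bytes of each container format.
MAGIC = {
    "gzip": b"\x1f\x8b",
    "xz": b"\xfd\x37\x7a\x58\x5a\x00",
    "zstd": b"\x28\xb5\x2f\xfd",
    "lz4": b"\x04\x22\x4d\x18",  # lz4 frame format
}


def detect_compression(head: bytes) -> str:
    """Return the format name matching the first bytes of a file, or 'none'."""
    for name, magic in MAGIC.items():
        if head.startswith(magic):
            return name
    return "none"


def open_reads(path: str):
    """Open a sequence file as text, decompressing gzip or xz transparently.

    Hypothetical helper for illustration only: zstd and lz4 are detected
    but not decoded here.
    """
    with open(path, "rb") as fh:
        kind = detect_compression(fh.read(6))
    if kind == "gzip":
        return gzip.open(path, "rt")
    if kind == "xz":
        return lzma.open(path, "rt")
    if kind == "none":
        return open(path, "r")
    raise NotImplementedError(f"{kind} input is detected but not decoded in this sketch")
```

Sniffing the magic bytes means a misnamed file (say, an xz file saved as `.gz`) is still handled correctly, and adding another format only requires one more table entry plus a matching reader.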