j23414 closed this issue 2 years ago
zstd is extremely good at compressing sequence alignment files, especially ones like those for SARS-CoV-2 or monkeypox, where there are thousands of similar sequences with lengths >20kB.
It's as good as xz for compression ratios but much faster at compressing and uncompressing.
@j23414 I think you may have mistyped in your second example. I see you have a pacbio.fq.xz file there but may have meant pacbio.fq.zst?
Please use unix pipes:
xz -dc reads.fq.xz | minimap2 -ax map-pb ref.fa - > aln.sam
ah got it, thanks:
zstd -d -c reads.fq.zst | minimap2 -ax map-pb ref.fa - > aln.sam
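For anyone juggling several compression formats, the two pipe commands above can be generalized with a small wrapper. This is only a sketch: `decompressor_for` is a hypothetical helper name, not part of minimap2, and it assumes `zstd`, `xz` and `gzip` are on the PATH and that the script runs under sh or bash (the usage line relies on word splitting of the unquoted command substitution).

```shell
# Hypothetical helper (not part of minimap2): print the command that
# decompresses a file to stdout, chosen by the file's extension.
decompressor_for() {
    case "$1" in
        *.zst) echo "zstd -dc" ;;
        *.xz)  echo "xz -dc" ;;
        *.gz)  echo "gzip -dc" ;;
        *)     echo "cat" ;;   # assume the file is uncompressed
    esac
}

# Usage (assumes minimap2 is installed; file names are placeholders):
#   reads=reads.fq.zst
#   $(decompressor_for "$reads") "$reads" | minimap2 -ax map-pb ref.fa - > aln.sam
```

The helper only picks the decompressor; minimap2 still reads plain FASTQ from stdin via `-`, exactly as in the commands above.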
I see that minimap2 accepts gzip-compressed files:
Wondered if there were any plans to support zstd-compressed files? Sorry if this is already a feature and I missed it somehow. For a different project, zstd looked very fast in compression benchmarks.
from https://github.com/nextstrain/ncov-ingest/issues/341
Just checking, no pressure, I didn't find an answer with a quick search through issues.