lh3 / seqtk

Toolkit for processing sequences in FASTA/Q formats
MIT License

Seqtk trimfq used on files in .gz format returns much larger files but still .gz #184

Closed bmillerlab closed 2 years ago

bmillerlab commented 2 years ago

I know that if seqtk re-compresses its output with a different gzip, the file size can change, but I'm seeing differences like 436 MB growing to 2.2 GB. This does not seem correct. I guess I will decompress the files with gunzip and try re-compressing them, but I wanted to check about this oddity first.

Also, a separate question: would it be hard to support wildcards in file names, or to take an entire folder as input (by folder name) and write the output to a new folder?

Thanks.

lh3 commented 2 years ago

seqtk doesn't compress output. You need to pipe to gzip.
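
For readers landing here, a minimal sketch of what piping to gzip could look like, combined with a shell loop that covers the folder-in, folder-out question. The function name `trim_dir`, the variables `in_dir` and `out_dir`, and the `*.fastq.gz` naming pattern are assumptions made for illustration; none of them are part of seqtk.

```shell
#!/bin/sh
# Single file: seqtk writes plain FASTQ to stdout, so compress it explicitly:
#   seqtk trimfq in.fastq.gz | gzip > out.fastq.gz

# Hypothetical helper (not part of seqtk): trim every *.fastq.gz in one folder
# and write gzipped results, under the same file names, to another folder.
trim_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$in_dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when the folder has no matching files.
        [ -e "$f" ] || continue
        # seqtk reads gzipped input but emits uncompressed text; gzip restores compression.
        seqtk trimfq "$f" | gzip > "$out_dir/$(basename "$f")"
    done
}
```

Calling `trim_dir raw_reads trimmed_reads` (both folder names are placeholders) would then leave one compressed output per input file. Letting the shell expand the wildcard keeps seqtk itself working on one file at a time, which is how the tool is used throughout its README.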