lh3 / seqtk

Toolkit for processing sequences in FASTA/Q formats
MIT License

seqtk sample runs forever if input file does not exist #108

Closed lowandrew closed 6 years ago

lowandrew commented 6 years ago

Hello,

I've found what I assume is unintended behavior in the seqtk sample command. If I give seqtk sample the name of a file that doesn't exist (because of a typo, or because a previous step in a pipeline failed), seqtk uses a full CPU until I kill the process. I would expect it to fail with a message telling me that the input file does not exist.

Example: seqtk sample nonexistent_fastq_file.fastq.gz 1000 > subsample_fastq.fastq.gz

This is with seqtk version 1.2-r94.

Thanks!

lh3 commented 6 years ago

Could you try the latest version from github? I think this issue has been addressed at some point. At least on my laptop:

./seqtk sample no-file.fq 10

gives an error

[E::stk_sample] failed to open the input file/stream.

lowandrew commented 6 years ago

The latest GitHub version (1.2-r102-dirty) does quit if the file doesn't exist. I had installed via conda, so it looks like the conda package needs to be updated. Thanks!

lh3 commented 6 years ago

Ah, it is actually my responsibility to tag a new release. The common practice in bioconda (and in general) is to use a tagged release, not some arbitrary git checkout. I will do that later...
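
For pipelines still pinned to an older build such as 1.2-r94, one possible workaround is to check the input before seqtk is ever started. The following is a minimal sketch, not part of seqtk: the function names `require_readable` and `safe_sample` are hypothetical, and the guard only covers regular files on disk.

```shell
#!/bin/sh
# Hypothetical helper (not part of seqtk): fail fast when an input file is
# missing or unreadable, so an older seqtk build is never started on it.
require_readable() {
    if [ ! -f "$1" ] || [ ! -r "$1" ]; then
        printf 'error: cannot read input file: %s\n' "$1" >&2
        return 1
    fi
}

# Hypothetical wrapper: check the input first, then run the real command.
# Arguments: $1 = input FASTA/Q file, $2 = number of reads to sample.
safe_sample() {
    require_readable "$1" && seqtk sample "$1" "$2"
}
```

With these definitions loaded, `safe_sample nonexistent_fastq_file.fastq.gz 1000 > subsample.fastq` prints the error to stderr and returns a non-zero status without launching seqtk, so a pipeline running under `set -e` stops at that step instead of hanging.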