hisat2_read_statistics.py fails to infer FASTQ input

In HISAT2 v2.2.0, hisat2_read_statistics.py infers the input read file format from the filename extension: after stripping a possible compression extension, it checks for "fq" or "fastq" and, if neither is present, falls back to reading the file as if it were in FASTA format. I think it should get the input format from the calling scripts instead of inferring it, because a wrong guess produces wrong read statistics and can make downstream programs fail (e.g. hisat2-align-s), as happened to me.
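To make the failure mode concrete, here is a minimal sketch of extension-based inference as I understand it from the behaviour described above. It is not the actual code of hisat2_read_statistics.py: the function name, the set of compression extensions, and the example filenames are my own assumptions.

```python
import os

# Assumed sets, chosen for illustration only; the real script may differ.
COMPRESSION_EXTS = {".gz", ".bz2"}
FASTQ_EXTS = {".fq", ".fastq"}


def infer_format_from_name(path):
    """Guess the read format from the filename alone (hypothetical helper)."""
    root, ext = os.path.splitext(os.path.basename(path))
    # Strip one compression extension, if present.
    if ext.lower() in COMPRESSION_EXTS:
        root, ext = os.path.splitext(root)
    # Anything that does not end in .fq/.fastq is silently treated as FASTA.
    return "fastq" if ext.lower() in FASTQ_EXTS else "fasta"


print(infer_format_from_name("sample_1.fq.gz"))          # fastq
print(infer_format_from_name("sample_1.fq.P.qtrim.gz"))  # fasta
```

The second file holds FASTQ records, but because the last extension left after removing ".gz" is ".qtrim", a check of this kind classifies it as FASTA.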

Note that with some read preprocessing tools, e.g. Trimmomatic, you can end up with output filenames that do not end in *.fq.gz or similar (e.g. SRR445016_1.fq.P.qtrim.gz), and you may pass those files to HISAT2. However,

in the hisat2 help, no constraint is specified for read filenames,
the hisat2 default is to read FASTQ (-q option), so why does hisat2_read_statistics.py guess the file format? and, most importantly,
hisat2_read_statistics.py gives no error or warning, so you end up with an inexplicable "--read-lengths arg must be at least 20" error when hisat2-align-s runs.
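As a rough illustration of what I am suggesting, the sketch below lets the caller state the format explicitly (hisat2 already knows it from -q or -f) and only guesses, with a warning on stderr, when nothing was declared. Everything here is hypothetical: the function name, the declared_format parameter, and the warning text are mine, and I have not checked how the hisat2 wrapper actually invokes the statistics script.

```python
import sys


def resolve_read_format(path, declared_format=None):
    """Return 'fastq' or 'fasta' (hypothetical helper, not HISAT2 code).

    declared_format is what the calling script already knows from the
    user's options; the filename is only used as a fallback.
    """
    if declared_format is not None:
        if declared_format not in ("fastq", "fasta"):
            raise ValueError("unknown read format: %r" % (declared_format,))
        return declared_format

    # Fallback: extension-based guess, as in the behaviour described above.
    name = path.lower()
    for ext in (".gz", ".bz2"):  # assumed compression extensions
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    if name.endswith((".fq", ".fastq")):
        return "fastq"

    # Make the guess visible instead of failing silently later on.
    sys.stderr.write(
        "Warning: cannot tell the read format from '%s'; assuming FASTA\n" % path
    )
    return "fasta"


# The caller passes the format, so the odd filename no longer matters.
print(resolve_read_format("SRR445016_1.fq.P.qtrim.gz", declared_format="fastq"))
```

With something along these lines, the file from my case would be read as FASTQ whenever the caller declares it, and a wrong guess would at least leave a visible warning before hisat2-align-s is started.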

Hope it helps

DaehwanKimLab / hisat2

hisat2_read_statistics.py fails to infer FASTQ input #255