bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

Paired end fastq queries exit with "Error: Unequal number of sequences in paired read files." #735

Status: Open. Opened by ryandkuster 9 months ago.

ryandkuster commented 9 months ago

When running blastx using paired end fastq-formatted reads as query input:

diamond blastx -d reference.dmnd -o results.tsv -q R1.fastq R2.fastq

...the job fails with the message:

Error: Unequal number of sequences in paired read files.

This error occurs with any dataset, whether or not the files are compressed. Even a pair of test files containing a single R1 read and a single R2 read triggers it. I have reproduced it with the conda package 2.1.8 and with the 2.1.8 and 2.1.0 binaries. The 2.0.15 binary works with exactly the same command, so the regression appears to have been introduced after 2.0.15, at or before the 2.1.0 release.
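
For anyone who wants to rule out a genuine mismatch before attributing the error to diamond, here is a minimal sketch of a read-count check. It assumes uncompressed FASTQ files with strict 4-line records; the function names `count_fastq_reads` and `check_pair_counts` are hypothetical helpers, not part of diamond.

```shell
# Count reads in an uncompressed FASTQ file.
# Assumption: every record spans exactly 4 lines (no wrapped sequences).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Print both counts and return success (0) only if they are equal.
check_pair_counts() {
    n1=$(count_fastq_reads "$1")
    n2=$(count_fastq_reads "$2")
    echo "$1: $n1 reads, $2: $n2 reads"
    [ "$n1" = "$n2" ]
}
```

Calling `check_pair_counts R1.fastq R2.fastq` prints the two counts and returns a non-zero status when they differ. If the counts match and diamond still reports unequal numbers, the problem is unlikely to be in the input files.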

I've seen workarounds that merge the paired-end reads before alignment, but merging is not always ideal when the reads don't overlap well. Thank you!

bbuchfink commented 9 months ago

OK, it looks like this needs to be fixed. Note, however, that passing two files like this is no different from aligning each file separately: the pairing information is not really used.
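
Going by the comment above, a possible workaround until a fix lands is to align each file on its own. The following is only a sketch: `blastx_each_file` and the `DIAMOND` override variable are hypothetical names introduced for illustration, and the output naming (one `.tsv` per query file) is an assumption. Only the `-d`, `-q`, and `-o` options from the original command are used.

```shell
# Run "diamond blastx" once per query file instead of passing both files to -q.
# Usage: blastx_each_file <database> <query> [<query> ...]
# DIAMOND (hypothetical) overrides the executable; it defaults to "diamond".
blastx_each_file() {
    db=$1
    shift
    for query in "$@"; do
        base=$(basename "$query")
        out="${base%.*}.tsv"    # e.g. R1.fastq -> R1.tsv
        "${DIAMOND:-diamond}" blastx -d "$db" -q "$query" -o "$out" || return 1
    done
}
```

For the files in this report, `blastx_each_file reference.dmnd R1.fastq R2.fastq` would write R1.tsv and R2.tsv in the current directory. The two tables can be concatenated afterwards if a single results file is needed.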

deyvidamgarten commented 3 months ago

Hello guys,

Same issue here, even though my FASTQ files are properly paired. I am using version 2.1.9.163.

Thank you