linsalrob / fastq-pair

Match up paired end fastq files quickly and efficiently.
https://edwards.flinders.edu.au/sorting-and-paring-fastq-files/
MIT License

Pair mismatched when read name is close #2

Closed by Fatmice 7 years ago

Fatmice commented 7 years ago

There is a read in R2 named NS500207:121:HTFVJAFXX:1:11111:13010:1958. Later in the same file there is another read named NS500207:121:HTFVJAFXX:1:11111:13010:19581.

R1 only has a read named NS500207:121:HTFVJAFXX:1:11111:13010:1958

The tool mistakenly writes NS500207:121:HTFVJAFXX:1:11111:13010:1958 into R1.paired.fq a second time, as though R1 contained a mate for the R2 read ending in 19581.

This leads to mismatches in tools that require absolute name matching for read merging, such as usearch.
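To make the failure mode concrete, here is a minimal Python sketch of one way such a collision can arise. It is not the fastq-pair source (which is written in C), and the cause shown is only an assumption: a normalisation step that drops any trailing 1 or 2 from a read name, on the theory that it marks R1/R2. The helpers naive_key, strict_key and pair_names are hypothetical names made up for this illustration.

```python
def naive_key(name: str) -> str:
    """Hypothetical buggy normalisation: drop ANY trailing '1' or '2'."""
    return name[:-1] if name.endswith(("1", "2")) else name


def strict_key(name: str) -> str:
    """Drop the mate marker only when it is written as '/1' or '/2'."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def pair_names(r1_names, r2_names, key):
    """Return (r1_name, r2_name) for every R2 read whose key is found in R1."""
    r1_index = {key(name): name for name in r1_names}
    return [(r1_index[key(name)], name)
            for name in r2_names if key(name) in r1_index]


# Read names taken from the report above.
PREFIX = "NS500207:121:HTFVJAFXX:1:11111:13010:"
r1 = [PREFIX + "1958"]
r2 = [PREFIX + "1958", PREFIX + "19581"]

naive_pairs = pair_names(r1, r2, naive_key)    # ...:19581 collapses to ...:1958
strict_pairs = pair_names(r1, r2, strict_key)  # ...:19581 stays distinct

print(len(naive_pairs), len(strict_pairs))
```

With the naive key, the R1 read ending in 1958 is emitted twice, once for each R2 read, which is the symptom described above. With the strict key only the true pair is reported.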

linsalrob commented 7 years ago

I have corrected the way that R1/R2 are detected and that will solve this problem. I have also added a new test data set.

Fatmice commented 7 years ago

Thank you. I've tested it on my own data set and it seems good.
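For anyone wanting to run a similar check on their own output, the sketch below compares read names record by record across two paired files and reports the first disagreement. It is an illustration only, not the test used in this thread. It assumes well-formed four-line FASTQ records and headers where the read name is the first whitespace-delimited token; read_names and first_mismatch are hypothetical helper names.

```python
import io
from itertools import zip_longest


def read_names(handle):
    """Yield the read name from each 4-line FASTQ record."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 0:
            # Header looks like '@name optional description'.
            fields = line[1:].split()
            yield fields[0] if fields else ""


def first_mismatch(r1_handle, r2_handle):
    """Return (record_index, r1_name, r2_name) at the first disagreement, else None."""
    pairs = zip_longest(read_names(r1_handle), read_names(r2_handle))
    for index, (r1_name, r2_name) in enumerate(pairs):
        if r1_name != r2_name:
            return index, r1_name, r2_name
    return None


# In-memory stand-ins for R1.paired.fq and R2.paired.fq showing the reported symptom.
PREFIX = "NS500207:121:HTFVJAFXX:1:11111:13010:"
r1 = io.StringIO(
    f"@{PREFIX}1958 1:N:0\nACGT\n+\nIIII\n"
    f"@{PREFIX}1958 1:N:0\nACGT\n+\nIIII\n"
)
r2 = io.StringIO(
    f"@{PREFIX}1958 2:N:0\nTTGA\n+\nIIII\n"
    f"@{PREFIX}19581 2:N:0\nTTGA\n+\nIIII\n"
)

result = first_mismatch(r1, r2)
print(result)
```

On real data the two StringIO objects would be replaced by open file handles. A return value of None means every record pairs up by exact name, which is what tools such as usearch expect.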