torognes / vsearch

Versatile open-source tool for microbiome analysis

Definition of a read pair #483

Closed Username-felix-is-not-available closed 2 years ago

Username-felix-is-not-available commented 2 years ago

Thanks a lot for developing VSEARCH! It helps me a lot with my analyses.

The answer to my question might be obvious, but I was wondering how VSEARCH decides which reads belong to a read pair and should be merged by --fastq_mergepairs. In the USEARCH documentation, I found that a read pair is defined by name and index in the FASTQ files (https://www.drive5.com/usearch/manual7/fastq_mergepairs.html). I haven't found a similar statement for VSEARCH. I did some experiments, and VSEARCH appears to ignore the read name and to consider only the read's position (index) in the FASTQ files. Is this correct?

Have a nice day, Felix

torognes commented 2 years ago

Hi, yes, only the position/index in the FASTQ files is taken into account. The name is ignored. VSEARCH will complain if the two files (R1 and R2) contain different numbers of sequences.
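
To make the pairing rule concrete, here is a minimal Python sketch of position-based pairing as described above. It is not VSEARCH's code; the function names (`read_fastq`, `pair_by_position`, `mismatched_names`) are hypothetical, and it assumes strict four-line FASTQ records and the common `/1` and `/2` mate suffixes. Because names are ignored during merging, a pre-flight check like `mismatched_names` may be useful if you want to confirm that R1 and R2 are in the same order.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples.

    Assumes strict four-line records (header, sequence, '+', quality).
    """
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of input (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        # Drop the leading '@' and keep the first whitespace-delimited token.
        name = header.rstrip("\n")[1:].split()[0]
        yield name, seq, qual


def pair_by_position(fwd_handle, rev_handle):
    """Pair the i-th forward record with the i-th reverse record.

    Names play no role in the pairing; only the record index does.
    Raises ValueError if one file has more records than the other.
    """
    missing = object()
    records = zip_longest(read_fastq(fwd_handle), read_fastq(rev_handle),
                          fillvalue=missing)
    for index, (fwd, rev) in enumerate(records, start=1):
        if fwd is missing or rev is missing:
            raise ValueError(f"files differ in length at record {index}")
        yield fwd, rev


def strip_mate_suffix(name):
    """Remove a trailing '/1' or '/2' (an assumed naming convention)."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def mismatched_names(pairs):
    """Return (forward_name, reverse_name) for pairs whose names disagree."""
    return [(fwd[0], rev[0]) for fwd, rev in pairs
            if strip_mate_suffix(fwd[0]) != strip_mate_suffix(rev[0])]


# Small in-memory demonstration: the second pair has unrelated names
# but is still paired, because pairing depends on position only.
r1 = "@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCC\n+\nIIII\n"
r2 = "@read1/2\nTTTT\n+\nIIII\n@other/2\nAAAA\n+\nIIII\n"
pairs = list(pair_by_position(io.StringIO(r1), io.StringIO(r2)))
print(len(pairs), "pairs;", "name mismatches:", mismatched_names(pairs))
```

With real data, the same functions can be applied to opened R1 and R2 files before running --fastq_mergepairs, so that any reordering or filtering that affected only one file is caught early.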

Username-felix-is-not-available commented 2 years ago

Great, thank you!

frederic-mahe commented 2 years ago

Manpage updated to cover that topic in commit f57da18a7d3df82053f279ee210ba01253648681.