tyjo / coptr

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing
GNU General Public License v3.0

Paired-end reads #5

Midnighter closed 3 years ago

Midnighter commented 3 years ago

Hi,

More of a discussion question. I was wondering if you can say a few more words about the reasoning for this line from the documentation, please?

For paired-end sequencing, it is recommended to only map reads from a single mate-pair.

tyjo commented 3 years ago

Paired reads are not independent. Using both mates of a pair adds little information for PTR estimation beyond a single read, and it increases the computational burden during mapping.

It's worth noting that other tools take a similar approach (e.g. https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0).
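As a rough illustration of mapping only one mate, the sketch below builds a bowtie2 command that treats the first-mate FASTQ file as unpaired input. The index prefix and file names are hypothetical placeholders, and this is not CoPTR's own mapping wrapper; it only uses the standard bowtie2 options `-x` (index), `-U` (unpaired reads), `-S` (SAM output), and `-p` (threads).

```python
def single_mate_bowtie2_command(index_prefix, mate1_fastq, sam_out, threads=1):
    """Build a bowtie2 command that maps only the first mate as unpaired reads.

    All arguments are hypothetical paths chosen by the caller; nothing here
    is specific to CoPTR.
    """
    return [
        "bowtie2",
        "-x", index_prefix,   # prefix of a bowtie2 index built beforehand
        "-U", mate1_fastq,    # first-mate reads, passed as unpaired input
        "-S", sam_out,        # write alignments in SAM format
        "-p", str(threads),   # number of alignment threads
    ]


if __name__ == "__main__":
    cmd = single_mate_bowtie2_command("ref_index", "sample_1.fastq.gz", "sample.sam", threads=4)
    # Print the command; it could be run with subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Passing the list to `subprocess.run` avoids shell quoting issues with file names. The second-mate file is simply never given to the mapper.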

Midnighter commented 3 years ago

That's certainly intuitive with regard to PTR. I was wondering if having the pairs would help with "placing" the reads, though, when mapping against many genomes?

tyjo commented 3 years ago

I'm not sure using paired reads will help during the mapping step, because reads are mapped to representative genomes from 95% ANI clusters. Consequently, we often aren't mapping reads to the exact strain in the sample, so it is unclear how much the pairing information improves mapping accuracy. Insertions or deletions in a strain relative to the reference could also violate the assumptions that the read mapper relies on to place paired reads accurately, such as the expected distance between mates.

That being said, you can map paired reads with bowtie2 and pass them to CoPTR. CoPTR will extract one read per mate-pair for downstream steps.
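To make "one read per mate-pair" concrete, here is a minimal sketch that filters SAM records using the standard FLAG bits (0x1 paired, 0x4 unmapped, 0x80 second in pair). It is an assumption about how such a filter could look, not CoPTR's actual implementation, and the function name and example records are made up for illustration.

```python
PAIRED = 0x1          # read was sequenced as part of a pair
UNMAPPED = 0x4        # read did not align
SECOND_IN_PAIR = 0x80 # read is the second mate of its pair


def one_read_per_pair(sam_lines):
    """Yield (read_name, reference, position) for at most one mate per pair.

    Keeps unpaired reads and first mates; drops second mates and unmapped reads.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & UNMAPPED:
            continue
        if flag & PAIRED and flag & SECOND_IN_PAIR:
            continue
        yield fields[0], fields[2], int(fields[3])


# Hypothetical records: a mapped pair (flags 99 and 147), an unpaired read (0),
# and an unmapped first mate (77).
EXAMPLE_SAM = "\n".join([
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t99\tgenomeA\t100\t42\t4M\t=\t300\t204\tACGT\tIIII",
    "r1\t147\tgenomeA\t300\t42\t4M\t=\t100\t-204\tACGT\tIIII",
    "r2\t0\tgenomeB\t50\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
])

if __name__ == "__main__":
    for record in one_read_per_pair(EXAMPLE_SAM.splitlines()):
        print(record)
```

One limitation of this simple rule: if the first mate fails to align but the second mate does, the pair contributes no read at all. A more careful filter could fall back to the second mate in that case.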

Midnighter commented 3 years ago

Awesome, thank you for your explanation.