DessimozLab / read2tree

a tool for inferring species tree from sequencing reads
MIT License

QUESTION: Hetero sites. #53

Open mudymudy opened 7 months ago

mudymudy commented 7 months ago

Hi there,

I have a question about how the aligner handles sites where only two reads span the reference marker gene and there is some level of contamination. In other words, I'm not sure how the alignment proceeds when two different bases (heterogeneity) occur at the same position. Will the aligner pick one at random, or ignore the site and emit a ?/N character? The same question applies to similar situations, such as when there are 3 reads and one of them contains contaminated bases or sequencing errors. I guess the best approach would be to set a higher coverage threshold, but if that's not possible I would like to know how the aligner behaves in these cases.
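To make the cases above concrete, here is a minimal Python sketch of the behaviours I am asking about. It is not read2tree's code and makes no claim about what the tool actually does; the function `call_site`, the `min_coverage` parameter and the `policy` names are all hypothetical, made up only to illustrate the options.

```python
from collections import Counter


def call_site(bases, min_coverage=2, policy="mask"):
    """Hypothetical consensus call for a single site.

    bases        -- bases observed in the reads covering this site
    min_coverage -- sites with fewer reads than this are masked as "N"
    policy       -- how to resolve a tie between the most frequent bases:
                    "mask"  -> emit "N"
                    "first" -> pick the alphabetically first tied base
    """
    if policy not in ("mask", "first"):
        raise ValueError(f"unknown policy: {policy!r}")

    # Too few reads: mask the site regardless of what they say.
    if len(bases) < min_coverage:
        return "N"

    counts = Counter(bases)
    top_count = max(counts.values())
    tied = sorted(base for base, n in counts.items() if n == top_count)

    # A single most frequent base wins (e.g. 2 of 3 reads agree).
    if len(tied) == 1:
        return tied[0]

    # Two or more bases are equally supported (e.g. 2 reads, 2 bases).
    return "N" if policy == "mask" else tied[0]


two_reads_masked = call_site(["A", "G"], policy="mask")
two_reads_picked = call_site(["A", "G"], policy="first")
three_reads = call_site(["A", "A", "G"])
low_coverage = call_site(["A", "G"], min_coverage=3)
```

With two disagreeing reads, the first policy would mask the site and the second would keep one of the two bases; with three reads where one disagrees, the majority base would be kept; with a higher coverage threshold the two-read site would be masked either way. Which of these (if any) matches the actual behaviour is what I would like to know.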

Any help or guidance will be much appreciated!