snayfach / MIDAS

An integrated pipeline for estimating strain-level genomic variation from metagenomic data
http://dx.doi.org/10.1101/gr.201863.115
GNU General Public License v3.0

Strain tracking: identifying rare SNPs that discriminate individual strains #105

Open dadahan opened 5 years ago

dadahan commented 5 years ago

I'm curious about step 1 of the strain tracking process, in particular "Identify[ing] SNPs (particular nucleotide at a genomic site) that rarely occur in different unrelated samples". Does "unrelated" here mean unrelated individuals (e.g., no twins, no siblings), unrelated samples (e.g., no longitudinal samples), or both, with the aim of increasing the probability of identifying rare variants? Specifically, for mother-infant strain tracking, would you put only unrelated mother samples into step 1 and then include the mothers and their infants in step 2? Or would you include a cross-sectional subset of mothers and infants in step 1 and then all samples in step 2? Thank you.
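
To make the two designs concrete, here is a toy sketch of how I understand the steps. It is plain Python, not MIDAS code, and everything in it is an assumption for illustration: the sample names, the alleles, the `max_samples` threshold, and the helper functions `rare_alleles` and `shared_markers` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: sample -> set of (genomic site, allele) pairs observed.
# All names and values are made up; none of this comes from MIDAS output.
alleles = {
    "mother_A": {(101, "T"), (250, "G"), (377, "C")},
    "infant_A": {(101, "T"), (250, "G")},
    "mother_B": {(101, "A"), (250, "G"), (412, "T")},
    "infant_B": {(412, "T"), (250, "G")},
    "mother_C": {(101, "A"), (250, "G"), (530, "A")},
}


def rare_alleles(discovery_samples, max_samples=1):
    """Step 1 as I read it: keep alleles seen in at most `max_samples`
    of the discovery samples (the threshold is an assumption)."""
    counts = defaultdict(int)
    for sample in discovery_samples:
        for allele in alleles[sample]:
            counts[allele] += 1
    return {allele for allele, n in counts.items() if n <= max_samples}


def shared_markers(markers, sample_1, sample_2):
    """Step 2 as I read it: marker alleles present in both samples."""
    return markers & alleles[sample_1] & alleles[sample_2]


# Design 1: discovery set contains only the unrelated mothers.
markers_mothers = rare_alleles(["mother_A", "mother_B", "mother_C"])

# Design 2: discovery set contains every sample, including related pairs.
markers_all = rare_alleles(list(alleles))

print(shared_markers(markers_mothers, "mother_A", "infant_A"))
print(shared_markers(markers_all, "mother_A", "infant_A"))
```

In this toy data, design 1 keeps `(101, "T")` as a marker shared by `mother_A` and `infant_A`. In design 2 the same allele is counted twice in step 1 because the related pair both carry it, so it is no longer "rare" and the pair shares no markers. That is the behavior I am unsure about in the real pipeline.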