Strain tracking: identifying rare SNPs that discriminate individual strains

I'm curious about step 1 of the strain tracking process, in particular "Identify[ing] SNPs (particular nucleotide at a genomic site) that rarely occur in different unrelated samples". Does "unrelated" here mean unrelated individuals (e.g., no twins, no siblings), unrelated samples (e.g., no longitudinal samples from the same person), or both, so as to increase the probability of identifying rare variants? Specifically, for mother-infant strain tracking, would you put only unrelated mother samples into step 1 and then include the mothers and their infants in step 2? Or would you include a cross-sectional subset of mothers and infants in step 1 and then all samples in step 2? Thank you.
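To make the question concrete, here is a minimal Python sketch of how I understand step 1. It is only my assumption of the idea, not MIDAS's actual implementation: the function `rare_alleles`, the `max_samples` threshold, and the sample data are all hypothetical. It counts how many training samples carry each (site, nucleotide) pair and keeps the pairs seen in at most `max_samples` of them.

```python
from collections import Counter


def rare_alleles(sample_alleles, max_samples=1):
    """Return the (site, nucleotide) pairs found in at most `max_samples` samples.

    `sample_alleles` maps a sample name to the set of alleles observed in it.
    Hypothetical helper for illustration; not part of MIDAS.
    """
    counts = Counter()
    for alleles in sample_alleles.values():
        counts.update(set(alleles))  # each sample contributes at most once per allele
    return {allele for allele, n in counts.items() if n <= max_samples}


# Hypothetical training set of unrelated mothers.
training = {
    "mother_A": {(101, "A"), (250, "T"), (777, "G")},
    "mother_B": {(101, "A"), (250, "C"), (900, "T")},
    "mother_C": {(101, "A"), (250, "C"), (512, "G")},
}
markers = rare_alleles(training)

# Same training set plus an infant who shares mother_A's strain.
with_infant = dict(training, infant_A={(101, "A"), (250, "T"), (777, "G")})
markers_with_infant = rare_alleles(with_infant)

print(sorted(markers))
print(sorted(markers_with_infant))
```

In this toy example, adding the related infant to the training set makes mother_A's alleles look less rare, so they drop out of the marker set. That is the effect I am worried about when choosing which samples go into step 1.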

snayfach / MIDAS, issue #105