snayfach / MIDAS

An integrated pipeline for estimating strain-level genomic variation from metagenomic data
http://dx.doi.org/10.1101/gr.201863.115
GNU General Public License v3.0

Strain tracking and "marker alleles" #98

Open VadimDu opened 5 years ago

VadimDu commented 5 years ago

Dear developers/users,

I have a question regarding the strain_tracking script and the concept behind it. From what I understand from the documentation and the paper, the first step is to identify rare SNPs that discriminate between strains from unrelated individuals, and the second step is to use these rare SNPs to track strains between samples from related individuals.

I ran the script on my data as described, but I am having a hard time interpreting the output. In the output file (species_id.marker_sharing), is the "count_both" column the number of shared marker alleles? In the paper you defined a transmission event as >5% of marker alleles shared between a pair of samples. How can I calculate this for my data based on the script output? For example, I have 13 in the "count_both" column for one species, from 2 related samples taken just 2 weeks apart. Isn't this value too low? (This species has very high coverage.)

I would appreciate your insight on this.

P.S. Thanks for the great software and paper!

Best regards,
Vadim Dubinsky

dadahan commented 4 years ago

Bumping this question.