ohlab / SMEG

Strain-level Metagenomic Estimation of Growth rate (SMEG) measures growth rates of microbial strains from complex metagenomic datasets

Get number of unique SNPs identified for each strain #7

Closed ShriramHPatel closed 4 years ago

ShriramHPatel commented 4 years ago

Hi,

I was able to generate a reference-based SMEG database using the -e flag in the smeg build_species command. I have a few related questions.

A) Is it possible to get the number of unique SNPs (and their positions) identified for each strain after generating the species database?

B) In the next step, growth estimation, we are supposed to provide the list of strains (via the -g flag) whose growth rates we want to measure from metagenomic samples. In that case, does the number of SNPs used for growth rate measurement (with the -m 1 flag) depend on the strains included in the -g argument, or is it always the same for a given strain in the database, regardless of which strains are passed to -g? To make this clearer: if I pass 1 strain versus 2 strains to -g, will the number of SNPs used for replication rate measurement differ, that is, is it relative to the strains included in the -g argument?

Please let me know if I am not clear enough or if my understanding is incorrect.

Regards,

Shriram

aemiol commented 4 years ago

A. No, because unique SNPs are determined only from the strains supplied with the -g flag, so the SNPs depend on the strains of interest.

ShriramHPatel commented 4 years ago

Thanks for your swift reply.

So in a scenario where the strain of interest is just a single strain, would you recommend reference-based estimation or de novo estimation?

I ask because, in reference-based estimation with only one strain passed to the -g flag, I noticed a very high number of SNPs used as non-zero SNP sites for growth rate estimation (in the output file from growth estimation).

aemiol commented 4 years ago

If you pass a single strain with the -g flag, every nucleotide in the core genes will be used to estimate growth rate.

ShriramHPatel commented 4 years ago

Thanks again for your helpful reply.