danielpodlesny / samestr

SameStr identifies shared strains between pairs of metagenomic samples based on the similarity of SNV profiles.
GNU Affero General Public License v3.0

How does SameStr work? #6

Closed: 123chenshixin closed this 1 year ago

123chenshixin commented 1 year ago

I have read your publication and looked at your work. However, I would like to know why SameStr extracts MetaPhlAn2 marker sequences from reference genomes. It seems that users do not need to extract marker regions to detect shared strains in metagenomic samples. I'm a novice in this field and would sincerely appreciate a patient reply.

danielpodlesny commented 1 year ago

Hi @123chenshixin,

You are correct: the samestr extract command was added to additionally enable strain comparisons with sequences from isolate genomes or metagenome-assembled genomes. samestr extract is not necessary for detecting shared strains between metagenomic samples. For that, follow the recommendations in the docs and run (at least) the following commands, in this order:

  1. samestr convert
  2. samestr merge
  3. samestr filter
  4. samestr compare
  5. samestr summarize

Make sure to set up your marker database with samestr db beforehand, and follow the instructions for compatibility with MetaPhlAn 3 and 4.
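
For readers who want to script this, the command order can be sketched roughly as follows. This is only an illustration and not part of SameStr itself: the helper names (ordered_subcommands, show_help_for_each_step) are hypothetical, and the sketch assumes that every subcommand accepts --help. The per-step arguments are deliberately left out, since they depend on your data and on the docs.

```python
import shutil
import subprocess

# Subcommand order as listed in this thread; `db` is the one-time setup step.
SETUP_STEP = "db"
PIPELINE_STEPS = ["convert", "merge", "filter", "compare", "summarize"]


def ordered_subcommands(include_setup=True):
    """Return the samestr subcommands in the order they should be run."""
    steps = list(PIPELINE_STEPS)
    return [SETUP_STEP] + steps if include_setup else steps


def show_help_for_each_step():
    """Print `samestr <step> --help` for each step, if samestr is installed.

    Assumption: each subcommand accepts --help. Real runs need per-step
    arguments from the docs, which are intentionally not guessed here.
    """
    if shutil.which("samestr") is None:
        # Fall back to printing the planned order when the CLI is absent.
        print("samestr not found on PATH; planned order:", ordered_subcommands())
        return
    for step in ordered_subcommands():
        # check=False: a failing help call should not abort the loop.
        subprocess.run(["samestr", step, "--help"], check=False)
```

Calling show_help_for_each_step() prints the usage text for each step in sequence, which is a convenient way to look up the required arguments before a real run. Note that extract does not appear in the list, because it is only needed when comparing against isolate or metagenome-assembled genomes.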

123chenshixin commented 1 year ago

Thank you for your helpful reply!