phac-nml / sistr_cmd

SISTR (Salmonella In Silico Typing Resource) command-line tool
Apache License 2.0

Serotyping a genome with mixed strains of Salmonella #46

Closed: vappiah closed this issue 3 years ago

vappiah commented 3 years ago

Hello developers,

I would like to perform serotyping on my Salmonella enterica genomes. I first used a Kraken database to classify the contigs, and this showed that each sample contained different Salmonella enterica strains.

In this case, can I still use SISTR for the serotyping and trust the result? Thanks in advance.

jrober84 commented 3 years ago

Hello, Kraken isn't good at detecting "mixed" samples of Salmonella, and having contigs assigned to multiple serotypes generally doesn't mean anything. If your assembly is between 4 and 6 Mb, then your sample is in the right size range for a single Salmonella genome. If you want to check whether there looks to be contamination in your sample, I would recommend running ConFindr on the reads: https://github.com/OLC-Bioinformatics/ConFindr
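
For anyone who wants to script the assembly size check described above, here is a minimal Python sketch. It is not part of SISTR or ConFindr. The function names are hypothetical, and the 4 to 6 Mb bounds are simply the range quoted in this reply, not a validated cutoff.

```python
def assembly_length(fasta_lines):
    """Return the total number of bases across all contigs.

    `fasta_lines` is any iterable of FASTA-formatted lines, for example
    an open file handle. Header lines (starting with '>') are skipped.
    """
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def in_single_genome_range(total_bp, low=4_000_000, high=6_000_000):
    """Check whether an assembly size falls in the assumed 4-6 Mb range.

    The default bounds are assumptions taken from the reply above.
    """
    return low <= total_bp <= high


# Example usage with a hypothetical file name:
#     with open("assembly.fasta") as handle:
#         size = assembly_length(handle)
#     print(size, in_single_genome_range(size))
```

An assembly well above this range may be worth a closer look with ConFindr on the reads, but size alone does not prove or rule out contamination.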

vappiah commented 3 years ago

Thanks @jrober84. What would you recommend: should I run ConFindr before or after trimming the raw reads?

jrober84 commented 3 years ago

You can run ConFindr on the reads without any trimming.