phac-nml / sistr_cmd

SISTR (Salmonella In Silico Typing Resource) command-line tool
Apache License 2.0

No reference genome for SISTR hit #45

vappiah closed 3 years ago

vappiah commented 3 years ago

Hi All,

I have some Salmonella enterica Typhi isolates. I used SISTR to subtype them, and the cgmlst_genome_match was reported as Legon (SRR1968150). I would like to perform reference-guided scaffolding and annotation, but unfortunately I could not find any sequence file for the genome match. The only data available is the one in the SRA. I would be grateful if you could advise on what I can do to perform these tasks. Thanks

jrober84 commented 3 years ago

Hello,

SRR1968150 is labeled as serotype Fanti, not Legon, in the SISTR database, and the NCBI record has the same identification, so you might have meant a different sample. The SISTR database includes data that we simply assembled from the SRA, so there aren't necessarily high-quality annotations for each genome. You can assemble the data from the SRA in the same way you did for your Typhi samples. Here is the FASTA for that assembled sample in case it is useful. I would be cautious about interpreting the closest cgMLST match if it is below 200 matching alleles, as it is really not a strong indicator of serotype below that level.

SRR1968150.fasta.zip

vappiah commented 3 years ago

Hi @jrober84

Please find attached the output.tab from SISTR. I would be happy to get your interpretation of it. Thanks

sistr-output2.pdf

jrober84 commented 3 years ago

Based on the biomarkers identified in this sample, SISTR thinks it matches Legon. However, only 111 cgMLST alleles match this sample perfectly, so SISTR is telling you not to trust this result without confirmation. The closest cgMLST-matching genome is not a Legon but a Fanti, but at that level of similarity no conclusion can be drawn other than that this sample is definitely divergent from what is in the database.

None of the identified biomarkers match Typhi, so if you expected Typhi based on traditional typing, there may have been some sort of sample mix-up. Some serovars can get muddled, but the antigen biomarkers here are very clearly different. This sample could potentially be a Legon, but it really should be tested with traditional phenotypic serotyping to confirm.
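
For anyone applying the same rule of thumb to their own results, here is a minimal Python sketch that flags rows of a tab-delimited SISTR output whose closest cgMLST match has fewer than 200 matching alleles. The column names `genome`, `cgmlst_genome_match`, and `cgmlst_matching_alleles` are assumptions about the output header and should be checked against your own file. The function name and the example row are hypothetical, and the 200-allele cutoff is only the guideline mentioned in this thread, not a setting read from SISTR.

```python
import csv
import io

# Assumption: cutoff taken from the guidance in this thread (200 matching alleles).
MIN_MATCHING_ALLELES = 200


def flag_weak_cgmlst_matches(tab_text, min_alleles=MIN_MATCHING_ALLELES):
    """Return (genome, closest_match, matching_alleles, trusted) for each row.

    Assumes the tab-delimited text has the columns `genome`,
    `cgmlst_genome_match` and `cgmlst_matching_alleles`.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    results = []
    for row in reader:
        matching = int(row["cgmlst_matching_alleles"])
        results.append(
            (
                row["genome"],
                row["cgmlst_genome_match"],
                matching,
                matching >= min_alleles,
            )
        )
    return results


# Hypothetical one-row example mirroring the numbers discussed above.
example = (
    "genome\tcgmlst_genome_match\tcgmlst_matching_alleles\tserovar\n"
    "my_isolate\tSRR1968150\t111\tLegon\n"
)

for genome, match, n_alleles, trusted in flag_weak_cgmlst_matches(example):
    status = "ok" if trusted else "needs confirmation"
    print(f"{genome}: closest match {match}, {n_alleles} matching alleles -> {status}")
```

To use it on a real run, read the contents of your own output file and pass the text to `flag_weak_cgmlst_matches` in place of `example`. A flagged row only means the cgMLST match is too distant to support the serovar call, so the antigen-based prediction still needs phenotypic confirmation.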

vappiah commented 3 years ago

Thanks for the info @jrober84