Status: Open. codemeleon opened this issue 4 years ago
Hi @codemeleon,
What did you use as input to snpit?
Genomes constructed by mapping PacBio sequencing data to the Mtb reference H37Rv using SMRT Link v 2.3.0.140936
So your input was a fasta file?
@philipwfowler how robust is the fasta methodology? I have only really tested the VCF input method.
Yes
Hi @codemeleon, sorry for the slow response. There could be a few reasons for this. If you can share one of the FASTA files that caused a problem, I'd be happy to take a look.
Hi,
We have constructed genomes for our 18 Mtb samples by mapping PacBio sequencing data to the Mtb reference H37Rv using SMRT Link v 2.3.0.140936. The lineages of these samples have been experimentally validated: three samples are from lineage 1, three are from lineage 2, and the remaining twelve are from lineage 4.
We also have additional samples whose lineages have not been experimentally validated. Before reporting lineages for these new samples, we decided to run snpit on our experimentally validated samples. snpit assigned fifteen samples to lineage 4 and one sample to lineage 3, and it could not report a lineage for the remaining two. A maximum-likelihood phylogeny based on the whole-genome alignment shows the samples clustering by their experimentally validated lineage.
I do not know what causes the discrepancies between the experimentally validated lineages and the snpit predictions. Could you please advise on what I might be doing wrong?
Thank you.
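
For reference, the aggregate counts reported above can be tallied to show how far the snpit calls are from the validated lineages. This is only a minimal sketch: the post gives totals per lineage rather than per-sample calls, so the comparison is at the count level, and the label names (including `unassigned` for the two samples snpit could not report) are chosen here for illustration.

```python
from collections import Counter

# Experimentally validated lineages of the 18 samples (counts from the post).
expected = Counter({"lineage1": 3, "lineage2": 3, "lineage4": 12})

# Lineages reported by snpit (counts from the post).
# "unassigned" is an illustrative label for the two samples with no call.
predicted = Counter({"lineage4": 15, "lineage3": 1, "unassigned": 2})

# Both tallies should cover the same 18 samples.
assert sum(expected.values()) == sum(predicted.values()) == 18


def count_differences(expected, predicted):
    """Return predicted minus expected sample counts for every label."""
    labels = sorted(set(expected) | set(predicted))
    # A Counter returns 0 for labels it has not seen.
    return {label: predicted[label] - expected[label] for label in labels}


differences = count_differences(expected, predicted)
for label, delta in differences.items():
    print(f"{label}: expected {expected[label]}, "
          f"predicted {predicted[label]}, difference {delta:+d}")

# Lower bound on the number of samples whose call differs from validation:
# every surplus call in some label must belong to a sample that was
# validated as something else.
min_mismatches = sum(delta for delta in differences.values() if delta > 0)
print(f"At least {min_mismatches} of 18 samples disagree with validation")
```

Because no sample was called as lineage 1 or lineage 2, all six samples validated in those lineages must be among the disagreements, which is what the lower bound reflects.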