Open rpalcab opened 2 years ago
In my experience, a k-mer size of 17 (-k 17) and a sketch size of 50000 (-s 50000) are enough to differentiate Salmonella serovars. The default sketch size of just 1000 certainly doesn't provide enough resolution for subspecies etc.
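To give a rough feel for why sketch size matters here, below is a small sketch of the arithmetic. It assumes the identity reported by Mash screen is estimated as (shared hashes / sketch size)^(1/k), which is my understanding of the containment-based estimate; the function names `screen_identity` and `one_hash_step` are made up for this illustration and are not part of Mash.

```python
def screen_identity(shared: int, sketch_size: int, k: int) -> float:
    """Assumed containment-based identity: (shared / sketch_size) ** (1 / k)."""
    if not 0 <= shared <= sketch_size:
        raise ValueError("shared must be between 0 and sketch_size")
    return (shared / sketch_size) ** (1.0 / k)


def one_hash_step(shared: int, sketch_size: int, k: int) -> float:
    """Change in estimated identity caused by one fewer shared hash."""
    return screen_identity(shared, sketch_size, k) - screen_identity(
        shared - 1, sketch_size, k
    )


if __name__ == "__main__":
    # Default-like settings (k=21, s=1000) versus the settings suggested above.
    for k, s in ((21, 1000), (17, 50000)):
        step = one_hash_step(s, s, k)
        print(f"k={k} s={s}: one shared hash changes identity by {step:.2e}")
```

Under that assumption, a single shared hash corresponds to a much coarser identity step with a sketch size of 1000 than with 50000, so two near-identical references can easily tie or differ by one hash through sampling alone.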
Hello,
I'm currently working on Mycobacterium caprae and Mycobacterium bovis. These subspecies of the M. tuberculosis complex are phylogenetically very similar, so the task of identifying them is not always trivial.
In one of my analyses, I expected all the samples to be M. caprae, but looking at the Mash screen results, I find that many of them could be assigned to either subspecies: the two references get the same shared-hashes score and p-value, or differ by just 1 in the shared-hashes score.
Sample A
Sample B
This makes me wonder whether Mash screen can identify organisms at the subspecies level. Also, is a difference of 1 in the shared-hashes score robust enough to determine the taxonomy of an organism?
Thanks in advance