Closed karynkomatsu closed 1 year ago
So the MOB-cluster identifiers indicate membership in a "cluster", which consists of one or more members. Sequences are assigned to clusters based on the lowest mash distance to a sequence in the reference database. The mash nearest neighbour tells you which sequence in the reference database has the lowest mash distance to your query, and the cluster associated with that closest match is the MOB-cluster your query sequence is assigned to. So if your sequences were 100% identical, their mash nearest neighbour would be the same, but if there are any differences, it is possible for the sequences to have different mash nearest neighbours while still landing in the same cluster, because a cluster can contain more than one reference sequence.
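To make the assignment logic concrete, here is a minimal sketch of nearest-neighbour cluster assignment. It is not MOB-suite's actual code: the accessions (`REF_A`, `REF_B`, `REF_C`), the cluster IDs, the distances, and the `assign_cluster` helper are all made up for illustration.

```python
# Hypothetical reference database: accession -> MOB-cluster ID.
# All names and values here are invented for illustration only.
REFERENCE_CLUSTERS = {
    "REF_A": "AA001",
    "REF_B": "AA001",  # same cluster as REF_A
    "REF_C": "AA002",
}


def assign_cluster(distances):
    """Pick the reference with the lowest distance and return
    (nearest_neighbour, cluster_id) for that reference."""
    nearest = min(distances, key=distances.get)
    return nearest, REFERENCE_CLUSTERS[nearest]


# Invented mash distances for two similar but non-identical contigs.
contig_1 = {"REF_A": 0.001, "REF_B": 0.004, "REF_C": 0.20}
contig_2 = {"REF_A": 0.005, "REF_B": 0.002, "REF_C": 0.21}

print(assign_cluster(contig_1))  # ('REF_A', 'AA001')
print(assign_cluster(contig_2))  # ('REF_B', 'AA001')
```

Both contigs end up in the same cluster, but each has a different closest reference sequence within that cluster.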
Hi, I noticed that mobtyper_results (from MOB RECON) sometimes shows rows with the same primary_cluster_id but different mash_nearest_neighbour values. If the primary MOB-cluster ID of two plasmid contigs is the same, why would the accession ID of their closest plasmid match (aka mash_nearest_neighbour) be different? If they have the same cluster ID, shouldn't their closest plasmid match also be identical?
Thank you for all your help in advance!