phac-nml / mob-suite

MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies
Apache License 2.0

Understanding no secondary cluster in output reports #174

Open cizydorczyk opened 4 weeks ago

cizydorczyk commented 4 weeks ago

When running mob-recon with a custom plasmid database, I often get no secondary cluster reported for contigs that are assigned to a primary cluster. In the database (constructed using ~50K plasmids), all plasmid sequences are assigned both a primary and secondary cluster.

This makes me think that this is not an issue with my database.

So, when a contig has only a primary cluster designation, does this mean it is not similar enough to any existing secondary cluster to be assigned to one? And what if I want to know whether two such plasmids are similar to each other (e.g., members of a new secondary cluster that is absent from my database)?

Does mob-recon handle new secondary clusters, e.g., by assigning them new IDs? Or does it simply ignore them, so that a more complete database would be needed to capture them?

Any help in understanding is greatly appreciated.

Thank you, Conrad