bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Give clearer results when queries are in the reference DB #190

Closed johnlees closed 2 years ago

johnlees commented 2 years ago

When using assign mode, queries whose names match those in the reference database are silently omitted from the cluster output. It would be better to do one of the following:

I'd suggest 1 + 2
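
To illustrate the problem, the silent omission could be surfaced with a check along these lines. This is only a minimal sketch and not PopPUNK's own code: the function `find_name_clashes` and the sample names are hypothetical, and it assumes the query and reference names are available as plain lists of strings.

```python
def find_name_clashes(query_names, ref_names):
    """Return query names that also appear among the reference names.

    Hypothetical helper for illustration; not part of PopPUNK.
    Clashes are reported once each, in the order they occur in the queries.
    """
    ref_set = set(ref_names)  # set gives O(1) membership tests
    seen = set()
    clashes = []
    for name in query_names:
        if name in ref_set and name not in seen:
            clashes.append(name)
            seen.add(name)
    return clashes


# Made-up sample names, for illustration only
ref_names = ["sample_A", "sample_B", "sample_C"]
query_names = ["sample_B", "sample_X", "sample_C"]

clashes = find_name_clashes(query_names, ref_names)
if clashes:
    print(
        f"Warning: {len(clashes)} query name(s) already in the reference DB: "
        + ", ".join(clashes)
    )
```

Whether such a check should warn, error, or do something else is exactly the question under discussion here; the sketch only shows how the overlap could be detected.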

nickjcroucher commented 2 years ago

Would we need to add a suffix to ensure names remain unique? I would probably err towards 3 as a default, with 1+2 possible with a flag, but I appreciate that's just more command line complexity. Maybe 1+2 for a standard query, but 3 if the database is being updated?
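
As a rough idea of what adding a suffix might look like, here is a minimal sketch. It is not taken from PopPUNK: the function `make_unique`, the `_query` suffix, and the numbering scheme for repeated clashes are all assumptions made for illustration.

```python
def make_unique(query_names, ref_names, suffix="_query"):
    """Rename clashing query names so every name is unique.

    Hypothetical helper for illustration; not part of PopPUNK.
    A clashing name gets `suffix` appended; if that is also taken,
    a counter is appended after the suffix (e.g. name_query2).
    """
    taken = set(ref_names)
    renamed = []
    for name in query_names:
        candidate = name
        counter = 1
        while candidate in taken:
            # First attempt: name + suffix; later attempts add a counter
            candidate = (
                f"{name}{suffix}" if counter == 1 else f"{name}{suffix}{counter}"
            )
            counter += 1
        taken.add(candidate)
        renamed.append(candidate)
    return renamed


# Made-up sample names, for illustration only
print(make_unique(["sample_B", "sample_X"], ["sample_A", "sample_B"]))
```

Names that do not clash are left unchanged, so only the overlapping queries would be affected by a scheme like this.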

johnlees commented 2 years ago

I also forgot about --write-references. That makes it more feasible to raise an error, while offering an option to re-run and override.