bokulich-lab / q2-assembly

QIIME 2 plugin for (meta)genome assembly.
BSD 3-Clause "New" or "Revised" License

ENH: make contig IDs unique across all samples #82

Closed by misialq 4 months ago

misialq commented 6 months ago

Is your feature request related to a problem? Please describe.
When binning contigs from more than one sample (see q2-moshpit), we need a way to distinguish contigs belonging to different samples (contig IDs are only unique per-sample).

Describe the solution you'd like
I want the contigs to be renamed post-assembly, regardless of the assembler used. We could use UUIDs for the new IDs instead of the arbitrary strings we have now. shortuuid could be a good candidate, as it can generate short IDs which are still unique enough. That way we would avoid using the same kind of ID that we already use to represent MAGs, with the added benefit of the contig IDs being slightly more human-readable.

Notes: Let's provide the user with a selection of ways to rename: shortuuid, uuid4, uuid5 etc.
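
To make the idea concrete, here is a rough sketch of what a selectable renaming step could look like. It is not the plugin's implementation: the function names (`new_contig_id`, `rename_contigs`), the `method` parameter and the namespace string are all made up for illustration, and it only uses the standard-library `uuid` module. A shortuuid-based option could be added as another branch in the same way.

```python
import uuid

# Hypothetical namespace for deterministic (uuid5) IDs; the seed string is
# an arbitrary choice for this sketch, not something defined by q2-assembly.
CONTIG_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "q2-assembly.example")


def new_contig_id(sample_id, contig_id, method="uuid4"):
    """Return a new contig ID that is unique across samples.

    uuid4: random, different on every call.
    uuid5: deterministic, derived from the sample ID and the old contig ID.
    """
    if method == "uuid4":
        return str(uuid.uuid4())
    if method == "uuid5":
        return str(uuid.uuid5(CONTIG_NAMESPACE, f"{sample_id}:{contig_id}"))
    raise ValueError(f"Unknown renaming method: {method}")


def rename_contigs(fasta_lines, sample_id, method="uuid4"):
    """Yield FASTA lines with the ID in each header line replaced.

    Any description after the first whitespace in the header is kept;
    sequence lines are passed through unchanged.
    """
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].strip().split(maxsplit=1)
            old_id = parts[0] if parts else ""
            new_id = new_contig_id(sample_id, old_id, method)
            description = f" {parts[1]}" if len(parts) > 1 else ""
            yield f">{new_id}{description}\n"
        else:
            yield line
```

With `method="uuid5"`, two samples that both contain a contig called `contig_1` get different new IDs, and re-running the renaming on the same sample gives the same IDs again. With `method="uuid4"` the IDs are random, so they are not reproducible between runs.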

Acceptance criteria: