dib-lab / charcoal

Remove contaminated contigs from genomes using k-mers and taxonomies.

more explicitly separate out the classification and decontamination steps? #38

Open ctb opened 4 years ago

ctb commented 4 years ago

mentioned in #30

we could split classification from decontamination. that would let us provide both a lightweight classifier (sourmash lca) and a heavier one (sourmash lca + GTDB-Tk).

also, I think this would mean that just_taxonomy.py could do a single pass across a genome.
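
A minimal sketch of what such a split could look like. This is not charcoal's actual code. Every name here (`classify_genomes`, `majority_classifier`, `decontaminate`, and the contig-to-lineage dict layout) is hypothetical, and the toy classifier merely stands in for something like sourmash lca or GTDB-Tk. The point is only that the decontamination step consumes a finished classification rather than computing one itself.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical layout: genome name -> {contig name -> lineage string}
Genomes = Dict[str, Dict[str, str]]


def majority_classifier(contigs: Dict[str, str]) -> str:
    """Toy stand-in for a real classifier: pick the most common contig lineage."""
    return Counter(contigs.values()).most_common(1)[0][0]


def classify_genomes(genomes: Genomes,
                     classifier: Callable[[Dict[str, str]], str]) -> Dict[str, str]:
    """Stage 1: assign one lineage per genome using whichever classifier is passed in."""
    return {name: classifier(contigs) for name, contigs in genomes.items()}


def decontaminate(genomes: Genomes,
                  classification: Dict[str, str]) -> Dict[str, List[str]]:
    """Stage 2: keep only contigs whose lineage matches the genome's assigned lineage."""
    clean = {}
    for name, contigs in genomes.items():
        expected = classification[name]
        clean[name] = sorted(c for c, lineage in contigs.items() if lineage == expected)
    return clean


# Made-up example data, for illustration only.
genomes = {
    "bin1": {
        "contig1": "Bacteria;Proteobacteria",
        "contig2": "Bacteria;Proteobacteria",
        "contig3": "Bacteria;Firmicutes",
    }
}
classification = classify_genomes(genomes, majority_classifier)
clean = decontaminate(genomes, classification)
```

Because `decontaminate` only sees the classification dict, a heavier classifier could be swapped in without touching the second stage.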

ctb commented 4 years ago

it would also give people another tool for bin classification (for better or for worse...). it would also provide a valuable checkpoint, as well as a place to override the taxonomy, either by editing the classification file or, better, by adding the different classification to provided-lineages.
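
A rough sketch of the override idea, assuming the classification checkpoint and the provided lineages are both simple CSV files with `genome` and `lineage` columns. That layout, and the helper names `load_lineages` and `apply_overrides`, are assumptions for illustration and not charcoal's actual file format or API.

```python
import csv
import io
from typing import Dict, TextIO


def load_lineages(handle: TextIO) -> Dict[str, str]:
    """Read rows with (assumed) 'genome' and 'lineage' columns into a dict."""
    return {row["genome"]: row["lineage"] for row in csv.DictReader(handle)}


def apply_overrides(computed: Dict[str, str], provided: Dict[str, str]) -> Dict[str, str]:
    """Return the computed classification with any provided lineages taking precedence."""
    merged = dict(computed)   # copy so the checkpoint data is left untouched
    merged.update(provided)   # user-provided lineages win on conflict
    return merged


# Made-up example data standing in for the two files.
computed_csv = io.StringIO(
    "genome,lineage\n"
    "bin1,Bacteria;Proteobacteria\n"
    "bin2,Bacteria;Firmicutes\n"
)
provided_csv = io.StringIO(
    "genome,lineage\n"
    "bin2,Bacteria;Actinobacteria\n"
)

final = apply_overrides(load_lineages(computed_csv), load_lineages(provided_csv))
```

Keeping the overrides in a separate file means the computed checkpoint can be regenerated at any time without losing manual corrections.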