envmetagen / metabinkit

Set of programs to perform taxonomic binning.
GNU General Public License v3.0

blacklisting if NCBI taxids not being used #17

Open bastianegeter opened 4 years ago

bastianegeter commented 4 years ago

The current approach of getting children from taxids will not work if people are using their own taxonomies (by providing the KPCOFGS columns).

But, by the time metabin reaches the blacklisting step, it will always have the KPCOFGS columns.

Perhaps it would be better to use names rather than taxids for this blacklisting (names are also possibly easier for users to provide). Downside: ambiguities between user-defined taxon names and NCBI taxon names, which would be very difficult to resolve. So perhaps we need both options...
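A rough sketch of what name-based blacklisting over the KPCOFGS columns could look like, in Python purely for illustration (this is not metabinkit code). The column names `K`, `P`, `C`, `O`, `F`, `G`, `S`, the function names, and the optional `RANK:name` prefix are all assumptions. Because every row carries its full lineage, matching a name at any rank also removes that taxon's children without any taxid lookup. The rank prefix only narrows a match to one rank; it does not resolve two different taxa sharing the same name at the same rank.

```python
# Hypothetical column names for the seven lineage ranks.
RANKS = ("K", "P", "C", "O", "F", "G", "S")


def parse_blacklist(entries):
    """Split entries into rank-qualified (rank, name) pairs and bare names."""
    qualified, bare = set(), set()
    for entry in entries:
        rank, sep, name = entry.partition(":")
        if sep and rank in RANKS:
            # "F:Hominidae" only matches the name in the family column.
            qualified.add((rank, name.strip().lower()))
        else:
            # A bare name matches in any lineage column.
            bare.add(entry.strip().lower())
    return qualified, bare


def is_blacklisted(row, qualified, bare):
    """True if any lineage column of the row holds a blacklisted name."""
    for rank in RANKS:
        name = str(row.get(rank, "")).strip().lower()
        if not name:
            continue
        if name in bare or (rank, name) in qualified:
            return True
    return False


def apply_blacklist(rows, entries):
    """Return the rows whose lineage contains no blacklisted name."""
    qualified, bare = parse_blacklist(entries)
    return [row for row in rows if not is_blacklisted(row, qualified, bare)]


rows = [
    {"F": "Hominidae", "G": "Homo", "S": "Homo sapiens"},
    {"F": "Ranidae", "G": "Rana", "S": "Rana temporaria"},
    {"F": "Bufonidae", "G": "Bufo", "S": "Bufo bufo"},
]
kept = apply_blacklist(rows, ["F:Hominidae", "rana"])
```

Here the family-qualified entry drops the human row and the bare genus name drops the *Rana* row, so only the *Bufo bufo* row remains in `kept`.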

bastianegeter commented 4 years ago

As briefly discussed, one way to overcome this (and other related issues of not having NCBI taxids available) would be to somehow get custom taxids into the taxonomy dump database. Just a few ideas we could follow up:

https://github.com/guyleonard/taxdump_edit
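To make the custom-taxid idea a bit more concrete, here is a minimal Python sketch of one possible first step: giving every user-defined taxon a synthetic taxid and a parent taxid, so that it could later be written into a taxonomy dump. It does not use or describe the tool linked above. The function name, the rank labels, and the starting id are assumptions; the starting id would need to be checked against the ids present in the dump actually being used. Writing the result out in the `names.dmp`/`nodes.dmp` format is not covered.

```python
RANK_NAMES = ("kingdom", "phylum", "class", "order", "family", "genus", "species")

NCBI_ROOT = 1  # taxid of the root node in the NCBI taxonomy


def assign_custom_taxids(lineages, first_id=1_000_000_000):
    """Assign synthetic taxids to user-defined KPCOFGS lineages.

    Returns (taxid, parent_taxid, rank, name) tuples, one per distinct taxon.
    first_id is an assumed offset meant to sit above any existing taxid.
    """
    ids = {}  # (rank, name, parent_taxid) -> taxid
    nodes = []
    next_id = first_id
    for lineage in lineages:
        parent = NCBI_ROOT
        for rank, name in zip(RANK_NAMES, lineage):
            # Keying on the parent keeps identical names under
            # different parents as separate taxa.
            key = (rank, name, parent)
            if key not in ids:
                ids[key] = next_id
                nodes.append((next_id, parent, rank, name))
                next_id += 1
            parent = ids[key]
    return nodes


lineages = [
    ("Animalia", "Chordata", "Amphibia", "Anura", "Ranidae", "Rana", "Rana temporaria"),
    ("Animalia", "Chordata", "Amphibia", "Anura", "Ranidae", "Pelophylax", "Pelophylax lessonae"),
]
nodes = assign_custom_taxids(lineages)
```

With parent links in place, the existing get-children logic could in principle work on custom taxonomies too, since it only needs a taxid and a parent for each node.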