ohickl closed 1 year ago
It will work (although it is not tested), but any read with matches in both the NCBI and GTDB databases will have its LCA resolved to the root node, so it is not very useful in practice. What outcome would you expect?
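To illustrate why the LCA ends up at the root, here is a minimal sketch. It assumes two toy taxonomies whose only shared node is the root; the node names and the `parent` table are hypothetical, and this is not ganon's actual implementation.

```python
def lineage(node, parent):
    """Return the path from a node up to the root (inclusive)."""
    path = [node]
    while parent[node] != node:  # the root is its own parent
        node = parent[node]
        path.append(node)
    return path


def lca(nodes, parent):
    """Return the deepest node shared by the lineages of all given nodes."""
    paths = [lineage(n, parent) for n in nodes]
    common = set(paths[0]).intersection(*paths[1:])
    # paths[0] is ordered leaf -> root, so the first shared node is the deepest
    for node in paths[0]:
        if node in common:
            return node


# Hypothetical toy tree: NCBI and GTDB branches meet only at "root"
parent = {
    "root": "root",
    "ncbi:Bacteria": "root",
    "ncbi:Escherichia coli": "ncbi:Bacteria",
    "ncbi:Salmonella enterica": "ncbi:Bacteria",
    "gtdb:d__Bacteria": "root",
    "gtdb:s__Escherichia coli": "gtdb:d__Bacteria",
}

within_ncbi = lca(["ncbi:Escherichia coli", "ncbi:Salmonella enterica"], parent)
across_both = lca(["ncbi:Escherichia coli", "gtdb:s__Escherichia coli"], parent)
print(within_ncbi, "|", across_both)
```

Matches within one taxonomy still resolve to a meaningful node, while a match spanning both taxonomies can only meet at the root.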
Thanks for the quick reply! I wanted to do some kingdom-level sorting. But since the eukaryotes are massive and RefSeq only represents a very small (in my opinion skewed) subset, I was planning to build subset databases from the GenBank eukaryotes that will each fit in ~2 TB of memory. I would want to pair this with the latest GTDB release (207) for the prokaryotes, with which I had pretty good results (at least with kraken2). As there are some ways to translate GTDB into NCBI, this might be the way to go then? I have not tried it, though.
It would be awesome if ganon could translate it automatically. I'm not sure how feasible that would be using multitax, e.g. by setting a preferred taxonomic system, or whether there are any pitfalls.
Indeed, that would be good to have. It is already a planned feature for multitax, which will then be ported to ganon report, but it is not yet implemented. GTDB to NCBI conversion is quite straightforward. Alternatively, you could build the 207 GTDB genomes with the NCBI taxonomy, so you don't have to translate at the end.
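As a rough sketch of what such a conversion could look like: one simple approach is a majority vote, where each GTDB taxon is assigned the NCBI TaxId carried by most of its genomes. This is an assumption for illustration only, not necessarily how multitax will do it; the `gtdb_to_ncbi` helper and the toy records below are hypothetical.

```python
from collections import Counter, defaultdict


def gtdb_to_ncbi(records):
    """Map each GTDB taxon to the NCBI TaxId most of its genomes carry.

    records: iterable of (gtdb_taxon, ncbi_taxid) pairs, one per genome.
    """
    votes = defaultdict(Counter)
    for gtdb_taxon, ncbi_taxid in records:
        votes[gtdb_taxon][ncbi_taxid] += 1
    # most_common(1) returns [(taxid, count)] for the top-voted TaxId
    return {taxon: counts.most_common(1)[0][0] for taxon, counts in votes.items()}


# Toy per-genome records (hypothetical data, not taken from a real GTDB release)
records = [
    ("s__Escherichia coli", 562),
    ("s__Escherichia coli", 562),
    ("s__Escherichia coli", 623),
    ("s__Bacillus subtilis", 1423),
]

mapping = gtdb_to_ncbi(records)
print(mapping)
```

A majority vote loses information when one GTDB taxon spans several NCBI taxa, so building the GTDB genomes directly with the NCBI taxonomy avoids that step altogether.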
I will give that a try. Thanks!
This may help: for a taxid.map file mapping genome assembly accessions to TaxIds, please follow "Merging the GTDB taxonomy (for prokaryotic genomes from GTDB) and NCBI taxonomy (for genomes from NCBI)".
Thanks for the tips @shenwei356!
Hi, is it possible to mix taxonomy systems when using multiple databases? E.g. I'd be interested in using GenBank and the latest GTDB release.
Best
Oskar