FOI-Bioinformatics / flextaxd

FlexTaxD (Flexible Taxonomy Databases) - Create, add, merge different taxonomy sources (QIIME, GTDB, NCBI and more) and create metagenomic databases (kraken2, ganon and more)
GNU General Public License v3.0

[Question] Does the taxonomy exported from flextaxd only contain species with genomes available in the refseq folder? Can I make use of library.fna downloaded via kraken2-build? #56

Open zztin opened 1 year ago

zztin commented 1 year ago

Hi Flextaxd team,

Great tool! I am attempting to build a kraken2 database with archaea and bacteria from GTDB and viral, plasmid, and UniVec_Core from NCBI. I have downloaded library.fna via the kraken2 command: kraken2-build --download-library viral --db $DBNAME, and I am wondering if there is a way to make use of these genomes instead of downloading them again with ncbi-genome-download?

Is it possible to merge only the taxonomy files, without building a flextaxd database?

I am following the WGS walkthrough and encountered this line:

As the nucl_gb accession file only contains chr/cont/scaffold - id to taxid, the script must match annotations to a local file (hence the necessity to download genomes first)

It is not very clear to me what this means. If I follow the example (downloading only human, 4 bacterial genera, and archaea) and export the taxonomy with: flextaxd -db databases/NCBI_GTDB_merge.db -o taxonomy --dbprogram kraken2 --dump

Do I receive a complete taxonomy, including everything other than archaea/bacteria/human? And can I use this taxonomy directly with the downloaded library.fna to build a database through the kraken2 interface?
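One way to probe the second question empirically is to compare the taxids referenced in library.fna with the node ids in the exported taxonomy. The following is only a minimal sketch, not part of flextaxd or kraken2. It assumes that the library headers carry the ">kraken:taxid|<taxid>|" prefix that kraken2-build adds to downloaded libraries, and that the dump produces an NCBI-style nodes.dmp whose first "|"-separated field is the node id. The function names and the example paths are hypothetical.

```python
import re

# Assumption: downloaded kraken2 libraries use headers such as
# ">kraken:taxid|10239|NC_000000.1 description".
TAXID_HEADER = re.compile(r"^>kraken:taxid\|(\d+)\|")


def library_taxids(fasta_lines):
    """Collect the taxids found in FASTA header lines with the kraken:taxid prefix."""
    taxids = set()
    for line in fasta_lines:
        match = TAXID_HEADER.match(line)
        if match:
            taxids.add(int(match.group(1)))
    return taxids


def taxonomy_taxids(nodes_lines):
    """Collect node ids from nodes.dmp-style lines ("taxid | parent | rank | ...")."""
    taxids = set()
    for line in nodes_lines:
        first_field = line.split("|", 1)[0].strip()
        if first_field.isdigit():
            taxids.add(int(first_field))
    return taxids


def missing_taxids(fasta_lines, nodes_lines):
    """Return the sorted library taxids that have no node in the taxonomy."""
    return sorted(library_taxids(fasta_lines) - taxonomy_taxids(nodes_lines))


def report(fasta_path, nodes_path):
    """Print how many library taxids are absent from the exported taxonomy."""
    with open(fasta_path) as fasta, open(nodes_path) as nodes:
        missing = missing_taxids(fasta, nodes)
    print(f"{len(missing)} library taxids are absent from the taxonomy")
    return missing
```

For example, report("$DBNAME/library/viral/library.fna", "taxonomy/nodes.dmp") with the real paths substituted. An empty result would only show that every library taxid has a node in the exported taxonomy, not that the resulting database is otherwise correct.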

Thank you very much! Please let me know if my question description is unclear.