shenwei356 / taxonkit

A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV
https://bioinf.shenwei.me/taxonkit
MIT License

Why require delnodes.dmp and merged.dmp? #27

Closed nick-youngblut closed 4 years ago

nick-youngblut commented 4 years ago

I created a tool for converting the Genome Taxonomy Database (GTDB) taxonomy to nodes.dmp and names.dmp files. The tool is gtdb_to_taxdump. The output works with taxonkit<0.5.0 but fails with taxonkit>=0.5.0, since you now require delnodes.dmp and merged.dmp. Why do you require these files? Could an option be added to override that requirement, or is it absolutely necessary for taxonkit 0.5 to work?

shenwei356 commented 4 years ago

Because we'd like to detect merged or deleted taxids; see #19.

I can make it optional. You can also try again by creating empty delnodes.dmp and merged.dmp files.

nick-youngblut commented 4 years ago

Thanks for the quick reply! I tried just creating empty delnodes.dmp and merged.dmp files, but I then get a "file empty" error.

shenwei356 commented 4 years ago

Try adding a line starting with #.

nick-youngblut commented 4 years ago

That works as long as the file contains #\n. Thanks!
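
For anyone generating a custom taxdump the same way, the workaround above could be scripted roughly as follows. This is only a sketch of what is described in this thread: the function name `add_placeholder_dmp_files` and the directory argument are made up for illustration, and it assumes, as reported here, that a file containing just `#\n` is enough for taxonkit 0.5.0 to accept it.

```python
from pathlib import Path

# A single comment line; per this thread, a completely empty file
# triggers a "file empty" error, while "#\n" is accepted.
PLACEHOLDER = "#\n"


def add_placeholder_dmp_files(taxdump_dir, names=("delnodes.dmp", "merged.dmp")):
    """Write a placeholder into each file in `names` that is missing or empty.

    Existing non-empty files are left untouched. Returns the list of
    file names that were written.
    """
    taxdump_dir = Path(taxdump_dir)
    created = []
    for name in names:
        path = taxdump_dir / name
        if not path.exists() or path.stat().st_size == 0:
            path.write_text(PLACEHOLDER)
            created.append(name)
    return created
```

Calling `add_placeholder_dmp_files("path/to/taxdump")` on the directory that already holds nodes.dmp and names.dmp should leave real delnodes.dmp or merged.dmp files alone and only fill in the ones that are absent or zero-length.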

shenwei356 commented 4 years ago

fixed: