shenwei356 / taxonkit

A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV
https://bioinf.shenwei.me/taxonkit
MIT License
361 stars, 29 forks

`taxonkit lineage` stalls if taxID = 1 #7

Closed: nick-youngblut closed this issue 6 years ago

nick-youngblut commented 6 years ago

I'm using taxonkit v0.2.0 (installed via bioconda), and I was running `taxonkit lineage` on the "hits" file generated by centrifuge. `taxonkit lineage` would very quickly write out taxonomies for the first ~60000 hits, but then stall, and memory usage would climb to >300 GB. It turns out that one of the centrifuge hits had a taxID of "1" (centrifuge called this a "no rank"). I filtered out these "no rank" hits, which fixed the stalling issue.
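For anyone stuck on an affected version, the filtering workaround described above could look roughly like the sketch below. It is not the exact command used in this report. It assumes the hits file is tab-separated with a header row and a column named `taxID`; the function name `drop_root_hits` and the demo rows are made up for illustration.

```python
import csv
import io

ROOT_TAXID = "1"  # the taxID that triggered the stall in this report


def drop_root_hits(src, dst, taxid_column="taxID"):
    """Copy tab-separated rows from src to dst, skipping rows whose taxID is 1.

    Assumes the first row is a header containing `taxid_column`.
    Returns the number of rows that were dropped.
    """
    reader = csv.DictReader(src, delimiter="\t")
    if reader.fieldnames is None:  # empty input: nothing to copy
        return 0
    writer = csv.DictWriter(
        dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    dropped = 0
    for row in reader:
        if row[taxid_column].strip() == ROOT_TAXID:
            dropped += 1
            continue
        writer.writerow(row)
    return dropped


# Small in-memory demo with made-up rows (not real centrifuge output).
demo = "readID\tseqID\ttaxID\nr1\ts1\t562\nr2\ts2\t1\nr3\ts3\t9606\n"
out = io.StringIO()
dropped = drop_root_hits(io.StringIO(demo), out)
print(dropped)
print(out.getvalue(), end="")
```

With real files, `src` and `dst` would be handles opened with `open(path, newline="")`, and the taxID column of the filtered file can then be passed to `taxonkit lineage` as before.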

shenwei356 commented 6 years ago

fixed in v0.2.4-dev2