nick-youngblut / gtdb_to_taxdump

Convert GTDB taxonomy to NCBI taxdump format
MIT License

KeyError: 'Cannot find "ncbi_taxonomy"' #21

Closed jiaojiaoguan closed 1 year ago

jiaojiaoguan commented 1 year ago

Dear authors:

Thanks for your excellent work! I used the GTDB toolkit to classify many bin files, and I want to convert the GTDB taxonomy to NCBI taxonomy. I ran this script:

python ncbi-gtdb_map.py -q gtdb_taxonomy gtdb_taxonomy_file.csv gtdbtk.bac120.summary.tsv gtdbtk.ar53.summary.tsv

The 'gtdb_taxonomy_file.csv' file comes from the second column of "gtdbtk.bac120.summary.tsv" and "gtdbtk.ar53.summary.tsv".

But I got an error:

    Traceback (most recent call last):
      File "/home/caifawen/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 629, in <module>
        main(args)
      File "/home/caifawen/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 608, in main
        no_prefix=args.no_prefix)
      File "/home/caifawen/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 294, in load_gtdb_metadata
        raise KeyError('Cannot find "ncbi_taxonomy"')
    KeyError: 'Cannot find "ncbi_taxonomy"'

Thanks for your help!

gtdb_taxonomy_file.csv

jiaojiaoguan commented 1 year ago

When I run the example you give, using the three files you provided, I get the same error:

    (gtdbtk) [xxx@login wild]$ ncbi-gtdb_map.py -q gtdb_taxonomy /home/xxx/wild/gtdb_tax_queries.txt /home/xxx/wild_xxx/bac120_metadata_r95.tsv /home/xxx/wild/ar122_taxonomy_r202.tsv -o ./
    2023-07-17 20:16:23,488 - Loading: /home/xxx/wild/bac120_metadata_r95.tsv
    2023-07-17 20:16:34,283 - Entries lacking an NCBI taxonomy: 0
    2023-07-17 20:16:34,283 - Completeness-filtered entries: 17
    2023-07-17 20:16:34,283 - Contamination-filtered entries: 2253
    2023-07-17 20:16:34,283 - Entries used: 189257
    2023-07-17 20:16:34,283 - Loading: /home/xxx/wild/ar122_taxonomy_r202.tsv
    Traceback (most recent call last):
      File "/home/xxx/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 292, in load_gtdb_metadata
        X = line[header['ncbi_taxonomy']]
    KeyError: 'ncbi_taxonomy'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/xxx/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 629, in <module>
        main(args)
      File "/home/xxx/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 608, in main
        no_prefix=args.no_prefix)
      File "/home/xxx/miniconda3/envs/gtdbtk/bin/ncbi-gtdb_map.py", line 294, in load_gtdb_metadata
        raise KeyError('Cannot find "ncbi_taxonomy"')
    KeyError: 'Cannot find "ncbi_taxonomy"'

Thanks for your help!
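
For readers puzzled by the two stacked tracebacks: they are consistent with a column lookup in a header-to-index mapping that fails and is then re-raised with a friendlier message. The following is only a sketch of that general pattern, not the script's actual source; `find_column` and the example `header` mapping are hypothetical.

```python
def find_column(header, name):
    """Return the index of `name`, re-raising a missing key with a clearer message."""
    try:
        return header[name]
    except KeyError:
        # Raising inside an except block makes Python record the first error
        # as __context__, which produces the "During handling of the above
        # exception, another exception occurred" section in the traceback.
        raise KeyError('Cannot find "{}"'.format(name))


# Hypothetical header parsed from a table that has no ncbi_taxonomy column
header = {"accession": 0, "gtdb_taxonomy": 1}

chained = None
try:
    find_column(header, "ncbi_taxonomy")
except KeyError as err:
    chained = err.__context__  # the original KeyError from the dict lookup
```

In other words, both tracebacks point at the same cause: the table being loaded has no `ncbi_taxonomy` entry in its header.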

nick-youngblut commented 1 year ago

The 'gtdb_taxonomy_file.csv' is from the second column of "gtdbtk.bac120.summary.tsv" and "gtdbtk.ar53.summary.tsv". But I got an error:

You need to include a header line with an ncbi_taxonomy column.
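
One way to check this before running the script is to read the first line of each table and confirm that it names the column. This is a minimal sketch that assumes tab-separated input; `has_column` is a hypothetical helper and not part of gtdb_to_taxdump.

```python
import csv


def has_column(path, column="ncbi_taxonomy", delimiter="\t"):
    """Return True if the first line of a delimited file contains `column`."""
    with open(path, newline="") as handle:
        # Read only the header row; an empty file yields an empty list.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return column in header
```

Calling `has_column(path)` on each table passed to `ncbi-gtdb_map.py` shows which one lacks the column. In the second report above, the log shows `bac120_metadata_r95.tsv` loading successfully and the error appearing while loading `ar122_taxonomy_r202.tsv`, so that file would be the first one to check.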

nick-youngblut commented 1 year ago

I've updated the error messages to hopefully make it easier to understand the problem. Re-open the issue if you still have problems/questions.