jenniferlu717 / KrakenTools

KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files
GNU General Public License v3.0

KeyError: '3045' from make_kreport.py #41

Open djbradshaw2 opened 3 years ago

djbradshaw2 commented 3 years ago

Thank you very much for your powerful tools!

I ran into the following error while running the make_kreport.py script.

python make_kreport.py -i P1_S7_L001_R_kraken2.txt -t nt_ktaxonomy -o P1

PROGRAM START TIME: 08-31-2021 16:43:17

STEP 1/4: Reading taxonomy nt_ktaxonomy...
    2083898 nodes saved
STEP 2/4: Reading kraken file P1_S7_L001_R_kraken2.txt...
    2.084 million reads processed
STEP 3/4: Creating final tree...
Traceback (most recent call last):
  File "/home/microbiology/KrakenTools/make_kreport.py", line 199, in <module>
    main()
  File "/home/microbiology/KrakenTools/make_kreport.py", line 145, in main
    p_node = taxid2node[curr_tid].parent
KeyError: '3045'

I ran Kraken2 with the following basic script against the full nt database.

kraken2 --db $kraken2_db P1_S7_L001_R1_kneaddata.fastq --report P1_S7_L001_R_kraken2.txt --report-zero-counts

Thank you very much for your time and help.

Sincerely,

David Bradshaw

sentausa commented 2 years ago

Hi. I had a similar problem and I believe I found the cause (and hopefully the solution). If you have already solved it, this is for other users who hit the same error. '3045' is a taxid, and the error occurs because that taxid is absent from the nt_ktaxonomy file even though it exists in the kraken2 database. When creating the nt_ktaxonomy file with make_ktaxonomy.py, the names.dmp and nodes.dmp files should come from the same taxonomy that was used to build the kraken2 database.

If those dmp files are no longer available when creating the ktaxonomy file, we can download the archived version closest in date to the one used to build the kraken2 database from https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/
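
To check for this mismatch before running make_kreport.py, something like the sketch below might help. It is not part of KrakenTools; the function names are made up for illustration. It assumes the input is standard Kraken2 per-read output (tab-separated, taxid in the third column, run without --use-names) and that each ktaxonomy line starts with the taxid followed by a '|' separator. If your files differ, the column handling would need adjusting.

```python
import sys


def read_ktaxonomy_taxids(lines):
    """Collect taxids from ktaxonomy lines.

    Assumption: the taxid is the first '|'-separated field on each line.
    """
    taxids = set()
    for line in lines:
        line = line.strip()
        if line:
            taxids.add(line.split("|")[0].strip())
    return taxids


def read_kraken_taxids(lines):
    """Collect taxids from Kraken2 per-read output lines.

    Assumption: tab-separated columns with the taxid in column 3.
    """
    taxids = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            taxids.add(fields[2].strip())
    return taxids


def missing_taxids(kraken_lines, ktaxonomy_lines):
    """Return taxids seen in the Kraken2 output but absent from the ktaxonomy."""
    missing = read_kraken_taxids(kraken_lines) - read_ktaxonomy_taxids(ktaxonomy_lines)
    # Taxid 0 marks unclassified reads and has no taxonomy node, so skip it.
    missing.discard("0")
    return sorted(missing)


# Hypothetical usage: python check_taxids.py <kraken_output> <ktaxonomy_file>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as kraken_fh, open(sys.argv[2]) as ktax_fh:
        for taxid in missing_taxids(kraken_fh, ktax_fh):
            print(taxid)
```

If this prints any taxids, the ktaxonomy file was probably built from a different taxonomy dump than the database, and regenerating it from matching names.dmp and nodes.dmp files would be the next thing to try.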

djbradshaw2 commented 2 years ago

Thanks for your comment! I will test this out as soon as I can. Glad you were able to figure it out though!