jenniferlu717 / KrakenTools

KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files
GNU General Public License v3.0

kreport2krona does not check last line in the report #34

Open JavierToledoAruba opened 3 years ago

JavierToledoAruba commented 3 years ago

Hello everybody! I wanted to report a problem with kreport2krona.py:

I have this report for a very simple in-silico metagenomic sample:

21.38  10203  10203  U   0        unclassified
78.62  37509      0  R   1        root
58.09  27715      0  R1  131567   cellular organisms
36.91  17612      0  D   2759     Eukaryota
36.91  17612      0  D1  33154    Opisthokonta
20.90   9974      0  K   4751     Fungi
20.90   9974      0  K1  451864   Dikarya
20.90   9974      0  P   4890     Ascomycota
20.90   9974      0  P1  716545   saccharomyceta
20.90   9974      0  P2  147537   Saccharomycotina
20.90   9974      0  C   4891     Saccharomycetes
20.90   9974      0  O   4892     Saccharomycetales
20.90   9974      0  F   4893     Saccharomycetaceae
20.90   9974      0  G   4930     Saccharomyces
20.90   9974      0  S   4932     Saccharomyces cerevisiae
20.90   9974   9974  S1  559292   Saccharomyces cerevisiae S288C
16.01   7638      0  K   33208    Metazoa
16.01   7638      0  K1  6072     Eumetazoa
16.01   7638      0  K2  33213    Bilateria
16.01   7638      0  K3  33511    Deuterostomia
16.01   7638      0  P   7711     Chordata
16.01   7638      0  P1  89593    Craniata
16.01   7638      0  P2  7742     Vertebrata
16.01   7638      0  P3  7776     Gnathostomata
16.01   7638      0  P4  117570   Teleostomi
16.01   7638      0  P5  117571   Euteleostomi
16.01   7638      0  P6  8287     Sarcopterygii
16.01   7638      0  P7  1338369  Dipnotetrapodomorpha
16.01   7638      0  P8  32523    Tetrapoda
16.01   7638      0  P9  32524    Amniota
16.01   7638      0  C   40674    Mammalia
16.01   7638      0  C1  32525    Theria
16.01   7638      0  C2  9347     Eutheria
16.01   7638      0  C3  1437010  Boreoeutheria
16.01   7638      0  C4  314146   Euarchontoglires
16.01   7638      0  O   9443     Primates
16.01   7638      0  O1  376913   Haplorrhini
16.01   7638      0  O2  314293   Simiiformes
16.01   7638      0  O3  9526     Catarrhini
16.01   7638      0  O4  314295   Hominoidea
16.01   7638      0  F   9604     Hominidae
16.01   7638      0  F1  207598   Homininae
16.01   7638      0  G   9605     Homo
16.01   7638   7638  S   9606     Homo sapiens
21.17  10103      0  D   2        Bacteria
21.17  10103      0  P   1224     Proteobacteria
21.17  10103      0  C   1236     Gammaproteobacteria
21.17  10103      0  O   72274    Pseudomonadales
21.17  10103      0  F   135621   Pseudomonadaceae
21.17  10103      0  G   286      Pseudomonas
21.17  10103      0  G1  136841   Pseudomonas aeruginosa group
21.17  10103      0  S   287      Pseudomonas aeruginosa
21.17  10103  10103  S1  208964   Pseudomonas aeruginosa PAO1
20.53   9794      0  D   10239    Viruses
20.53   9794      0  D1  35237    dsDNA viruses, no RNA stage
20.53   9794      0  O   548681   Herpesvirales
20.53   9794      0  F   10292    Herpesviridae
20.53   9794      0  F1  10293    Alphaherpesvirinae
20.53   9794      0  G   10294    Simplexvirus
20.53   9794   9794  S   10298    Human alphaherpesvirus 1

When I use kreport2krona, the krona file created is the following:

10203  Unclassified
0      k__Eukaryota
0      k__Eukaryota p__Ascomycota
0      k__Eukaryota p__Ascomycota c__Saccharomycetes
0      k__Eukaryota p__Ascomycota c__Saccharomycetes o__Saccharomycetales
0      k__Eukaryota p__Ascomycota c__Saccharomycetes o__Saccharomycetales f__Saccharomycetaceae
0      k__Eukaryota p__Ascomycota c__Saccharomycetes o__Saccharomycetales f__Saccharomycetaceae g__Saccharomyces
9974   k__Eukaryota p__Ascomycota c__Saccharomycetes o__Saccharomycetales f__Saccharomycetaceae g__Saccharomyces s__Saccharomyces_cerevisiae
0      k__Eukaryota p__Chordata
0      k__Eukaryota p__Chordata c__Mammalia
0      k__Eukaryota p__Chordata c__Mammalia o__Primates
0      k__Eukaryota p__Chordata c__Mammalia o__Primates f__Hominidae
0      k__Eukaryota p__Chordata c__Mammalia o__Primates f__Hominidae g__Homo
7638   k__Eukaryota p__Chordata c__Mammalia o__Primates f__Hominidae g__Homo s__Homo_sapiens
0      k__Bacteria
0      k__Bacteria p__Proteobacteria
0      k__Bacteria p__Proteobacteria c__Gammaproteobacteria
0      k__Bacteria p__Proteobacteria c__Gammaproteobacteria o__Pseudomonadales
0      k__Bacteria p__Proteobacteria c__Gammaproteobacteria o__Pseudomonadales f__Pseudomonadaceae
0      k__Bacteria p__Proteobacteria c__Gammaproteobacteria o__Pseudomonadales f__Pseudomonadaceae g__Pseudomonas
10103  k__Bacteria p__Proteobacteria c__Gammaproteobacteria o__Pseudomonadales f__Pseudomonadaceae g__Pseudomonas s__Pseudomonas_aeruginosa
0      k__Viruses
0      k__Viruses o__Herpesvirales
0      k__Viruses o__Herpesvirales f__Herpesviridae
0      k__Viruses o__Herpesvirales f__Herpesviridae g__Simplexvirus

Both files are attached: Simple_sample.zip

If you compare the two files, you'll notice that the last line of the report, corresponding to Human alphaherpesvirus 1, is missing from the krona file (and therefore from the HTML that Krona creates). I've tried to fix this on my own, without success so far. Could you help me out here? Thank you for your attention!
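For anyone who wants to check their own output for the same symptom, here is a rough sanity check in Python. It is only a sketch and not part of KrakenTools: the function names are made up for illustration, and it assumes that the third column of the report holds the reads assigned directly to each taxon and that the first column of the krona text file holds the read count. If no reads are dropped, the two totals should match. The data below is an excerpt of the two files above (the unclassified line plus the Viruses branch).

```python
# Hypothetical helper functions for illustration; not part of KrakenTools.

def report_direct_reads(report_lines):
    """Sum column 3 of a Kraken-style report (reads assigned directly to a taxon)."""
    total = 0
    for line in report_lines:
        fields = line.split()
        if len(fields) < 6:  # skip blank or malformed lines
            continue
        total += int(fields[2])
    return total


def krona_reads(krona_lines):
    """Sum column 1 of a krona text file (read count per lineage)."""
    total = 0
    for line in krona_lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        total += int(fields[0])
    return total


# Excerpt of the report above: unclassified plus the Viruses branch.
REPORT_EXCERPT = [
    "21.38 10203 10203 U 0 unclassified",
    "20.53 9794 0 D 10239 Viruses",
    "20.53 9794 0 D1 35237 dsDNA viruses, no RNA stage",
    "20.53 9794 0 O 548681 Herpesvirales",
    "20.53 9794 0 F 10292 Herpesviridae",
    "20.53 9794 0 F1 10293 Alphaherpesvirinae",
    "20.53 9794 0 G 10294 Simplexvirus",
    "20.53 9794 9794 S 10298 Human alphaherpesvirus 1",
]

# Matching excerpt of the krona file above (note: no species line for the virus).
KRONA_EXCERPT = [
    "10203 Unclassified",
    "0 k__Viruses",
    "0 k__Viruses o__Herpesvirales",
    "0 k__Viruses o__Herpesvirales f__Herpesviridae",
    "0 k__Viruses o__Herpesvirales f__Herpesviridae g__Simplexvirus",
]

missing = report_direct_reads(REPORT_EXCERPT) - krona_reads(KRONA_EXCERPT)
print("reads missing from krona file:", missing)
```

On this excerpt the difference equals the 9794 reads of Human alphaherpesvirus 1, which is exactly the line that never reaches the krona file.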

jenniferlu717 commented 3 years ago

Thanks for letting me know! I'm going to take a look right now.

jenniferlu717 commented 3 years ago

I think I uploaded a fix, but let me know if it doesn't work.