DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Getting the full taxonomic lineage string #316

Open ursky opened 4 years ago

ursky commented 4 years ago

Dear developers,

Kraken1's translate feature produced an incredibly useful summary of the taxonomy of each sequence by providing the full taxonomic lineage (e.g. cellular organisms;Bacteria;Proteobacteria;Gammaproteobacteria). Kraken2, on the other hand, reports only the leaf taxon (the finest rank classified), even with the --use-names flag. Is there a way of getting the full lineage? I like kraken2, but I am forced to revert to kraken1 because of this feature.

Thank you in advance!

zmunro commented 3 years ago

I recently wrote a script to do exactly this using the regular output and the report from kraken2. I have a PR open to merge the script into the KrakenTools repo, but you can find the code on my fork of the repo here.

isardi commented 3 years ago

You can also use the --use-mpa-style flag in conjunction with --report. With this option the report is written in a format similar to MetaPhlAn's output, with the full taxonomic lineage on each line. -I
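For readers who want to post-process such a report, here is a minimal Python sketch. It assumes each line holds a lineage whose levels are joined by "|" and carry a rank prefix such as "d__" or "p__", followed by a tab and a read count; that layout is an assumption about the report, and the function name parse_mpa_line is hypothetical. Check it against your own output before relying on it.

```python
def parse_mpa_line(line):
    """Split one MPA-style report line into ([(rank_prefix, name), ...], count).

    Assumed layout: "d__Bacteria|p__Proteobacteria<TAB>42".
    """
    lineage, count = line.rstrip("\n").split("\t")
    ranks = []
    for part in lineage.split("|"):
        # "p__Proteobacteria" -> ("p", "Proteobacteria")
        prefix, _, name = part.partition("__")
        ranks.append((prefix, name))
    return ranks, int(count)


if __name__ == "__main__":
    # Toy line, not real Kraken 2 output.
    print(parse_mpa_line("d__Bacteria|p__Proteobacteria\t42\n"))
```

Note that each parsed line carries a lineage and a count only; nothing in it identifies individual reads.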

zmunro commented 3 years ago

@isardi The problem with the --use-mpa-style flag is that it does not give you the read IDs with each lineage; it only lists the full taxonomic lineages by themselves.
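One way to get a lineage per read is to join the regular per-read output with the standard (non-MPA) report. The Python sketch below is an illustration of that idea, not the script linked above. It assumes the standard report is tab-separated with the taxid in the second-to-last column and the name in the last column, indented two spaces per taxonomic level, and that the per-read output has the read ID in column 2 and the taxid in column 3. The function names and the toy data are hypothetical.

```python
def lineages_from_report(report_lines):
    """Map taxid -> 'root;...;leaf' using the name indentation in a standard report."""
    lineages = {}
    stack = []  # names along the current path; list index equals depth
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        taxid, name = fields[-2].strip(), fields[-1]
        # Assumption: two leading spaces per level of depth.
        depth = (len(name) - len(name.lstrip(" "))) // 2
        del stack[depth:]  # drop anything at this depth or deeper
        stack.append(name.strip())
        lineages[taxid] = ";".join(stack)
    return lineages


def annotate_reads(classification_lines, lineages):
    """Yield (read_id, lineage) for each line of the per-read output."""
    for line in classification_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        read_id, taxid = fields[1], fields[2]
        # With --use-names, column 3 looks like "Name (taxid 1236)".
        if taxid.endswith(")") and "(taxid " in taxid:
            taxid = taxid.rsplit("(taxid ", 1)[1][:-1]
        yield read_id, lineages.get(taxid, "unknown")


# Toy data for illustration only, not real Kraken 2 output.
REPORT = [
    " 50.00\t1\t1\tU\t0\tunclassified\n",
    " 50.00\t1\t0\tR\t1\troot\n",
    " 50.00\t1\t0\tR1\t131567\t  cellular organisms\n",
    " 50.00\t1\t0\tD\t2\t    Bacteria\n",
    " 50.00\t1\t0\tP\t1224\t      Proteobacteria\n",
    " 50.00\t1\t1\tC\t1236\t        Gammaproteobacteria\n",
]
READS = [
    "C\tread1\t1236\t150\t1236:116\n",
    "U\tread2\t0\t150\t0:116\n",
]

if __name__ == "__main__":
    table = lineages_from_report(REPORT)
    for read_id, lineage in annotate_reads(READS, table):
        print(read_id, lineage, sep="\t")
```

The stack-based parse relies on the report listing each taxon directly after its ancestors, so the report must be read in its original order.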

B-1991-ing commented 1 year ago

Hi,

I am trying to use kraken2_translate.py (https://github.com/zmunro/KrakenTools#kraken2_translatepy). I ran kraken2 as follows:

KRAKEN2DB=database.kraken
THREADNUM=10
KRAKEN2_REPORT=nt.kraken.report
KRAKEN2_CLASSIFICATION=nt.kraken.classification
/services/tools/kraken/20230522/kraken2 --db $KRAKEN2DB --threads $THREADNUM --report $KRAKEN2_REPORT > $KRAKEN2_CLASSIFICATION

But it failed with the following error:

Screenshot 2023-05-27 at 10 43 24

Do you have any idea about the error?

Thank you very much.

Best,

Bing