DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

How to get phylogenetic tree from kraken or bracken output file? #323

Open ctlshcxy opened 3 years ago

ctlshcxy commented 3 years ago

I'd like to perform diversity analysis based on UniFrac distance. However, I have no idea how to get a phylogenetic tree conveniently. The taxonomy names and IDs obtained from kraken2 could not be completely recognized by phyloT (v2, https://phylot.biobyte.de/).

ctlshcxy commented 3 years ago

@DerrickWood @jenniferlu717 Hi, can you help me with the above question?

jenniferlu717 commented 3 years ago

Apologies, I've just finished my dissertation.

I'm not sure exactly what kind of analysis you are looking to do. We are almost finished with the development of both beta diversity and alpha diversity calculation scripts for kraken output files, so that may be of interest.

Exactly what output from kraken2 are you trying to provide to phyloT? I haven't used the tool before.

When used with --report myreport.kreport, --use-mpa-style writes a report in which each line contains the full taxonomy string of its taxon.
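
To illustrate what such a report looks like once loaded, here is a minimal Python sketch that parses mpa-style lines. It assumes each line is a "|"-separated lineage with rank prefixes (for example d__Bacteria|p__Proteobacteria), followed by a tab and a read count; the function names are hypothetical and the sample lines are made up for illustration, not real kraken2 output.

```python
from typing import Dict, List, Tuple


def parse_mpa_line(line: str) -> Tuple[List[str], int]:
    """Split one mpa-style line into (lineage, read_count).

    Assumed layout: "<rank>__<name>|<rank>__<name>|...<TAB><count>".
    """
    lineage_str, count_str = line.rstrip("\n").rsplit("\t", 1)
    return lineage_str.split("|"), int(count_str)


def parse_mpa_report(text: str) -> Dict[str, int]:
    """Map each full taxonomy string to its read count."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        # Skip blank lines and comment-style header lines, if any.
        if not line.strip() or line.startswith("#"):
            continue
        lineage, count = parse_mpa_line(line)
        counts["|".join(lineage)] = count
    return counts


# Illustrative input only; not taken from a real report.
example_report = (
    "d__Bacteria\t100\n"
    "d__Bacteria|p__Proteobacteria\t60\n"
)
example_counts = parse_mpa_report(example_report)
```

The resulting dictionary keeps the whole lineage as the key, so the lineage can be split on "|" again later if a per-rank view is needed.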

smdabdoub commented 3 years ago

@ctlshcxy You can use Mike Lee's tool GToTree. It will take a list of NCBI accession IDs, GenBank files, or even sequence data for each genome and create a phylogenetic tree out of them.

humbleflowers commented 3 years ago

@jenniferlu717 Hello, have you released those scripts for alpha and beta diversity calculation? I am looking forward to using them on my output files.

smdabdoub commented 3 years ago

@humbleflowers You can use kraken-biom to combine the kraken report files from each sample into a single BIOM table with all the counts per sample and taxonomy strings, and use that in downstream tools like phyloseq or QIIME2 to do diversity and other calculations.
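
To make the idea of a combined table concrete, here is a small Python sketch of merging per-sample counts into one taxa-by-samples matrix. This is not kraken-biom's implementation or its output format, only an illustration of the concept; the function name, sample IDs, and counts are hypothetical.

```python
from typing import Dict, List, Tuple


def build_count_table(
    samples: Dict[str, Dict[str, int]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Combine per-sample {taxonomy_string: count} dicts into one table.

    Returns (taxa, sample_ids, matrix), where matrix[i][j] is the count of
    taxa[i] in sample_ids[j]. Taxa missing from a sample get a count of 0.
    """
    sample_ids = sorted(samples)
    taxa = sorted({taxon for counts in samples.values() for taxon in counts})
    matrix = [
        [samples[sample].get(taxon, 0) for sample in sample_ids]
        for taxon in taxa
    ]
    return taxa, sample_ids, matrix


# Made-up counts for two hypothetical samples.
per_sample = {
    "S1": {"d__Bacteria|p__Firmicutes": 3, "d__Bacteria|p__Proteobacteria": 1},
    "S2": {"d__Bacteria|p__Proteobacteria": 4},
}
taxa, sample_ids, matrix = build_count_table(per_sample)
```

A table in this shape (observations as rows, samples as columns, with the taxonomy string attached to each row) is conceptually what downstream diversity tools expect from a BIOM file.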