qiyunzhu / woltka

Woltka: a versatile meta'omic data classifier
BSD 3-Clause "New" or "Revised" License

How to get the phylogenetic tree after following the recommended process to get table.biom #185

Open diego00012138 opened 1 year ago

diego00012138 commented 1 year ago

Hi, the table.biom file seems to be just a feature table. How do I get the corresponding feature (OTU) sequences, as in a 16S analysis? I'm also confused by the tree.qza used in ogu.md for "OGU analysis using QIIME 2": shouldn't the tree be generated from my own features (OTUs)? I would appreciate it if you could answer. Sincerely

qiyunzhu commented 1 year ago

Hello @diego00012138, thanks for your interest in the program. There is no feature sequence associated with table.biom. Each feature is a reference genome in a pre-defined database. You can query the database to get the genome sequence and/or its metadata. The phylogenetic tree tree.qza is also provided by the database. No new tree is generated in a Woltka analysis. This is different from a typical 16S rRNA OTU/ASV analysis.

If you intend to directly infer genome sequences in your samples and their phylogenetic relationships, you may consider de novo assembly of the metagenomes, followed by a variety of analyses, including phylogenetic reconstruction, using programs such as PhyloPhlAn, or phylogenetic placement, using our new program DEPP. The entire assembly workflow will involve many more steps than read mapping, with improved resolution but reduced sensitivity. Here is an example workflow from one of our other programs.

Hope it helps!

mpdoane2 commented 1 week ago

Hi, I am following up on this question from @diego00012138. I was also interested in obtaining a tree containing only the genome IDs (G########) identified in each of my samples. Since Woltka does not provide a tree for each sample, I filtered the original reference tree (WoL_database/trees/tree.nwk) for those IDs. I used this rough script (I'm not great with Python) and was able to get my per_sample_tree.nwk file. I was interested in taking the Woltka output over to iCAMP to infer community assembly processes. I hope this helps as a framework for getting that result from Woltka output: https://github.com/mpdoane2/Metagenomic_Phylogenetics/blob/main/parsing_newick_per_sample.py

Thank you for organizing this cool tool!
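
For readers who want to see what the filtering step described above involves, here is a minimal sketch in plain Python. It is not the linked script and not part of Woltka. It assumes a simple Newick string with unquoted labels and no bracketed comments. The genome IDs and the tree in the example are hypothetical placeholders, and the helper names `parse_newick`, `prune` and `to_newick` are made up for illustration.

```python
import re


def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Assumption: labels are unquoted and the tree has no [comments].
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", text)
    root = {"name": "", "length": None, "children": []}
    stack, node = [], root
    for tok in tokens:
        tok = tok.strip()
        if not tok:
            continue  # skip whitespace between tokens
        if tok == "(":  # open a clade: descend into its first child
            child = {"name": "", "length": None, "children": []}
            node["children"].append(child)
            stack.append(node)
            node = child
        elif tok == ",":  # next sibling under the current parent
            node = {"name": "", "length": None, "children": []}
            stack[-1]["children"].append(node)
        elif tok == ")":  # close the clade: back to its parent
            node = stack.pop()
        elif tok == ";":
            break
        else:  # "name:length", either part may be absent
            name, _, length = tok.partition(":")
            node["name"] = name
            node["length"] = float(length) if length else None
    return root


def prune(node, keep):
    """Return a copy of the tree restricted to tips named in `keep`."""
    if not node["children"]:
        if node["name"] in keep:
            return {"name": node["name"], "length": node["length"], "children": []}
        return None
    kids = [k for k in (prune(c, keep) for c in node["children"]) if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        # Collapse a node left with one child, summing branch lengths.
        only = kids[0]
        if node["length"] is not None or only["length"] is not None:
            only["length"] = (only["length"] or 0.0) + (node["length"] or 0.0)
        return only
    return {"name": node["name"], "length": node["length"], "children": kids}


def to_newick(node):
    """Serialize the nested dicts back to Newick (without the final ';')."""
    label = node["name"]
    if node["length"] is not None:
        label += f":{node['length']!r}"
    if node["children"]:
        return "(" + ",".join(to_newick(c) for c in node["children"]) + ")" + label
    return label


# Hypothetical reference tree and per-sample genome IDs, for illustration only.
reference = "((G000001:1,G000002:2)n1:0.5,(G000003:1,G000004:1)n2:0.5);"
sample_ids = {"G000001", "G000003", "G000004"}

pruned = prune(parse_newick(reference), sample_ids)
pruned_newick = to_newick(pruned) + ";"
print(pruned_newick)
```

In practice, `sample_ids` would be the feature IDs with non-zero counts for one sample in table.biom, and the reference tree would be read from the database's tree file. The functions are recursive, so a very deep tree may hit Python's recursion limit. For real data, an established library is probably the safer choice, for example `TreeNode.shear` in scikit-bio.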