biocore / q2-greengenes2

A QIIME 2 plugin for interaction with the Greengenes2 database
BSD 3-Clause "New" or "Revised" License

Where can I find the relabel mapping files? #25

Open raytaoliu opened 1 month ago

raytaoliu commented 1 month ago

Hi, I am trying to compare WGS and 16S data from the same samples. I use QIIME 2 with SILVA to process the 16S data and Woltka with WoL to process the WGS data, but when I generate trees, the labels are not the same. Can I relabel one of them so that both use the same format? Thanks!

wasade commented 1 month ago

Hi @lrt666, the SILVA 16S taxonomy and phylogeny are not compatible with the Web of Life. I recommend processing the data using a common reference and phylogeny like Greengenes2.

raytaoliu commented 1 month ago

Thanks for your quick response! I tried processing the 16S data with Greengenes2 using QIIME 2. Can we also use QIIME 2 to process WGS data with Greengenes2, or do we need to use the Greengenes2 database with other tools like Woltka?

wasade commented 1 month ago

Greengenes2 includes a subset of the Web of Life version 2. We've only applied Woltka, via the SHOGUN parameter set with bowtie2 (see https://github.com/qiita-spots/qp-woltka/blob/e62a5550c96b0656cee853d96a593a2c43555f8e/qp_woltka/woltka.py#L179-L183), for use with Greengenes2. That said, it's plausible other methods of mapping to the Web of Life would work -- what's needed is to produce a feature table expressed using the Web of Life version 2 OGU identifiers. That table can then be filtered using q2-greengenes2.

Please direct questions about QIIME 2 itself to its forum, though.

raytaoliu commented 1 month ago

Hi, thanks very much for your patience. I still have one more question: is there any guidance on how to use Woltka with Greengenes2?

raytaoliu commented 1 month ago

Hi, I successfully processed the Woltka OGU table with Greengenes2. For my V4 data, can I relabel the ASVs to OGU identifiers, or the OGU IDs to ASVs?

wasade commented 1 month ago

With Woltka, and assuming the OGU table is mapped against WoLr2, it's just a filter-features action.

V4 ASVs already exist in the phylogeny. The only trick is to use the correct taxonomy and/or phylogeny artifact -- if the ASVs are expressed as sequences, then use the .asv files. If they are hashed, then use the .md5 files.
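As a rough illustration of the sequence-versus-hashed distinction, the sketch below shows how a hashed ID can be derived from an ASV sequence and how one might guess which kind of ID a table uses. It assumes the hashed IDs are MD5 hex digests of the sequence string. The helper names (`md5_feature_id`, `suggest_artifact_kind`) and the example sequence are hypothetical and are not part of q2-greengenes2.

```python
import hashlib


def md5_feature_id(sequence: str) -> str:
    """Return the MD5 hex digest of an ASV sequence (assumed hashing scheme)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


def looks_hashed(feature_id: str) -> bool:
    """Heuristic: an MD5 hex digest is 32 lowercase hexadecimal characters."""
    return len(feature_id) == 32 and all(
        c in "0123456789abcdef" for c in feature_id
    )


def looks_like_sequence(feature_id: str) -> bool:
    """Heuristic: an uppercase string made up only of nucleotide characters."""
    return bool(feature_id) and set(feature_id) <= set("ACGTN")


def suggest_artifact_kind(feature_ids) -> str:
    """Guess whether the '.asv' or '.md5' style artifacts match these IDs."""
    ids = list(feature_ids)
    if ids and all(looks_like_sequence(i) for i in ids):
        return "asv"
    if ids and all(looks_hashed(i) for i in ids):
        return "md5"
    return "unknown"


# Hypothetical example sequence, for illustration only.
example_sequence = "TACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAG"
print(suggest_artifact_kind([example_sequence]))
print(suggest_artifact_kind([md5_feature_id(example_sequence)]))
```

If the result is "unknown", the feature IDs are probably in some other namespace and would need to be inspected by hand.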

raytaoliu commented 1 month ago

Thanks very much for your quick response and patience! I would like to take the V4 region ASVs from my dataset and output a feature table where the feature IDs are formatted like GXXXXX, similar to the format used in Woltka. Is that possible?

wasade commented 1 month ago

The IDs associated with the Web of Life r2 database are derived from the GenBank/RefSeq accessions. There is no way to construct those from ASVs, but there also isn't a need to, as at the end of the day they are just identifiers.

raytaoliu commented 1 month ago

Thanks for your clarification. I am processing the same dataset with both 16S and shotgun data. I want to compare them from the perspective of the phylogenetic tree, so I want to align their feature IDs.

wasade commented 1 month ago

The feature ID namespace for 16S and shotgun will be different. But as long as the IDs map onto Greengenes2 phylogeny tips, you should be able to proceed with a phylogenetic evaluation.
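A minimal sketch of that check, in plain Python: given the feature IDs from each table and the set of tip names from the phylogeny, report which IDs are present as tips and which are missing. All IDs and tip names below are made up for illustration, and `split_by_tips` is a hypothetical helper rather than a q2-greengenes2 function. In practice the tip names would be read from the phylogeny artifact that matches the ID style in use.

```python
def split_by_tips(feature_ids, tip_names):
    """Partition feature IDs into those found among the tree tips and those not."""
    tips = set(tip_names)
    present = [f for f in feature_ids if f in tips]
    missing = [f for f in feature_ids if f not in tips]
    return present, missing


# Hypothetical tip names: the two ID namespaces sit side by side in one tree.
tree_tips = {
    "G000000001",                        # made-up genome-style ID
    "G000000002",
    "0123456789abcdef0123456789abcdef",  # made-up hashed ASV ID
}

# Hypothetical feature IDs from the shotgun and 16S tables.
shotgun_ids = ["G000000001", "G999999999"]
amplicon_ids = ["0123456789abcdef0123456789abcdef"]

for label, ids in (("shotgun", shotgun_ids), ("16S", amplicon_ids)):
    present, missing = split_by_tips(ids, tree_tips)
    print(f"{label}: {len(present)} on tips, {len(missing)} missing")
```

The two tables never need to share IDs with each other. Each only needs its own IDs to resolve to tips in the same tree.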