grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
http://grunwaldlab.github.io/metacoder_documentation

Beginner issue #299

Open Asajoh opened 3 years ago

Asajoh commented 3 years ago

I'm trying to move my workflow from Qiime2 to R and I'd like to try to construct a heat tree using metacoder. However, I'm unsure about which output files I need from Qiime2. I can import whatever is needed using qiime2R, but which files exactly am I looking for?

Can for example DADA2 tables be used? Do I need to create a phyloseq object?

In your PLOS paper, you import a .fa file. Is it necessary to go all the way back to raw data?

zachary-foster commented 3 years ago

> I can import whatever is needed using qiime2R, but which files exactly am I looking for?

You will need the taxonomic data at a minimum, plus the abundance matrix if you want to plot relative abundance or filter the plotted taxa by abundance. I have not used Qiime2 or qiime2R, but from a quick look over the qiime2R readme, there seem to be multiple ways to do it. You could read your abundance matrix and taxonomy table with read_qza, combine them on the sequence hash ID (look up "join" operations for a robust way to combine two tables), and convert the result to a taxmap object using parse_tax_data. The quickest and easiest way would probably be to build a phyloseq object and then convert that to a taxmap object using parse_phyloseq. Something like:

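```r
library(qiime2R)   # provides qza_to_phyloseq()
library(metacoder) # provides parse_phyloseq()

# Example paths; substitute your own Qiime2 artifacts.
physeq <- qza_to_phyloseq(
    features = "inst/artifacts/2020.2_moving-pictures/table.qza",
    tree = "inst/artifacts/2020.2_moving-pictures/rooted-tree.qza",
    taxonomy = "inst/artifacts/2020.2_moving-pictures/taxonomy.qza",
    metadata = "inst/artifacts/2020.2_moving-pictures/sample-metadata.tsv"
)
# Convert the phyloseq object to a taxmap object for metacoder.
my_taxmap <- parse_phyloseq(physeq)
```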
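
The join-based route mentioned above could look roughly like the following sketch. It uses small made-up tables in place of the read_qza output (the hash IDs, sample names, counts, and taxa are hypothetical). It assumes the taxonomy table has Feature.ID and Taxon columns and that ranks are written like k__Bacteria separated by "; ". It only calls parse_tax_data if metacoder is installed. The class_sep and class_regex values will likely need adjusting to match a real classifier's output.

```r
# Toy stand-ins for what read_qza("table.qza")$data and
# read_qza("taxonomy.qza")$data are assumed to look like (hypothetical values).
abundance <- matrix(
  c(10, 0, 5,
     3, 7, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("hash_a", "hash_b"),
                  c("sample_1", "sample_2", "sample_3"))
)
taxonomy <- data.frame(
  Feature.ID = c("hash_a", "hash_b", "hash_c"),
  Taxon = c("k__Bacteria; p__Firmicutes",
            "k__Bacteria; p__Proteobacteria",
            "k__Archaea; p__Euryarchaeota"),
  stringsAsFactors = FALSE
)

# Move the row names into a column so both tables share a key.
abundance_df <- data.frame(Feature.ID = rownames(abundance), abundance,
                           stringsAsFactors = FALSE)

# Inner join: keep only the features present in both tables.
otu_table <- merge(taxonomy, abundance_df, by = "Feature.ID")

# Convert to a taxmap object, but only if metacoder is available.
if (requireNamespace("metacoder", quietly = TRUE)) {
  my_taxmap <- metacoder::parse_tax_data(
    otu_table,
    class_cols = "Taxon",           # column holding the classification
    class_sep = "; ",               # assumed separator between ranks
    class_regex = "^(.+)__(.+)$",   # assumed "rank__name" format
    class_key = c(tax_rank = "info", tax_name = "taxon_name")
  )
  print(my_taxmap)
}
```

Using merge with a named key, rather than binding columns by position, means features missing from either table are dropped instead of being silently misaligned.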

> Can for example DADA2 tables be used?

Yes, that is what I use for my analyses. Any source of taxonomic data can be used, but some are easier to use than others. Tables of taxonomic classifications, like those produced by qiime2R::parse_taxonomy() or dada2, are generally easiest. You can also parse classification strings, such as the headers of reference sequence databases.
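
To make "parsing classification strings" concrete, here is a rough sketch on two invented FASTA-style headers. The header layout (">ID classification", ranks written as rank__name and separated by ";") is an assumption for illustration, not the format of any particular database. The base R part only shows the structure being extracted. The guarded call at the end assumes metacoder's extract_tax_data accepts these key/regex arguments, and its patterns would need adapting to real headers.

```r
# Hypothetical reference-database headers (invented for illustration).
headers <- c(
  ">seq_1 k__Bacteria;p__Firmicutes;c__Bacilli",
  ">seq_2 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria"
)

# Split each header into a sequence ID and a classification string.
parts <- regmatches(headers, regexec("^>([^ ]+) (.+)$", headers))
seq_ids <- vapply(parts, function(p) p[2], character(1))
lineage_strings <- vapply(parts, function(p) p[3], character(1))

# Split each classification into taxa, then each taxon into rank and name.
lineages <- lapply(
  strsplit(lineage_strings, ";", fixed = TRUE),
  function(taxa) {
    data.frame(
      rank = sub("__.*$", "", taxa),
      name = sub("^[^_]*__", "", taxa),
      stringsAsFactors = FALSE
    )
  }
)
names(lineages) <- seq_ids

# The same idea with metacoder, if it is installed: one regex with a
# capture group per piece of information, described by `key`.
if (requireNamespace("metacoder", quietly = TRUE)) {
  obj <- metacoder::extract_tax_data(
    headers,
    key = c(my_seq = "info", my_class = "class"),
    regex = "^>([^ ]+) (.+)$",
    class_sep = ";",
    class_regex = "^(.+)__(.+)$",
    class_key = c(tax_rank = "info", tax_name = "taxon_name")
  )
  print(obj)
}
```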

> Do I need to create a phyloseq object?

No, but it can be a simplifying intermediate step in some cases, as described above.

> In your PLOS paper, you import a .fa file. Is it necessary to go all the way back to raw data?

No, I was probably plotting information from a reference database in that example, rather than from metabarcoding results. I usually start with an abundance matrix with a column containing taxonomic information and a separate sample metadata table.