joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

How to create a phyloseq object & extract top 10 phyla and genus? #1125

Open Lashari37 opened 5 years ago

Lashari37 commented 5 years ago

Hi @mikemc

Thanks for your email.

I am new to R and phyloseq. However, I need to analyze 16S rRNA data (please see the attached file for the data format). I have the OTU table and absolute abundance table in Excel format. I need to know how to create a phyloseq object from this single file. I have gone through all the demo materials and issues but couldn't work it out. Could you please advise on that, or send code to generate the phyloseq object from an Excel file?

Also, I have generated the phyloseq object after importing the BIOM file of this same data. Now, I need to know the following things:

  1. Data rarefaction.
  2. I want to pick the top 10 phyla and genera.
  3. Rearrangement of data according to sample variables (renaming the samples).
  4. How can I check the significance (two factors; p-value) of treatment on the abundance of X spp?

phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 2561 taxa and 12 samples ]
tax_table()   Taxonomy Table:    [ 2561 taxa by 7 taxonomic ranks ]

Within this phyloseq object, I cannot find the sample table: the sample data is showing NULL. I don't know what that means.

I would appreciate your help in this regard.

Cheers Jamal

Data 16S rRNA.pdf

mikemc commented 5 years ago

Your data looks like it is in the old QIIME format. If you turn it into a tab-separated file, you can import it with the import_qiime() function. The easiest way is to save it as a tab-separated file from within Excel. You can also convert it within R. If you have the tidyverse installed (install.packages("tidyverse")), you could convert it like this:

tb <- readxl::read_excel("data.xlsx")
readr::write_tsv(tb, "data.tsv")

Once you have it in tab-separated format (extension ".tsv" by convention), you can import it as

library(phyloseq)
physeq <- import_qiime("data.tsv")

Then check that the OTU table and taxonomy were read correctly:

head(otu_table(physeq))
head(tax_table(physeq))
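If the imported object prints without sample data, that usually just means no sample metadata was supplied at import. A minimal sketch of building a phyloseq object by hand and attaching sample data follows. The toy matrices, the sample names S1 to S3, and the treatment and block columns are all made up for illustration; they are not from the attached file.

```r
library(phyloseq)

# Toy count matrix (hypothetical): rows are OTUs, columns are samples.
otu_mat <- matrix(
  c(10, 0, 5,
     3, 8, 1,
     0, 2, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2", "S3"))
)

# Toy taxonomy (hypothetical): row names must match the OTU names above.
tax_mat <- matrix(
  c("Bacteria", "Proteobacteria",
    "Bacteria", "Firmicutes",
    "Bacteria", "Actinobacteria"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("Kingdom", "Phylum"))
)

# Hypothetical sample metadata: row names must match the sample names.
meta <- data.frame(
  treatment = c("control", "treated", "treated"),
  block     = c("a", "a", "b"),
  row.names = c("S1", "S2", "S3")
)

# Build the object from all three components at once.
ps <- phyloseq(
  otu_table(otu_mat, taxa_are_rows = TRUE),
  tax_table(tax_mat),
  sample_data(meta)
)

# Or attach sample data to an existing object that has none,
# for example one imported without a mapping file.
ps_no_meta <- phyloseq(otu_table(otu_mat, taxa_are_rows = TRUE), tax_table(tax_mat))
sample_data(ps_no_meta) <- sample_data(meta)
```

With real data, the metadata table could be read from a file with read.delim() or readxl::read_excel(), as long as its row names match sample_names(physeq).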

As for learning how to do 1 through 4, you might start with the phyloseq tutorials and vignettes (run browseVignettes("phyloseq") inside R). For rarefying to an even depth, see help(rarefy_even_depth), but also http://dx.plos.org/10.1371/journal.pcbi.1003531. There are many ways to test for associations in microbiome data, but the article above recommends DESeq2, and Joey has provided a tutorial and a vignette showing how to do this (I'm not sure which is more up to date).
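As a rough sketch of items 1 and 2, the following rarefies to an even depth and keeps the 10 most abundant phyla. It uses the GlobalPatterns example dataset that ships with phyloseq in place of the data from this thread, and ranking taxa by total count is just one reasonable choice for "top 10".

```r
library(phyloseq)

# Example dataset bundled with phyloseq (stands in for the thread's data).
data(GlobalPatterns)

# Drop the tree so that agglomeration below stays fast.
ps <- phyloseq(
  otu_table(GlobalPatterns),
  tax_table(GlobalPatterns),
  sample_data(GlobalPatterns)
)

# 1. Rarefy to an even depth (defaults to the smallest library size).
#    rngseed makes the random subsampling reproducible.
ps_rare <- rarefy_even_depth(ps, rngseed = 1, verbose = FALSE)

# 2. Collapse OTUs to the Phylum rank, then keep the 10 phyla
#    with the largest total counts across all samples.
ps_phylum <- tax_glom(ps_rare, taxrank = "Phylum")
top10 <- names(sort(taxa_sums(ps_phylum), decreasing = TRUE))[1:10]
ps_top10 <- prune_taxa(top10, ps_phylum)

# Phylum names of the retained taxa.
top10_names <- as(tax_table(ps_top10), "matrix")[, "Phylum"]
print(top10_names)
```

The same pattern should work for genera by passing taxrank = "Genus" to tax_glom().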

Lashari37 commented 5 years ago

@mikemc Thanks for your time and suggestions. I need to know how to arrange the sample variables and phyla (or any rank level) for analysis as shown below. It might need a transpose function, I guess. Also, how do I rename the sample variables? These may be dumb questions, but I need advice, please.

I am looking into the suggested article for more clarifications.

Cheers Jamal

$table
          Proteobacteria Gemmatimonadetes Firmicutes Actinobacteria
0.60.5.2            7513             2888        998          10075
0.0.5.1            10438             2699        184          10819
0.60.10.1           5756             2857       3175          11374
0.90.5.1            2374              483       4981          13008
0.0.5.2             9633             3240        346          11276
0.60.5.1            6449             2734       2904          11185
0.30.10.2          10157             2816        237          10612
0.0.10.1            9301             3063        222          11142
0.60.10.2           6830             2211       3672          10105
0.30.10.1           9104             2621        352          12337
0.90.10.2           4856             2026       6950          11120
0.30.5.2           10071             2900        296          10177
0.0.10.2            9672             3245        193          10514
0.90.10.1           5612             1368       7790           9306
0.90.5.2            8227             2176       1564          12078
0.30.5.1            9730             3083        297          11042
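One possible way to get a samples-by-phylum table like the one above, and to rename samples, is sketched below. It again uses the bundled GlobalPatterns dataset rather than the data in this thread, and the new sample names ("sample_1", "sample_2", ...) are purely illustrative.

```r
library(phyloseq)

# Example dataset bundled with phyloseq; the tree is dropped to keep
# agglomeration fast.
data(GlobalPatterns)
ps <- phyloseq(
  otu_table(GlobalPatterns),
  tax_table(GlobalPatterns),
  sample_data(GlobalPatterns)
)

# Rename samples by assigning a character vector of new names
# (illustrative names only).
sample_names(ps) <- paste0("sample_", seq_len(nsamples(ps)))

# Collapse OTUs to one row per phylum.
ps_phylum <- tax_glom(ps, taxrank = "Phylum")

# Extract counts as a plain matrix with taxa in rows.
counts <- as(otu_table(ps_phylum), "matrix")
if (!taxa_are_rows(ps_phylum)) counts <- t(counts)

# Label each row with its phylum name, matching rows by taxon ID.
tax <- as(tax_table(ps_phylum), "matrix")
rownames(counts) <- tax[rownames(counts), "Phylum"]

# Transpose so that samples are rows and phyla are columns.
phylum_by_sample <- t(counts)
print(phylum_by_sample[1:3, 1:4])
```

Renaming before building the table means the new names carry through to the row labels of phylum_by_sample.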

Lashari37 commented 5 years ago

image

@mikemc What code is required to run this test/analysis (see image)?