Open Stabilo12 opened 5 years ago
Do you want the standard deviation of the proportion for each taxon across samples within each group? Using the dplyr
package you could do
library(dplyr)
# Assume `ps` is a phyloseq object that has been glommed to the Family level
# and converted to proportions.
tb <- psmelt(ps) %>%
as_tibble
tb0 <- tb %>%
group_by(OTU, Intervention_group) %>%
summarize(Mean = mean(Abundance), SD = sd(Abundance)) %>%
ungroup
to get a dataframe tb0
with the mean and standard deviation of the proportion of each OTU / Family across samples within each Intervention_group
. Note, you should be careful about renaming the OTUs by their Family ignoring the higher level classifications, since the Family name might not be unique. You need to check first:
tax_table(ps)[,"Family"] %>% anyDuplicated
For instance, there are multiple cases in the GlobalPatterns
example dataset where the "Family" name is not enough to specify a family.
data(GlobalPatterns)
ps <- tax_glom(GlobalPatterns, "Family")
tax <- tax_table(ps)
tax[duplicated(tax[,"Family"]),"Family"] %>% unique
#> Taxonomy Table: [6 taxa by 1 taxonomic ranks]:
#> Family
#> 96014 "Alteromonadaceae"
#> 117364 "Piscirickettsiaceae"
#> 530809 "Chromatiaceae"
#> 7609 "Thiotrichaceae"
#> 182827 "Sinobacteraceae"
#> 534609 "Rhodobacteraceae"
tax[which(tax[,"Family"] == "Piscirickettsiaceae"),]
#> Taxonomy Table: [3 taxa by 7 taxonomic ranks]:
#> Kingdom Phylum Class
#> 235822 "Bacteria" "Proteobacteria" "Gammaproteobacteria"
#> 117364 "Bacteria" "Proteobacteria" "Gammaproteobacteria"
#> 574649 "Bacteria" "Proteobacteria" "Gammaproteobacteria"
#> Order Family Genus Species
#> 235822 "Oceanospirillales" "Piscirickettsiaceae" NA NA
#> 117364 "Thiotrichales" "Piscirickettsiaceae" NA NA
#> 574649 "Methylococcales" "Piscirickettsiaceae" NA NA
This is great! Thanks a lot!
Hello I would like to ask how I can include the total abundance in this table Thanks a lot
Hi,
I'm quite new to R and Phyloseq and have a question, which is probably really easy, but I have no clue how to calculate a standard deviation of the taxa in a phyloseq object And I couldn't find the answer online..
I'm albe to calculate the abundance of the taxa (on family level) per intervention group:
ps.taxa <- tax_glom(ps, taxrank = "Family") ps.taxa <-merge_samples(ps.taxa, "Group") ps.taxa <- transform_sample_counts(ps.taxa , function(x) x/sum(x)) ps.dat <- data.frame(otu_table(ps.taxa)) names(ps.dat) <- ((ps.taxa@tax_table@.Data) [, "Family"])
How can I get the SD? Thanks in advanced!