phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
Hi, I'm comparing a set of biological samples that are classified into two groups of interest called "Status" (1 and 0). I had the following phyloseq object:
mirl_phyloseq_final
phyloseq-class experiment-level object
otu_table() OTU Table: [ 8229 taxa and 42 samples ]
sample_data() Sample Data: [ 42 samples by 6 sample variables ]
tax_table() Taxonomy Table: [ 8229 taxa by 6 taxonomic ranks ]
Then I agglomerated it by family:
mirl_phyloseq_final.family=tax_glom (mirl_phyloseq_final, taxrank="Family")
mirl_phyloseq_final.family
phyloseq-class experiment-level object
otu_table() OTU Table: [ 364 taxa and 42 samples ]
sample_data() Sample Data: [ 42 samples by 6 sample variables ]
tax_table() Taxonomy Table: [ 364 taxa by 6 taxonomic ranks ]
And then merged it with psmelt:
melted.f=psmelt (mirl_phyloseq_final.family)
So, I got a data.frame (melted.f) with the samples, the found ASVs, the abundance of each one, and the taxonomic classification agglomerated by family that looks like this:
If I want to calculate the differential abundance between the two groups and run Wilcoxon for a family, I do the following:
wilcox.test (Abundance~Status, data=melted.f[which(melted.f$Family=="Mycoplasmataceae"),]).
And I obtain a p-value.
My intention is to run the Wilcoxon test to calculate differential abundance between the two groups of samples, for all families (in a loop or something like) this and then store those p-values in a data frame with the name of each family.
Hi, I'm comparing a set of biological samples that are classified into two groups of interest called "Status" (1 and 0). I had the following phyloseq object:
mirl_phyloseq_final phyloseq-class experiment-level object otu_table() OTU Table: [ 8229 taxa and 42 samples ] sample_data() Sample Data: [ 42 samples by 6 sample variables ] tax_table() Taxonomy Table: [ 8229 taxa by 6 taxonomic ranks ]
Then I agglomerated it by family: mirl_phyloseq_final.family=tax_glom (mirl_phyloseq_final, taxrank="Family")
mirl_phyloseq_final.family phyloseq-class experiment-level object otu_table() OTU Table: [ 364 taxa and 42 samples ] sample_data() Sample Data: [ 42 samples by 6 sample variables ] tax_table() Taxonomy Table: [ 364 taxa by 6 taxonomic ranks ]
And then merged it with psmelt: melted.f=psmelt (mirl_phyloseq_final.family) So, I got a data.frame (melted.f) with the samples, the found ASVs, the abundance of each one, and the taxonomic classification agglomerated by family that looks like this:![abundance](https://github.com/joey711/phyloseq/assets/159451957/3e0ed762-8368-4706-bbea-49a9874b4aad)
If I want to calculate the differential abundance between the two groups and run Wilcoxon for a family, I do the following: wilcox.test (Abundance~Status, data=melted.f[which(melted.f$Family=="Mycoplasmataceae"),]). And I obtain a p-value. My intention is to run the Wilcoxon test to calculate differential abundance between the two groups of samples, for all families (in a loop or something like) this and then store those p-values in a data frame with the name of each family.
Any idea how I can do it?