Open LunavdL opened 5 years ago
Luna, I recommend doing the following, illustrated with the GlobalPatterns
example data. First, some setup:
library(phyloseq)
library(dplyr)
library(ggplot2)
data(GlobalPatterns)
ps <- GlobalPatterns
# Make sure to convert to proportions before computing Bray-Curtis dissimilarity
ps.ra <- ps %>%
  transform_sample_counts(function(x) x / sum(x))
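To see why the proportion step matters, here is a minimal base-R sketch on made-up counts. The matrix `counts` and the helper `bray` are hypothetical and not part of phyloseq. Two samples with identical composition but different sequencing depth look quite dissimilar under Bray-Curtis on raw counts, and identical once converted to proportions.

```r
# Toy data (made up): two samples with the same composition,
# but sample B was sequenced 10x deeper than sample A.
counts <- rbind(A = c(10, 30, 60),
                B = c(100, 300, 600))

# Hypothetical helper: Bray-Curtis dissimilarity between two abundance vectors.
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)

# On raw counts, the depth difference alone makes the samples look dissimilar.
bc_raw <- bray(counts["A", ], counts["B", ])

# Same idea as the transform above: divide each sample by its total.
props <- counts / rowSums(counts)
bc_prop <- bray(props["A", ], props["B", ])

bc_raw   # 900 / 1100, about 0.82
bc_prop  # 0
```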
We'll add the taxon / OTU label to the tax_table, allowing us to filter by it later on (see here)
tax_table(ps.ra) <- cbind(tax_table(ps.ra), OTU = taxa_names(ps.ra))
and make a list of the taxa we'll want to plot. I'm just going to pick the first 10 taxa for this example.
plot_otus <- taxa_names(ps.ra)[1:10]
From here, there are a couple ways we could go. Sticking closest to what you've done above, we'll make the sample plot:
ordu <- ordinate(ps.ra, "PCoA", "bray")
p <- plot_ordination(ps.ra, ordu, type="samples")
to which we'll add the taxa. We get the dataframe for the "taxa" plot using all taxa, and then filter to just the taxa we want to plot:
taxdf <- plot_ordination(ps.ra, ordu, type="taxa", justDF = TRUE)
taxdf <- taxdf %>%
  filter(OTU %in% plot_otus)
and add these to the samples ordination
p +
  geom_point(data = taxdf, color = "blue", shape = 3, size = 3)
Alternately, you could use the biplot option,
p <- plot_ordination(ps.ra, ordu, type="biplot")
p$data <- p$data %>%
  filter((id.type == "Samples") | (OTU %in% plot_otus))
p
That works perfectly - thank you for your help!
Hi,
How do you get a label (i.e. OTU or LCA rank) plotted next to the points from the subset? When I use
p + geom_point(data = taxdf, color = "black", shape = 1) + geom_text(data = taxdf, label = taxdf$LCA)
I get:
Error in FUN(X[[i]], ...) : object 'Treatment' not found
It seems to interfere with the main phyloseq ordination plot in which Treatment has been used in a label.
Thank you in advance.
Kind regards,
T.
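One likely cause of this error: layers added to `p` inherit the plot-level aesthetics by default, so if `p` maps something to `Treatment` (for example through `color = "Treatment"` in `plot_ordination()`), the new layer looks for a `Treatment` column in `taxdf` and fails. Below is a minimal sketch of a workaround, using made-up data frames in place of the phyloseq output. The column names `Axis.1`, `Axis.2`, `Treatment` and `LCA` are assumptions; substitute whatever `p$data` and `taxdf` actually contain. The idea is to give the taxa layers their own mapping and set `inherit.aes = FALSE`:

```r
library(ggplot2)

# Stand-ins (made up) for the sample ordination data and the filtered taxa data frame.
samples <- data.frame(Axis.1 = c(-0.2, 0.1, 0.3),
                      Axis.2 = c(0.1, -0.3, 0.2),
                      Treatment = c("ctrl", "ctrl", "trt"))
taxdf <- data.frame(Axis.1 = c(0.0, 0.25),
                    Axis.2 = c(0.05, -0.1),
                    LCA = c("Genus_a", "Genus_b"))

# Sample plot with a plot-level aesthetic that refers to Treatment.
p <- ggplot(samples, aes(Axis.1, Axis.2, color = Treatment)) +
  geom_point()

# Taxa layers: each has its own mapping and does not inherit color = Treatment.
p_labeled <- p +
  geom_point(data = taxdf, aes(Axis.1, Axis.2),
             color = "black", shape = 1, inherit.aes = FALSE) +
  geom_text(data = taxdf, aes(Axis.1, Axis.2, label = LCA),
            color = "black", vjust = -1, inherit.aes = FALSE)

p_labeled
```

With real phyloseq output, `p` would be the plot returned by `plot_ordination()`, and `names(taxdf)` shows which axis and label columns are available.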
Hi Joey, I have a microbial dataset with taxa (species) and samples and I'm plotting these in a PCoA based on Bray-Curtis. Based on the PERMANOVA, I selected 20 taxa out of >5600 that are most important for the significant differences between treatments. So far I can plot either species or samples, or show these in split graphs, but I would like to make an ordination plot that shows all the samples and the 20 selected taxa in one plot.
I tried to make this work with subset_taxa(), but that also alters the ordination of the samples in the plot, while I would like the plot to be based on the full dataset, and just show the position of a few taxa in addition.
With vegan, I could add this with:
where "sel" was the list of names I wanted to show.
Is this also possible with phyloseq? I really like phyloseq and would prefer to be able to do it all with one package!
This is the piece of code I have so far:
Cheers, Luna