Open LeonardosMageiros opened 3 years ago
This is the gplot code I used and I made a less than 5% category
plot<-transform_sample_counts(m.aa, function(x) 100 * x/sum(x)) glom <- tax_glom(plot, taxrank = 'class') glom # should list taxa as phyla data_glom<- psmelt(glom) # create dataframe from phyloseq object
data_glom$class <- as.character(data_glom$class) #convert to character
data_glom$class[data_glom$Abundance < 0.05] <- "< 5% abundance"
Count = length(unique(data_glom$class)); Count
unique(data_glom$class) #add into line below
data_glom$class <- factor(data_glom$class, levels = c("Bacilli", "Clostridia", "Gammaproteobacteria", "Bacteroidia","Betaproteobacteria", "Actinobacteria", "Flavobacteriia", "Erysipelotrichi",
"Coriobacteriia", "Fusobacteriia", "Epsilonproteobacteria", "Alphaproteobacteria" ,
"Spirochaetes", "< 5% abundance"))
spatial_plot <- ggplot(data_glom, aes(x=Sample, y=Abundance, fill=class)) + facet_grid(~decomp.stage, scales = "free")
spatial_plot + geom_bar(aes(), stat="identity", position="stack") + labs (title = "Abdominal Area", x="Stage of Decomposition", y= "Relative Abundance", ) + scale_fill_manual(values = c("springgreen3","royalblue4","gold1","darkorchid","tomato3","deeppink","firebrick","saddlebrown","cyan2","darkseagreen2","salmon2", "lemonchiffon1","darkorange2","gray47","navy","plum","lightblue4","black","cornsilk2"))+ theme(legend.position="bottom", plot.title = element_text(hjust=0.5)) + guides(fill=guide_legend(nrow=4))
Hi!
I have a taxonomic table with the read counts of 14 samples mapped down to the species level. Below is a small part of this table:
These samples are grouped in 4 different categories based on their locations.
What I would like to do is to plot a barplot that shows the the changes in the 10 most abundant Classes in my data.
The first thing I do is to normalise my counts like so:
Then following a tutorial I found online I do the following:
Nonetheless if I understand correctly this code it just keeps my 300 most abundant species which in my case correspond to ~15 Classes.
Plotting this using the following command:
plot_bar(pseq, fill="genus_10") + facet_wrap(as.formula(paste("~", column)), scales="free_x", nrow=1) + geom_bar(aes(color=genus_10, fill=genus_10), stat="identity", position="stack", show.legend = FALSE) + theme(axis.text.x=element_blank()) + theme( legend.title = element_text(size = 12), legend.text = element_text(size = 8) )
Gives me that figure:
Nonetheless this is not what I want to do because y axis is not relative abundance, I have not actually the 10 most predominant classes but the classes that correspond to my 300 most predominant species, and ideally I would like to have the NA gone.
I was able to summarize my groups using merge_samples and create percentages using the following
To produce using the following plot function the following:
plot_bar(pseq_fraction, fill = "genus_10") + geom_bar(aes(color=genus_10, fill=genus_10), stat="identity", position="stack")
So my main question is if there is a way to correctly find my top 10 predominant Classes based on all my species and plot those in a plot similar to my last one? Would it be possible to remove the percentage of the NA in the same plot as well?
Thank you in advance and apologies for the long post.
Best Leonardos