Open dicook opened 9 years ago
These seem to be ggplot2 graphics: we do not have them in vegan. @gavinsimpson develops a supplementary package ggvegan for ggplot2 graphics for vegan. Some of these functions could find their home in that package.
Right, but I don't think there is anything here that would stop us creating equivalent plots in base graphics or lattice, both of which we do have in vegan. It doesn't look as if @dicook is doing anything too fancy, such as faceting, that would be difficult to do nicely in base graphics.
(I think @dicook just used what she is most familiar with to describe the suggested plots - I don't think the intention was for us to take the examples verbatim into vegan.)
I don't think we would want to have dependencies on all those extra packages just for a few diagnostic plots. But ggvegan could certainly house facsimiles of these plots. I'll work on base graphics versions as well. Perhaps the most difficult issue will be to come up with suitable function names for the data generation/prep steps and/or plot functions...
Thoughts or suggestions for function names?
I once considered the dotchart that was @dicook's last example:
This used unlabeled species and sites, because I wanted to keep the ordination scores instead of their ranks, but rank-based plots with labelled rows and columns could be easy, too, and would give a plot similar to qplot(Site, Species, data=d.g, size=Count) + scale_size_area(). In fact, tabasco is a tile alternative to @dicook's alternative plot (but it has no option of automatically boxing occurrences to the upper right corner). Privately, I called my dotchart function ordidotchart.
The boxplot would be easy in conventional graphics, too.
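As a rough illustration of that point, here is a minimal base graphics sketch of the species boxplot, assuming only vegan (for the dune data) is installed. Ordering species by median and then by total count is an assumption made here to break ties among the many species with a median of zero; it is not taken from the original proposal.

```r
library(vegan)
data(dune)

# Order species by median abundance, breaking ties by total count
# (the tie-breaking rule is an assumption of this sketch).
med <- apply(dune, 2, median)
tot <- colSums(dune)
ord <- order(med, tot)

# Horizontal boxplots, one per species, with readable species labels.
op <- par(mar = c(4, 6, 1, 1))
boxplot(dune[, ord], horizontal = TRUE, las = 1, cex.axis = 0.6,
        xlab = "Count")
par(op)
```

The same call on t(dune), converted to a data frame, would give the per-site version.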
@dicook, the vegan graphical repertoire is far from complete. We have plot types that we are used to working with, but these kinds of boxplots and barplots are not ones I'm used to in a multivariate context, and I'm not quite sure what I should read from them. If code + documentation for such plots is submitted, we will certainly consider including it in vegan. However, ggplot2 graphics will not be included in vegan; they should go to ggvegan (where @gavinsimpson is the arbiter), and strong reasons are needed for adding new dependencies to vegan.
The last one of the suggested new plots is something that resembles tabasco (a more colourful sister to vegemite). The tabasco plots build on heatmap, which is really rigid, and it seems that ggplot2-based functions could be much more versatile.
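For readers who have not met these two functions, the following sketch shows how they might be called on the dune data. Ordering rows and columns by a DCA (decorana) result is one option chosen here for illustration, not something prescribed in this thread.

```r
library(vegan)
data(dune)

# Use a DCA ordination to order sites and species (an illustrative choice).
ord <- decorana(dune)

# Compact text table of the community data, ordered by the ordination.
vegemite(dune, ord)

# Heatmap-style display of the same ordered table.
tabasco(dune, ord)
```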
library(vegan)
The dune data set is not ideal, but since it is used for illustration purposes in the package, we will use it here to show plots that would be good to add to the package.
data(dune)
dim(dune)
dune[1,]
dune[,1]
Load other libraries
library(ggplot2)
library(dplyr)
library(tidyr)
Plot the marginals for both Species and Site
The purpose is to examine how abundances differ across Species and Sites. Any species or sites with very small counts might be better removed before analysis.
# Reshape to long format (one row per Site x Species) and order species by median count
dune$Site <- as.character(1:20)
d.g <- gather(dune, Species, Count, -Site)
d.m <- summarise(group_by(d.g, Species), med = median(Count, na.rm = TRUE))
d.g$Species <- factor(d.g$Species, levels = d.m$Species[order(d.m$med)])
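If the extra dependencies are a concern, the same long format can be produced in base R. This is only a sketch: the name d.long is introduced here for illustration, and it starts from a fresh copy of dune (before the Site column is added) so that every column is numeric.

```r
library(vegan)
data(dune)  # fresh copy: 20 sites (rows) by 30 species (columns)

# as.table() on the numeric matrix followed by as.data.frame() yields
# one row per site x species combination.
d.long <- as.data.frame(as.table(as.matrix(dune)),
                        responseName = "Count",
                        stringsAsFactors = FALSE)
names(d.long)[1:2] <- c("Site", "Species")

head(d.long)
```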
Use a boxplot on the raw scale.
qplot(Species, Count, data=d.g, geom="boxplot") + coord_flip()
# Reorder species by total count for the bar chart
d.s <- summarise(group_by(d.g, Species), sum = sum(Count, na.rm = TRUE))
d.g$Species <- factor(d.g$Species, levels = d.s$Species[order(d.s$sum)])
Show as a barchart of counts across sites
qplot(Species, weight=Count, data=d.g, geom="bar") + coord_flip()
# Order sites by median count
d.m2 <- summarise(group_by(d.g, Site), med = median(Count, na.rm = TRUE))
d.g$Site <- factor(d.g$Site, levels = d.m2$Site[order(d.m2$med)])
Use a boxplot on the raw scale.
qplot(Site, Count, data=d.g, geom="boxplot") + coord_flip()
# Reorder sites by total count for the bar chart
d.s2 <- summarise(group_by(d.g, Site), sum = sum(Count, na.rm = TRUE))
d.g$Site <- factor(d.g$Site, levels = d.s2$Site[order(d.s2$sum)])
Show as a barchart of counts across species
qplot(Site, weight=Count, data=d.g, geom="bar") + coord_flip()
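The two bar charts of totals need nothing beyond base graphics. A minimal sketch, assuming a fresh copy of dune without the added Site column; the names sp.tot and site.tot are introduced here for illustration only.

```r
library(vegan)
data(dune)

# Total count per species and per site, sorted so the largest bar is on top.
sp.tot   <- sort(colSums(dune))
site.tot <- sort(rowSums(dune))

op <- par(mfrow = c(1, 2), mar = c(4, 6, 2, 1))
barplot(sp.tot, horiz = TRUE, las = 1, cex.names = 0.6,
        xlab = "Total count", main = "Species")
barplot(site.tot, horiz = TRUE, las = 1, cex.names = 0.7,
        xlab = "Total count", main = "Site")
par(op)
```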
Use a fluctuation diagram to show species by count
The purpose is to distinguish species that are prevalent everywhere from specialists, and sites that host only particular species from sites where most species can be found; that is, to look at associations between sites and species. These plots can help to digest and diagnose the MDS results.
qplot(Site, Species, data=d.g, size=Count) + scale_size_area()
qplot(Site, Species, data=d.g, fill=Count, geom="tile")
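A base graphics approximation of the fluctuation diagram is sketched below. It assumes a fresh copy of dune, orders sites and species by their totals (an illustrative choice), and scales the symbol size by the square root of the count so that symbol area is roughly proportional to abundance, in the spirit of scale_size_area(). The 0.7 multiplier is an arbitrary cosmetic choice.

```r
library(vegan)
data(dune)

# Sites in rows, species in columns, both ordered by their totals.
m <- as.matrix(dune)
m <- m[order(rowSums(m)), order(colSums(m))]

# Grid positions and counts of the non-zero cells only.
present <- m > 0
site <- row(m)[present]
sp   <- col(m)[present]
cnt  <- m[present]

op <- par(mar = c(4, 6, 1, 1))
plot(site, sp, pch = 16, cex = 0.7 * sqrt(cnt),
     xlim = c(1, nrow(m)), ylim = c(1, ncol(m)),
     axes = FALSE, xlab = "Site", ylab = "")
axis(1, at = seq_len(nrow(m)), labels = rownames(m), las = 2, cex.axis = 0.7)
axis(2, at = seq_len(ncol(m)), labels = colnames(m), las = 1, cex.axis = 0.6)
box()
par(op)
```

A tile version could be drawn from the same ordered matrix with image(), which is closer to what tabasco already provides.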