Closed duttashi closed 5 years ago
Features to Look For
Some things to keep an eye out for when looking at data on a numeric variable:
Histograms
Histogram Basics
Histograms in R
There are many ways to plot histograms in R:
the hist function in the base graphics package;
truehist in package MASS;
histogram in package lattice;
geom_histogram in package ggplot2.
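As a rough illustration, the four functions listed above can be tried on the same variable. This sketch assumes the geyser data from MASS (the data set used for the density plots later in these notes) and assumes that MASS, lattice, and ggplot2 are installed. The choice of `bins = 30` is arbitrary.

```r
library(MASS)      # provides the geyser data and truehist()
library(lattice)   # provides histogram()
library(ggplot2)   # provides geom_histogram()

# Base graphics: counts on the vertical axis by default
hist(geyser$duration)

# MASS: density scale on the vertical axis by default
truehist(geyser$duration)

# lattice: formula interface; lattice objects are drawn when printed
lat_hist <- histogram(~ duration, data = geyser)
print(lat_hist)

# ggplot2: bins is set explicitly (an arbitrary choice for this sketch)
gg_hist <- ggplot(geyser) + geom_histogram(aes(x = duration), bins = 30)
print(gg_hist)
```

The four calls differ mainly in their default vertical scale and in how the plot is built up, so the same data can look different until the scale and bins are set to match.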
Superimposing a Density
A histogram can be used to compare the data distribution to a theoretical model, such as a normal distribution. This requires using a density scale for the vertical axis.
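A minimal sketch of this comparison in base graphics, assuming the geyser data from MASS and a normal model fitted with the sample mean and standard deviation (both are choices made for this example, not something the notes prescribe):

```r
library(MASS)  # provides the geyser data

dur <- geyser$duration
m <- mean(dur)   # sample mean, used as the normal mean
s <- sd(dur)     # sample standard deviation, used as the normal sd

# freq = FALSE puts the vertical axis on the density scale,
# so the bar areas sum to 1 and can be compared to a density curve
h <- hist(dur, freq = FALSE, main = "Geyser duration with normal density")

# Overlay the fitted normal density on the existing histogram
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "red")
```

With a count scale (`freq = TRUE`) the curve would sit near zero at the bottom of the plot, which is why the density scale is needed. For these data the overlay mostly shows how poorly a single normal curve fits a two-humped distribution.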
Density Plots
Density Plot Basics
Using base graphics, a density plot of the geyser duration variable with the default bandwidth:
library(MASS)  # provides the geyser data
plot(density(geyser$duration))
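Since the plot above relies on the default bandwidth, it can help to see how the estimate changes when the bandwidth is varied. The sketch below uses the `adjust` argument of `density`, which multiplies the default bandwidth. The factors 0.5 and 2 are arbitrary values picked for illustration.

```r
library(MASS)  # provides the geyser data

dur <- geyser$duration

d_default <- density(dur)                # default bandwidth
d_narrow  <- density(dur, adjust = 0.5)  # half the default bandwidth
d_wide    <- density(dur, adjust = 2)    # twice the default bandwidth

# Set ylim from the narrow estimate, which is expected to have the tallest peaks
plot(d_default, ylim = range(0, d_narrow$y),
     main = "Effect of bandwidth on the density estimate")
lines(d_narrow, col = "red")
lines(d_wide, col = "blue")
legend("topright", legend = c("default", "adjust = 0.5", "adjust = 2"),
       col = c("black", "red", "blue"), lty = 1)
```

A smaller bandwidth follows the data more closely and looks rougher. A larger one is smoother but can blur features such as separate modes.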
Grouping and Faceting
Both ggplot and lattice make it easy to show multiple densities for different subgroups in a single plot. lattice uses the groups argument:
library(lattice)
densityplot(~ yield, groups = site, data = barley)
In ggplot you can map the group variable to an aesthetic, such as color:
library(ggplot2)
ggplot(barley) + geom_density(aes(x = yield, color = site))
Using fill and alpha can also be useful:
ggplot(barley) + geom_density(aes(x = yield, fill = site), alpha = 0.2)
Often a more effective approach is to use the idea of small multiples, collections of charts designed to facilitate comparisons. For this we can use the lattice package. Lattice uses the term lattice plots or trellis plots. These plots are specified using the | operator in a formula:
densityplot(~ yield | site, data = barley)
Comparison is facilitated by using common axes.
These ideas can be combined:
densityplot(~ yield | site, groups = year, data = barley, auto.key = TRUE)
ggplot, by contrast, uses the notion of faceting:
ggplot(barley) + geom_density(aes(x = yield)) + facet_wrap(~site)
Again this can be combined with the color aesthetic:
ggplot(barley) + geom_density(aes(x = yield, color = year)) + facet_wrap(~site)
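The ggplot examples use the barley data, which ships with lattice rather than ggplot2. A self-contained version of the last plot, loading the data without attaching lattice, might look like the sketch below. Loading the data this way is just one option, and the object name `p` is chosen here for illustration.

```r
library(ggplot2)

# barley lives in the lattice package; load only the data set
data(barley, package = "lattice")

# One panel per site, with one density curve per year inside each panel
p <- ggplot(barley) +
  geom_density(aes(x = yield, color = year)) +
  facet_wrap(~ site)
print(p)
```

By default `facet_wrap` shares both axes across panels, which matches the point made earlier about common axes.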
A collection of self-curated notes to understand data visualization techniques.