Open aspina7 opened 3 years ago
Another way of doing this is to modify the call to aes()
and add groupings based on the individual row number in the data.frame
. We start with a function that takes a ggplot geom_histogram object as input, adds groupings to the data and to the aes:
geom_squares <- function(plot) {
# Check if the plot uses geom_histogram
if (!("GeomBar" %in% class(plot$layers[[1]]$geom))) {
stop("The first layer of the input plot must be geom_histogram.")
}
# Make a deep copy of the plot:
plot2 = unserialize(serialize(plot, NULL))
# Add 'grouping' column to the plot data
plot2$data$grouping = as.numeric(row.names(plot2$data))
# Locate mapping and update aesthetics to include group = grouping:
if(!is.null(plot2$mapping)){
# Modify the base aesthetics:
plot2$mapping = modifyList(
plot2$mapping,
aes(group = grouping))
} else {
# Modify the aesthetics in layer 1 (call to geom_histogram()):
plot2$layers[[1]]$mapping = modifyList(
plot2$layers[[1]]$mapping,
aes(group = grouping))
}
# Add white borders to the squares:
plot2$layers[[1]]$aes_params = c(
plot2$layers[[1]]$aes_params,
colour = 'white'
)
# Make sure closure is appropriate for epicurve:
plot2$layers[[1]]$stat_params = c(
plot2$layers[[1]]$stat_params,
closed = 'left'
)
# Rebuild the plot
built = ggplot2::ggplot_build(plot2)
# Return the modified plot
return(plot2)
}
This function will work irrespective of whether the aesthetics are defined in the original call to ggplot()
or in the first layer (call to geom_histogram()
). It will also keep any other parameters that the user has added to the input plot (it just adds the groups and white borders).
Below some example data to test it on:
# Load required libraries:
pacman::p_load(tidyverse)
# Set seed for reproducibility:
set.seed(123)
# Set start date:
start_date <- as.Date("2024-01-01")
# Set end date:
end_date <- as.Date("2024-04-01")
# Create example data.frame:
data <- data.frame(
onset_date = sample(seq(start_date, end_date, by = "day"),
100,
replace = TRUE),
category = sample(c("A", "B"),
100,
replace = TRUE))
# Create epicurve_breaks
epicurve_breaks <- seq.Date(
from = start_date,
to = end_date,
by = "week")
To apply the function:
# Scenario 1: x and fill defined in ggplot() call
p1 <- ggplot(data,
mapping = aes(x = onset_date, fill = category)) +
geom_histogram(breaks = epicurve_breaks)
# Add squares to scenario 1 plot:
p1squared <- geom_squares(p1)
# Print the plot:
p1squared
# Scenario 2: x and fill defined in geom_histogram() layer
p2 <- ggplot(data) +
geom_histogram(mapping = aes(x = onset_date, fill = category),
breaks = epicurve_breaks)
# Add squares to scenario 2 plot:
p2squared <- geom_squares(p2)
# Print the plot:
p2squared
The only problem is the stacks are not ordered by fill column any more, see below:
@aspina7
Fixed the stack order problem - just needed to do fct_reorder()
inside the group
argument:
geom_squares <- function(plot) {
# Check if the plot uses geom_histogram
if (!("GeomBar" %in% class(plot$layers[[1]]$geom))) {
stop("The first layer of the input plot must be geom_histogram.")
}
# Make a deep copy of the plot to avoid changing the original:
plot2 = unserialize(serialize(plot, NULL))
# Add 'grouping' column to the plot data
plot2$data$grouping = as.numeric(row.names(plot2$data))
# Locate mapping and update aesthetics to include group = grouping:
if("x" %in% names(plot2$mapping)){
# Modify the base aesthetics:
plot2$mapping = modifyList(
plot2$mapping,
aes(group = fct_reorder( # stack in order of fill column
factor(grouping),
!!sym(rlang::as_name(plot2$mapping$fill)))))
} else {
# Modify the aesthetics in layer 1 (call to geom_histogram()):
plot2$layers[[1]]$mapping = modifyList(
plot2$layers[[1]]$mapping,
aes(group = fct_reorder( # stack in order of fill column
factor(grouping),
!!sym(rlang::as_name(plot2$layers[[1]]$mapping$fill)))))
}
# Add white borders to the squares:
plot2$layers[[1]]$aes_params = c(
plot2$layers[[1]]$aes_params,
colour = 'white'
)
# Make sure closure is appropriate for epicurve:
plot2$layers[[1]]$stat_params = c(
plot2$layers[[1]]$stat_params,
closed = 'left'
)
# Rebuild the plot
built = ggplot2::ggplot_build(plot2)
# Return the modified plot
return(plot2)
}
which gives:
Another edit - this one needed as previous code did not work for some character vectors. Converting to a factor and then as.numeric makes the sorting of the stacks work more stably for different cases (assumption is that fill variable will be character or factor).
geom_squares <- function(plot) {
# Check if the plot uses geom_histogram
if (!("GeomBar" %in% class(plot$layers[[1]]$geom))) {
stop("The first layer of the input plot must be geom_histogram.")
}
# Make a deep copy of the plot to avoid changing the original:
plot2 = unserialize(serialize(plot, NULL))
# Add 'grouping' column to the plot data
plot2$data$grouping = as.numeric(row.names(plot2$data))
# Locate mapping and update aesthetics to include group = grouping:
if("x" %in% names(plot2$mapping)){
# Modify the base aesthetics:
plot2$mapping = modifyList(
plot2$mapping,
aes(group = fct_reorder( # stack in order of fill column
.f = factor(grouping),
.x = as.numeric(
factor(!!sym(rlang::as_name(plot2$mapping$fill)))))))
} else {
# Modify the aesthetics in layer 1 (call to geom_histogram()):
plot2$layers[[1]]$mapping = modifyList(
plot2$layers[[1]]$mapping,
aes(group = fct_reorder( # stack in order of fill column
.f = factor(grouping),
.x = as.numeric(
factor(!!sym(rlang::as_name(plot2$layers[[1]]$mapping$fill)))))))
}
# Add white borders to the squares:
plot2$layers[[1]]$aes_params = c(
plot2$layers[[1]]$aes_params,
colour = 'white'
)
# Make sure closure is appropriate for epicurve:
plot2$layers[[1]]$stat_params = c(
plot2$layers[[1]]$stat_params,
closed = 'left'
)
# Rebuild the plot
built = ggplot2::ggplot_build(plot2)
# Return the modified plot
return(plot2)
}
Per recent discussions:
Problem statement:
Function to add squares to an existing histogram representing individual or N cases - often requested to replicate the epicurve shown at the top of the Epidemiologist R handbook Epidemic curves chapter (32).
Function should include:
p1 + geom_squares()
Issues to resolve with current (AM's) proposal:
fct_reorder()
and change the class of the group and fill variables to keep squares in correct order Simplified approach from @aspina7 to extract the x axis from the existing plot data.frame and use that to create the groups:
df <- ggplot_build(plot)$data[[1]]
# define squares for plotting over
squaredf <- df[rep(seq.int(nrow(df)), df[["count"]]), ]
squaredf[["count"]] <- 1
squaredf <- mutate(squaredf,
x = as.Date(x, origin = "1970-01-01"))
Converting function to ggplot2 layer:
As explained here this requires two steps:
aes
are selected.Note: documentation can be inherited from ggplot2 and added to, so the idea would be:
ggplot2::layer()
and export the functionThis should allow users to use the plus +
sign to add_squares()
to an existing plot. It makes more sense to call it add...
than geom...
because rather than creating a new geom it is just adding something to an existing one.
Adapted from {incidence} but makes it possible to use directly with {ggplot2} maintaining the use of scale_x_date() functions. Seems to work dates or month (presumably works with whatever geom_histogram() is fed to it.... but need to add tests.
Also need to re-structure so can used it with the ggplot2 + rather than %>%