carloscinelli / sensemakr

Suite of sensitivity analysis tools for OLS
https://carloscinelli.com/sensemakr/
83 stars 15 forks source link

From the output to the plot with ggplot2? #54

Open BenedicteColnet opened 1 year ago

BenedicteColnet commented 1 year ago

Dear developers,

First, thank you for the really nice package and vignette!! It helped me a lot and ease the transfer to other communities.

I have a question regarding plot. Would it be possible to do the plot from the output of the main function and then using ggplot? From the output it is clear to me how to have the x and y axis and the different points, but I am not 100% about how the contour plot should be done?

Why do I want to use ggplot? Just because it seems more flexible for colors, font size, labels, and so on. :)

Thank you in advance for your help!! Best, Benedicte

lorenzo-crippa commented 2 weeks ago

Hi @BenedicteColnet ! I ended up on this issue this morning while trying to do the same. I don't know if you're still interested in this but I managed to do it. Here is a reprex that should be getting what you need.

First, let's setup the session with necessary packages and the data used by the authors in the package vignette. Then, we run an lm model and we just perform our sensitivity analysis as we would normally do.

# Setup & data input ----
library(sensemakr)
library(tidyverse)
library(ggrepel) # for labelling points without overlapping
library(metR) # for labelling contour lines

data("darfur")

# Estimate model ----
model <- lm(peacefactor ~ directlyharmed + age + farmer_dar + herder_dar +
              pastvoted + hhsize_darfur + female + village, data = darfur)

# Sensitivity analysis ----
sensitivity <- sensemakr(model = model, 
                         treatment = "directlyharmed",
                         benchmark_covariates = "female",
                         kd = 1:3)

Next, we save the output of plot(sensitivity) in an object so that we can access its content and use it for generating our ggplot2 version of the contour plot:

## Extract data from sensemakr object ----
# save plot output in an object to access what's inside:
dat <- plot(sensitivity)

# save dataset on the contour plot:
contour <- dat$value %>% # this is the matrix with the contour values. 
  # first, we turn it into df:
  as.data.frame() %>%
  # add row and column names (x and y values):
  `rownames<-`(dat$r2dz.x) %>%
  `colnames<-`(dat$r2yz.dx) %>%
  rownames_to_column("x") %>%
  # pivot longer:
  pivot_longer(cols = -"x",
               names_to = "y",
               values_to = "estim") %>%
  # make x and y numeric:
  mutate(x = as.numeric(x),
         y = as.numeric(y))

Finally we can do create our ggplot2 contour plot with the data we stored from sensemakr. See an example below that mirrors quite closely the base R plot output:

# Contour plot ----
# first, set the desired breaks among contour lines. 
# Let's follow what the default plot gives, 
# i.e. breaks of 0.05 between -0.2 and 0.05:
breaks <- seq(-0.2, 0.05, by = 0.05)

p <- ggplot() +
  # add layer on the adjusted estimates:
  geom_point(data = dat$bounds,
             mapping = aes(x = r2dz.x, y = r2yz.dx),
             shape = 23, fill = "red", size = 2) +
  # add labels including the adjusted estimate:
  geom_text_repel(data = dat$bounds %>%
                    cbind(estim = sensitivity$bounds$adjusted_estimate),
                  mapping = aes(x = r2dz.x, y = r2yz.dx,
                                label = paste0("Adjusted: ", bound_label,
                                               "\n(", 
                                               # the following controls the number of digits to display:
                                               sprintf("%.2f", estim), 
                                               ")")),
                  seed = 21) +
  # add layer with the unadjusted estimates:
  geom_point(data = sensitivity$sensitivity_stats,
             mapping = aes(x = 0, y = r2yd.x),
             shape = 24, fill = "black", size = 2) +
  # ...and the label:
  geom_text_repel(data = sensitivity$sensitivity_stats,
                  mapping = aes(x = 0, y = r2yd.x,
                                label = paste0("Unadjusted\n(", sprintf("%.2f", estimate), ")")),
                  seed = 21) +
  # now add the contours. Note: we map color and pattern of lines 
  # onto whether the line is at 0, to mirror the original plot. But
  # you could map them onto other substantively meaningful values, too
  geom_contour(data = contour,
               mapping = aes(x = x, y = y, z = estim, # these are the 3D coordinates
                             color = factor(after_stat(level) == 0),
                             linetype = factor(after_stat(level) == 0)),
               breaks = breaks, # the splits defined above
               show.legend = FALSE) +
  # add little labels on the lines that tell us what each contour line is:
  geom_label_contour(data = contour,
                     mapping = aes(x = x, y = y, z = estim,
                                   # map the color of the labels too based on whether they're at 0:
                                   color = factor(after_stat(level) == 0)),
                     breaks = breaks, # same splits as above
                     # next are some arguments for controlling aesthetics:
                     label.size = 0, # I don't want a label box
                     skip = 0, # don't skip any line while labelling
                     show.legend = FALSE) +
  xlab(expression(Partial~R^2~of~confounders~with~the~treatment)) +
  ylab(expression(Partial~R^2~of~confounders~with~the~outcome)) +
  scale_color_manual("", values = c("grey70", "red")) +
  scale_linetype_manual("", values = c("solid", "longdash")) +
  theme_minimal() +
  theme(axis.line = element_line())

Compare the two:

plot(sensitivity)
p # ta-dah