ebenmichael / augsynth

Augmented Synthetic Control Method
MIT License

Questions about multisynth plot #70

Closed Germancampos closed 2 years ago

Germancampos commented 2 years ago

Hello everyone.

I have two questions, I would really appreciate if someone could help me.

I am estimating a staggered synthetic control. The multisynth plot function (for example, plot(ppool_syn_time_summ)) lets me plot the difference between the observed values of the treated units and the estimates of their counterfactuals, that is, it plots the treatment effect (please tell me if I'm wrong). In addition to that graph, I want to plot the two terms separately: a graph with two time series, one for the observed average of the treated units and another for the estimated average counterfactual. How could I do this with the augsynth package? Or how could I get the values of these two series so I can build the graph myself?

I would also like to know if there is any way to modify the aesthetics of the graphs produced by the multisynth plot function. I want to change the colors, the line widths, and the overall theme, among other things. Is there a way to edit augsynth plots in ggplot2?

I look forward to your reply, thank you very much.

etiennebacher commented 2 years ago

Hi, I second this. It would be very convenient to have a function that plots the trends of the treated unit(s) and of the synthetic control for single_augsynth() and multisynth().

@Germancampos regarding your second point, plot returns a ggplot object, meaning that you can then customize it as you want with other ggplot2 functions (and extensions). Here's an example coming from the multisynth vignette:

library(magrittr)
library(dplyr)
library(augsynth)
library(ggplot2)

data <- read.csv("https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/WGWMAV/3UHTLP", sep="\t")

data %>%
  filter(!State %in% c("DC", "WI"),
         year >= 1959, year <= 1997) %>%
  mutate(YearCBrequired = ifelse(is.na(YearCBrequired), 
                                 Inf, YearCBrequired),
         cbr = 1 * (year >= YearCBrequired)) -> analysis_df

ppool_syn <- multisynth(lnppexpend ~ cbr, State, year, 
                        nu = 0.5, analysis_df)

plot(summary(ppool_syn), levels = "Average", label = FALSE) +
  theme_dark() +
  geom_line(color = "red") +
  geom_point(color = "red") +
  labs(title = "This is my plot")

[Image: foo1, the customized plot with a dark theme and a red line and points]

Note however that the red line and points are simply drawn on top of the default black ones. This means that adding something like linetype = "dashed" in geom_line() will leave parts of the original line visible (see below). Since these lines and points are hardcoded in plot, I don't think it is possible to remove them.

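[Image: foo2, the same plot with a dashed red line, where the original black line shows through the gaps]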

ebenmichael commented 2 years ago

@etiennebacher that's right. I think you could remove plot elements from the ggplot object and add new ones, but at that point, you should make your own plot directly from the results.
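For anyone who does want to try removing the plot elements, here is a minimal sketch of one way it might be done. It uses a toy ggplot as a stand-in for the object returned by plot(), because the actual layers that augsynth adds (and their order) are an assumption here; inspect p$layers on the real object before relying on this. The data frame `df` and the names `p`, `p_new`, and `is_line_or_point` are made up for illustration.

```r
library(ggplot2)

# Toy stand-in for the ggplot object returned by plot(summary(ppool_syn), ...).
# The real object may contain different layers, so check p$layers first.
df <- data.frame(
  time = -5:5,
  estimate = c(0.01, -0.02, 0, 0.01, -0.01, 0, 0.05, 0.08, 0.10, 0.12, 0.15)
)
p <- ggplot(df, aes(x = time, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_line() +
  geom_point()

# Flag the layers whose geom is a line or a point
is_line_or_point <- vapply(
  p$layers,
  function(layer) inherits(layer$geom, c("GeomLine", "GeomPoint")),
  logical(1)
)

# Drop those layers, keeping everything else (here, the reference line)
p$layers <- p$layers[!is_line_or_point]

# Add restyled replacements; nothing black is left underneath the dashes
p_new <- p +
  geom_line(color = "red", linetype = "dashed") +
  geom_point(color = "red")
```

This filters layers by geom class rather than by position, so it should not depend on the order in which the layers were added. It does rely on ggplot2 internals, though, which is part of why building the plot directly from the results is the more robust route.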

@Germancampos for your first question about plotting on the outcome scale, you can do this by accessing the effect estimates directly. After running the summary function for either single_augsynth or multisynth, you can take the att element of the summary object to get the estimates. E.g. for the example above, summary(ppool_syn)$att will give a data frame of all the effect estimates. You can then merge this with the actual data to graph the outcome series and its synthetic control.

Here's how that looks for the kansas dataset and single_augsynth:

library(tidyverse)
library(augsynth)
data(kansas)

# fit augsynth and get effect estimates
syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas)
att <- summary(syn)$att

# combine with data
kansas %>%
  filter(state == "Kansas") %>%
  select(year_qtr, lngdpcapita) %>%
  inner_join(att, by = c("year_qtr"="Time")) %>%
  ggplot(aes(x = year_qtr)) +
    geom_line(aes(y = lngdpcapita)) +
    # synthetic control = observed outcome minus the estimated effect
    geom_line(aes(y = lngdpcapita - Estimate), lty = 2, color = "red")

That'll give you something like this

[Image: screenshot of the resulting plot]

A built-in plot like this could be added in the future. My only caution is that plotting the outcomes directly like this makes it more difficult to see how well the synthetic control fits the pre-treatment periods, since the fit is washed out by the overall trend in GDP.
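For the staggered (multisynth) case in the original question, the same idea might look roughly like the sketch below. It runs on toy data rather than a real fit. The `att` data frame is a hand-made stand-in for the unit-level rows of summary(fit)$att, and its column names (Time measured relative to each unit's treatment time, Level holding the unit name, Estimate) are an assumption to check against your own summary object. The `panel` data and the `treat_year` column are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy panel: two treated units with different (hypothetical) treatment years
panel <- data.frame(
  unit = rep(c("A", "B"), each = 4),
  year = rep(2000:2003, times = 2),
  y = c(1.0, 1.1, 1.5, 1.6, 2.0, 2.1, 2.2, 2.8),
  treat_year = rep(c(2002, 2003), each = 4)
)

# Hand-made stand-in for the unit-level rows of summary(fit)$att.
# Assumed columns: Time (relative to treatment), Level (unit), Estimate.
att <- data.frame(
  Time = c(-2, -1, 0, 1, -3, -2, -1, 0),
  Level = rep(c("A", "B"), each = 4),
  Estimate = c(0, 0, 0.3, 0.3, 0, 0, 0, 0.5)
)

# Align each unit on time relative to its own treatment, then recover the
# synthetic control as the observed outcome minus the estimated effect
outcomes <- panel %>%
  filter(is.finite(treat_year)) %>%  # drop never-treated units, if any
  mutate(Time = year - treat_year) %>%
  inner_join(att, by = c("unit" = "Level", "Time" = "Time")) %>%
  mutate(synthetic = y - Estimate)

# Simple average across treated units at each relative time
avg <- outcomes %>%
  group_by(Time) %>%
  summarise(observed = mean(y), synthetic = mean(synthetic), .groups = "drop")

# Two series: observed average (solid) and synthetic average (dashed red)
p_avg <- ggplot(avg, aes(x = Time)) +
  geom_line(aes(y = observed)) +
  geom_line(aes(y = synthetic), linetype = "dashed", color = "red")
```

The average here is an unweighted mean over whichever units are observed at each relative time, so it need not match the "Average" level that multisynth reports. Treat it as a starting point rather than a reproduction of the package's own averaging.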