edunford / tidysynth

A tidy implementation of the synthetic control method in R

How to use `tidysynth` functions in a custom function? #12

Closed by etiennebacher 2 years ago

etiennebacher commented 2 years ago

Hello, thank you for this package!

I have a question about embedding tidysynth functions in a custom function. For example, suppose that I define custom_scm() that runs the tidysynth workflow but allows the user to provide a specific outcome:

custom_scm <- function(outcome) {

  smoking %>%
    synthetic_control(outcome = outcome, 
                      unit = state, 
                      time = year,
                      i_unit = "California", 
                      i_time = 1988,
                      generate_placebos = TRUE
    ) %>%
    generate_predictor(time_window = 1980:1988,
                       ln_income = mean(lnincome, na.rm = TRUE)) %>%
    generate_weights(optimization_window = 1970:1988) %>% 
    generate_control()
} 

custom_scm("cigsale")

#> Error in `dplyr::select()`:
#> ! Column `outcome` not found in `.data`.
#> Run `rlang::last_error()` to see where the error occurred.

I also tried the embrace operator {{ }} used in dplyr programming, but it doesn't work either:

custom_scm <- function(outcome) {

  smoking %>%
    synthetic_control(outcome = {{outcome}}, 
                      unit = state, 
                      time = year,
                      i_unit = "California", 
                      i_time = 1988,
                      generate_placebos = TRUE
    ) %>%
    generate_predictor(time_window = 1980:1988,
                       ln_income = mean(lnincome, na.rm = TRUE)) %>%
    generate_weights(optimization_window = 1970:1988) %>% 
    generate_control()
} 

custom_scm("cigsale")

#> Error in `dplyr::select()`:
#> ! Column `"cigsale"` not found in `.data`.
#> Run `rlang::last_error()` to see where the error occurred.

The same thing happens if I use !!outcome. Do you know how to make this work?


Edit: the problem comes from synthetic_control(), and more particularly from the meta information where the name of the outcome is "outcome" instead of "cigsale":

x <- smoking %>%
    synthetic_control(outcome = outcome, 
                      unit = state, 
                      time = year,
                      i_unit = "California", 
                      i_time = 1988,
                      generate_placebos = TRUE
    )

x$.meta[[1]]

#> # A tibble: 1 x 5
#>   unit_index time_index treatment_unit treatment_time outcome
#>   <chr>      <chr>      <chr>                   <dbl> <chr>  
#> 1 state      year       California               1988 outcome
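
The .meta output is what one would expect if synthetic_control() captures its outcome argument unevaluated, as tidyverse-style functions typically do. The sketch below mimics the three attempts with a toy function. capture_name() and the wrapper_*() helpers are hypothetical stand-ins written for illustration; they are not tidysynth's actual internals.

```r
library(rlang)

# Hypothetical stand-in for a function that quotes its argument
# instead of evaluating it (an assumption about how the outcome is handled).
capture_name <- function(outcome) {
  as_label(enquo(outcome))
}

# Attempt 1: the argument is passed through as-is, so the quoted
# expression is the symbol `outcome`, not its value.
wrapper_bare <- function(outcome) capture_name(outcome = outcome)

# Attempt 2: embracing forwards what the caller supplied; when the caller
# supplies a string, a string constant is captured, not a column name.
wrapper_embrace <- function(outcome) capture_name(outcome = {{ outcome }})

# Attempt 3: sym() turns the string into a symbol and !! injects it,
# so the quoted expression is the column name itself.
wrapper_sym <- function(outcome) capture_name(outcome = !!sym(outcome))

wrapper_bare("cigsale")     # "outcome"
wrapper_embrace("cigsale")  # the label of a string constant
wrapper_sym("cigsale")      # "cigsale"
```

Only the last wrapper hands the quoting function a symbol that matches a column, which is consistent with the errors shown above.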

etiennebacher commented 2 years ago

Here's the solution, in case it helps someone:

library(tidysynth)
library(rlang)

custom_scm <- function(outcome) {
  smoking %>%
    synthetic_control(outcome = !!sym(outcome), 
                      unit = state, 
                      time = year,
                      i_unit = "California", 
                      i_time = 1988,
                      generate_placebos = TRUE
    ) %>%
    generate_predictor(time_window = 1980:1988,
                       mean(!!sym(outcome), na.rm = TRUE)) %>% 
    generate_weights(optimization_window = 1970:1988) %>% 
    generate_control()
} 

custom_scm("cigsale")
#> # A tibble: 78 x 11
#>    .id        .placebo .type .outcome .predictors .synthetic_cont~ .unit_weights
#>    <chr>         <dbl> <chr> <list>   <list>      <list>           <list>       
#>  1 Alabama           1 trea~ <tibble> <tibble>    <tibble>         <tibble>     
#>  2 Alabama           1 cont~ <tibble> <tibble>    <tibble>         <tibble>     
#>  3 Arkansas          1 trea~ <tibble> <tibble>    <tibble>         <tibble>     
#>  4 Arkansas          1 cont~ <tibble> <tibble>    <tibble>         <tibble>     
#>  5 California        0 trea~ <tibble> <tibble>    <tibble>         <tibble>     
#>  6 California        0 cont~ <tibble> <tibble>    <tibble>         <tibble>     
#>  7 Colorado          1 trea~ <tibble> <tibble>    <tibble>         <tibble>     
#>  8 Colorado          1 cont~ <tibble> <tibble>    <tibble>         <tibble>     
#>  9 Connectic~        1 trea~ <tibble> <tibble>    <tibble>         <tibble>     
#> 10 Connectic~        1 cont~ <tibble> <tibble>    <tibble>         <tibble>     
#> # ... with 68 more rows, and 4 more variables: .predictor_weights <list>,
#> #   .original_data <list>, .meta <list>, .loss <list>

Created on 2022-04-08 by the reprex package (v2.0.1)

This example shows that !!sym(outcome) can also be used in generate_predictor().
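
The same pattern can be tried outside tidysynth. The sketch below uses plain dplyr on a small made-up data frame (toy, mean_with_sym() and mean_with_pronoun() are hypothetical names, not part of any package) to show !!sym() inside a summary expression, next to dplyr's .data[[ ]] pronoun, which is the usual alternative when a column name arrives as a string. Whether .data[[outcome]] is also accepted by the tidysynth functions is not tested here.

```r
library(dplyr)
library(rlang)

# Toy data invented for illustration; this is not the smoking dataset.
toy <- tibble(
  state   = c("A", "A", "B", "B"),
  cigsale = c(100, 120, 90, 110)
)

# Convert the string to a symbol and inject it into the expression.
mean_with_sym <- function(data, outcome) {
  data %>%
    group_by(state) %>%
    summarise(avg = mean(!!sym(outcome), na.rm = TRUE), .groups = "drop")
}

# Look the column up by name through the .data pronoun.
mean_with_pronoun <- function(data, outcome) {
  data %>%
    group_by(state) %>%
    summarise(avg = mean(.data[[outcome]], na.rm = TRUE), .groups = "drop")
}

res_sym <- mean_with_sym(toy, "cigsale")
res_pronoun <- mean_with_pronoun(toy, "cigsale")
```

Both helpers return one row per state with the group mean of the column named by the string.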