MatthieuStigler / Misconometrics

A bunch of small R scripts useful in econometrics
GNU General Public License v3.0

Running dec_covar by specifying formula with a function #4

Closed: mlinegar closed this issue 3 years ago

mlinegar commented 3 years ago

@MatthieuStigler is it possible to run dec_covar with a generic formula (specified as a string) from inside a function? I have been unable to figure out why the examples dec_test_1 and dec_test_2 below don't work. I can provide the output of sessionInfo() if that would be helpful; I have updated all packages and tried on two separate systems, and I run into the same errors on both.

The following is a reprex that runs successfully when specifying a particular formula, but not when trying to supply the formula as a function input:

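library(data.table)
library(lfe)
# dec_covar() is sourced from the Misconometrics repository
devtools::source_url("https://raw.githubusercontent.com/MatthieuStigler/Misconometrics/master/Gelbach_decompo/dec_covar.R")

n <- 1000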
test_data <- data.table(y = rnorm(n), x0 = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n), x5 = rnorm(n), z = as.factor(floor(runif(n, 0, 10))))

# works when running directly
full_model_1 <- felm(y ~ x0 + x1 + x2 + x3 + x4 + x5 | z, data = test_data)
dec <- dec_covar(object = full_model_1, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)

# works when supplying string input to formula
full_model_2 <- felm(as.formula("y ~ x0 + x1 + x2 + x3 + x4 + x5 | z"), data = test_data)
dec2 <- dec_covar(object = full_model_2, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)

# works when specifying a particular string formula inside a function
dec_test_0 <- function(){
  print("Running felm")
  full_model <- felm(as.formula("y ~ x0 + x1 + x2 + x3 + x4 + x5 | z"), data = test_data)
  print("Running dec_covar")
  dec <- dec_covar(object = full_model, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)
  dec
}
dec_test_0()

# doesn't work when the formula string is passed in as a function argument
dec_test_1 <- function(formula_str){
  print("Running felm")
  full_model <- felm(as.formula(formula_str), data = test_data)
  print("Running dec_covar")
  dec <- dec_covar(object = full_model, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)
  dec
}
dec_test_1("y ~ x0 + x1 + x2 + x3 + x4 + x5 | z")

# also doesn't work when specifying the environment to the formula inside a function
dec_test_2 <- function(formula_str){
  .env <- environment() ## identify the environment 
  my_formula <- as.formula(formula_str, env = .env)
  print("Running felm")
  full_model <- felm(my_formula, data = test_data)
  print("Running dec_covar")
  dec <- dec_covar(object = full_model, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)
  dec
}
dec_test_2("y ~ x0 + x1 + x2 + x3 + x4 + x5 | z")

MatthieuStigler commented 3 years ago

Hi,

It seems to be an issue with update.felm. Here is a reprex showing where the issue comes from. Hopefully this helps you figure out what the problem is or find a workaround!

library(lfe)
#> Loading required package: Matrix
devtools::source_url("https://raw.githubusercontent.com/MatthieuStigler/Misconometrics/master/Gelbach_decompo/dec_covar.R")
#> ℹ SHA-1 hash of file is ded942a53e385beda6815fb39a5417c6a2c750e1

n = 1000
test_data <- data.frame(y = rnorm(n), x0 = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n), x5 = rnorm(n), z = as.factor(floor(runif(n, 0, 10))))
formula_str = "y ~ x0 + x1 + x2 + x3 + x4 + x5 | z"

## Doesn't work: the formula is passed to felm() through the variable f
f <- as.formula(formula_str)
full_model <- felm(f, data = test_data)
dec_covar(object = full_model, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)
#> Error: object of type 'symbol' is not subsettable

## Works: the formula is built inside the felm() call
full_model2 <- felm(as.formula(formula_str), data = test_data)
dec_covar(object = full_model2, var_main = "x0", format = "long", add_coefs = TRUE, conf.int = TRUE)
#> # A tibble: 5 × 11
#>   covariate    beta_K beta_K_low beta_K_high variable    gamma gamma_low
#>   <chr>         <dbl>      <dbl>       <dbl> <chr>       <dbl>     <dbl>
#> 1 x1         0.000462    -0.0656      0.0665 x0       -0.0196    -0.0803
#> 2 x2         0.00797     -0.0545      0.0704 x0       -0.0285    -0.0927
#> 3 x3        -0.00517     -0.0687      0.0584 x0        0.0310    -0.0321
#> 4 x4         0.0332      -0.0328      0.0991 x0       -0.00923   -0.0700
#> 5 x5         0.0175      -0.0449      0.0799 x0        0.0517    -0.0125
#> # … with 4 more variables: gamma_high <dbl>, delta <dbl>, beta_var_base <dbl>,
#> #   beta_var_full <dbl>

## underlying issue:
update(full_model, as.formula("y ~ x1 | z"))
#> Error: object of type 'symbol' is not subsettable
update(full_model2, as.formula(". ~ .-x5 | ."))
#>         x0         x1         x2         x3         x4 
#> -0.0127555  0.0003232  0.0086616 -0.0061515  0.0336243

## under-underlying issue:
Formula::as.Formula(full_model$call$formula)
#> Error: object of type 'symbol' is not subsettable
Formula::as.Formula(full_model2$call$formula)
#> y ~ x0 + x1 + x2 + x3 + x4 + x5 | z

Created on 2021-07-26 by the reprex package (v2.0.0)
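
The failing and working fits in the reprex differ only in what ends up in `$call$formula`. The sketch below illustrates that mechanism with `stats::lm` as a stand-in. This is an assumption: `lm` records its call with `match.call()`, and `felm` appears to do the same, but the sketch does not run `felm` itself. The names `toy`, `fit_symbol`, `fit_call` and `fit_inline` are hypothetical. It shows that a formula passed through a variable is stored as a bare symbol, that a formula built inline is stored as a call, and that `bquote()` can splice the formula object itself into the call.

```r
# Toy data; all names here are illustrative only.
set.seed(1)
toy <- data.frame(y = rnorm(50), x0 = rnorm(50), x1 = rnorm(50))
formula_str <- "y ~ x0 + x1"

# 1. Formula passed through a variable: the stored call holds the symbol `f`,
#    not the formula, so code that inspects $call$formula sees only a name.
f <- as.formula(formula_str)
fit_symbol <- lm(f, data = toy)
is.symbol(fit_symbol$call$formula)            # TRUE

# 2. Formula built inside the fitting call: the stored call holds the
#    unevaluated expression as.formula(formula_str), which can be re-evaluated
#    later only where `formula_str` is still visible.
fit_call <- lm(as.formula(formula_str), data = toy)
is.call(fit_call$call$formula)                # TRUE

# 3. Possible workaround: splice the formula object into the call with
#    bquote(), so the stored call carries the formula itself.
fit_inline <- eval(bquote(lm(.(f), data = toy)))
inherits(fit_inline$call$formula, "formula")  # TRUE
```

Whether the `bquote()` form also lets `update()` and `dec_covar` run on a `felm` fit created inside a function has not been checked here; treat it as a candidate workaround rather than a confirmed fix.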

mlinegar commented 3 years ago

Thanks very much for the quick response! As a quick fix, I'm using fixest::feols as a drop-in replacement for lfe::felm. This seems to resolve the issue: at least for my toy example, dec_covar gives the same results as with felm on the examples that already worked, and runs without error on the examples that failed with felm.

I'll re-open with a solution if I figure out a work-around for using felm (or please let me know if there are problems with using fixest).
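
A minimal sketch of that drop-in replacement, assuming the `fixest` package is installed. The helper name `fit_feols` is hypothetical, and the `dec_covar` call is left commented out because this sketch does not source it; that `dec_covar` accepts `feols` fits is taken from the report above, not verified here.

```r
library(fixest)

# Same toy data layout as in the reprex above.
set.seed(1)
n <- 1000
test_data <- data.frame(
  y = rnorm(n), x0 = rnorm(n), x1 = rnorm(n), x2 = rnorm(n),
  x3 = rnorm(n), x4 = rnorm(n), x5 = rnorm(n),
  z = as.factor(floor(runif(n, 0, 10)))
)

# Hypothetical helper: fit the model from a formula string inside a function,
# which is the pattern that failed with felm in dec_test_1.
fit_feols <- function(formula_str, data) {
  feols(as.formula(formula_str), data = data)
}

fit <- fit_feols("y ~ x0 + x1 + x2 + x3 + x4 + x5 | z", data = test_data)
coef(fit)

# With dec_covar sourced as in the reprex, the decomposition would then be:
# dec_covar(object = fit, var_main = "x0", format = "long",
#           add_coefs = TRUE, conf.int = TRUE)
```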

MatthieuStigler commented 3 years ago

Good.

If you have a working version with fixest, do not hesitate to share it; other users might be interested!