lrberge / fixest

Fixed-effects estimations
https://lrberge.github.io/fixest/

LHS formula macros are expanded into sums with instrumental variables #395

Closed svraka closed 5 months ago

svraka commented 1 year ago

When a formula contains instruments, dot square bracket expansions don’t work for left-hand-side variables. Without macros in the formula I get the expected multiple estimation results:

library(fixest)
feols(c(mpg, wt) ~ disp + cyl + carb | hp ~ qsec, data = mtcars)
## Standard-errors: IID 
## Dep. var.: mpg
##              Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) 35.052251   3.347478 10.47124 5.2388e-11 ***
## fit_hp       0.096591   0.073228  1.31904 1.9823e-01    
## disp        -0.047224   0.021549 -2.19148 3.7222e-02 *  
## cyl         -1.721778   1.138336 -1.51254 1.4201e-01    
## carb        -2.695554   1.570846 -1.71599 9.7619e-02 .  
## ---
## Dep. var.: wt
##              Estimate Std. Error   t value  Pr(>|t|)    
## (Intercept)  1.438835   0.760716  1.891420 0.0693417 .  
## fit_hp      -0.035416   0.016641 -2.128197 0.0425977 *  
## disp         0.016430   0.004897  3.355034 0.0023664 ** 
## cyl          0.150458   0.258688  0.581622 0.5656464    
## carb         0.800647   0.356976  2.242860 0.0333149 *

With a macro, the LHS variables are expanded into a sum instead of being estimated as multiple models:

lhs <- c("mpg", "wt")
feols(.[lhs] ~ disp + cyl + carb | hp ~ qsec, data = mtcars)
## TSLS estimation, Dep. Var.: mpg + wt, Endo.: hp, Instr.: qsec
## Second stage: Dep. Var.: mpg + wt
## Observations: 32 
## Standard-errors: IID 
##              Estimate Std. Error   t value   Pr(>|t|)    
## (Intercept) 36.491086   2.820705 12.936866 4.3599e-13 ***
## fit_hp       0.061175   0.061705  0.991423 3.3028e-01    
## disp        -0.030794   0.018158 -1.695920 1.0140e-01    
## cyl         -1.571319   0.959203 -1.638151 1.1299e-01    
## carb        -1.894907   1.323652 -1.431576 1.6374e-01    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 2.96324   Adj. R2: 0.615246
## F-test (1st stage), hp: stat = 4.12322, p = 0.052254, on 1 and 27 DoF.
##             Wu-Hausman: stat = 1.47704, p = 0.23516 , on 1 and 26 DoF.

I couldn’t find anything in the docs saying that this shouldn’t work. However, explicitly requesting a comma expansion and wrapping the macro in c() does produce multiple estimations (this example also adds gear as a fixed effect):

feols(c(.[, lhs]) ~ disp + cyl + carb | gear | hp ~ qsec, data = mtcars) 
## Standard-errors: Clustered (gear) 
## Dep. var.: mpg
##         Estimate Std. Error   t value Pr(>|t|) 
## fit_hp  0.152104   0.260844  0.583121  0.61880 
## disp   -0.051793   0.089141 -0.581020  0.61998 
## cyl    -1.469573   2.753668 -0.533678  0.64694 
## carb   -4.170337   3.032891 -1.375037  0.30289 
## ---
## Dep. var.: wt
##         Estimate Std. Error   t value Pr(>|t|) 
## fit_hp -0.080043   0.153502 -0.521443  0.65405 
## disp    0.024830   0.044221  0.561505  0.63098 
## cyl     0.420763   1.397060  0.301177  0.79171 
## carb    1.583101   1.980471  0.799356  0.50793

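The same happens with regex macros: the plain form is expanded into a sum, while the comma form wrapped in c() gives multiple estimations: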

feols(..("^(mpg|wt)$") ~ disp + cyl + carb | hp ~ qsec, data = mtcars) 
## TSLS estimation, Dep. Var.: mpg + wt, Endo.: hp, Instr.: qsec
## Second stage: Dep. Var.: mpg + wt
## Observations: 32 
## Standard-errors: IID 
##              Estimate Std. Error   t value   Pr(>|t|)    
## (Intercept) 36.491086   2.820705 12.936866 4.3599e-13 ***
## fit_hp       0.061175   0.061705  0.991423 3.3028e-01    
## disp        -0.030794   0.018158 -1.695920 1.0140e-01    
## cyl         -1.571319   0.959203 -1.638151 1.1299e-01    
## carb        -1.894907   1.323652 -1.431576 1.6374e-01    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 2.96324   Adj. R2: 0.615246
## F-test (1st stage), hp: stat = 4.12322, p = 0.052254, on 1 and 27 DoF.
##             Wu-Hausman: stat = 1.47704, p = 0.23516 , on 1 and 26 DoF.
feols(c(..(, "^(mpg|wt)$")) ~ disp + cyl + carb | hp ~ qsec, data = mtcars) 
## Standard-errors: IID 
## Dep. var.: mpg
##              Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) 35.052251   3.347478 10.47124 5.2388e-11 ***
## fit_hp       0.096591   0.073228  1.31904 1.9823e-01    
## disp        -0.047224   0.021549 -2.19148 3.7222e-02 *  
## cyl         -1.721778   1.138336 -1.51254 1.4201e-01    
## carb        -2.695554   1.570846 -1.71599 9.7619e-02 .  
## ---
## Dep. var.: wt
##              Estimate Std. Error   t value  Pr(>|t|)    
## (Intercept)  1.438835   0.760716  1.891420 0.0693417 .  
## fit_hp      -0.035416   0.016641 -2.128197 0.0425977 *  
## disp         0.016430   0.004897  3.355034 0.0023664 ** 
## cyl          0.150458   0.258688  0.581622 0.5656464    
## carb         0.800647   0.356976  2.242860 0.0333149 *
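
A possible way to sidestep macro expansion altogether is to build the formula with base R and pass it to `feols()`. This is only a sketch, assuming the goal is the same multiple estimation as `c(mpg, wt) ~ ...` above; `lhs`, `fml` and `est` are illustrative names, and the regex selection uses base `grep()` rather than fixest's `..()`.

```r
library(fixest)

# Select the dependent variables by regex with base R instead of ..()
lhs <- grep("^(mpg|wt)$", names(mtcars), value = TRUE)

# Spell out the LHS as c(mpg, wt) so no formula macro needs to be expanded
fml <- as.formula(sprintf(
  "c(%s) ~ disp + cyl + carb | hp ~ qsec",
  paste(lhs, collapse = ", ")
))

# Same call as the macro-free example at the top of this report
est <- feols(fml, data = mtcars)
est
```

Because the formula handed to `feols()` is identical to the hand-written one, this should not depend on how macros are treated when instruments are present.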

Created on 2023-02-27 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Hungarian_Hungary.utf8
#>  ctype    Hungarian_Hungary.utf8
#>  tz       Europe/Prague
#>  date     2023-02-27
#>  pandoc   3.1 @ C:/scoop/shims/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  cli           3.6.0      2023-01-09 [1] CRAN (R 4.2.2)
#>  digest        0.6.31     2022-12-11 [1] CRAN (R 4.2.2)
#>  dreamerr      1.2.3      2020-12-05 [1] CRAN (R 4.2.2)
#>  evaluate      0.20       2023-01-17 [1] CRAN (R 4.2.2)
#>  fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.2.2)
#>  fixest      * 0.11.1     2023-01-10 [1] CRAN (R 4.2.2)
#>  Formula       1.2-5      2023-02-24 [1] CRAN (R 4.2.2)
#>  fs            1.6.1      2023-02-06 [1] CRAN (R 4.2.2)
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.2.2)
#>  htmltools     0.5.4      2022-12-07 [1] CRAN (R 4.2.2)
#>  knitr         1.42       2023-01-25 [1] CRAN (R 4.2.2)
#>  lattice       0.20-45    2021-09-22 [1] CRAN (R 4.2.2)
#>  lifecycle     1.0.3      2022-10-07 [1] CRAN (R 4.2.2)
#>  nlme          3.1-162    2023-01-31 [1] CRAN (R 4.2.2)
#>  numDeriv      2016.8-1.1 2019-06-06 [1] CRAN (R 4.2.0)
#>  Rcpp          1.0.10     2023-01-22 [1] CRAN (R 4.2.2)
#>  reprex        2.0.2      2022-08-17 [1] CRAN (R 4.2.2)
#>  rlang         1.0.6      2022-09-24 [1] CRAN (R 4.2.2)
#>  rmarkdown     2.20       2023-01-19 [1] CRAN (R 4.2.2)
#>  sandwich      3.0-2      2022-06-15 [1] CRAN (R 4.2.2)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.2.2)
#>  withr         2.5.0      2022-03-03 [1] CRAN (R 4.2.2)
#>  xfun          0.37       2023-01-31 [1] CRAN (R 4.2.2)
#>  yaml          2.3.7      2023-01-23 [1] CRAN (R 4.2.2)
#>  zoo           1.8-11     2022-09-17 [1] CRAN (R 4.2.2)
#>
#>  [1] C:/scoop/apps/r-release/4.2.2/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

lrberge commented 5 months ago

Hello, thanks a lot for reporting and for the very clear reproducible example/issue! Now fixed.