grantmcdermott / etwfe

Extended two-way fixed effects
https://grantmcdermott.com/etwfe/

Error: The function supplied to the `transform_pre` argument must accept two numeric vectors of predicted probabilities of length 0, and return a single numeric value or a numeric vector of length 0, with no missing value. #27

Closed levialtringer closed 1 year ago

levialtringer commented 1 year ago

Issue

Many thanks for the package! I was searching for a tool to implement the nonlinear DiD approach described by Wooldridge (2022), and this is exactly what I was hoping to find.

Before applying etwfe() and emfx() to my data, I tried to run the example code from the README and vignette. I was unable to replicate any emfx() results because of the following error (and accompanying warning):

Error: The function supplied to the `transform_pre` argument must accept two numeric vectors of predicted probabilities of length 0, and return a single numeric value or a numeric vector
  of length 0, with no missing value.
In addition: Warning message:
In formals(fun) : argument is not a function

I'm not sure whether this is a general issue caused by the recent marginaleffects update or something specific to my setup. My full code and output are below, along with a traceback of the error and my session information.

Thanks in advance for any help!


Code

# Load library:
library(etwfe)

# Load data:
data("mpdta", package = "did")

# Inspect data:
head(mpdta)

# Estimate model:
mod =
  etwfe(
    fml  = lemp ~ lpop, # outcome ~ controls
    tvar = year,        # time variable
    gvar = first.treat, # group variable
    data = mpdta,       # dataset
    vcov = ~countyreal  # vcov adjustment (here: clustered)
  )

# Inspect model object:
mod

# Get aggregated ATT:
emfx(mod)

Output

> # Load library:
> library(etwfe)
> 
> # Load data:
> data("mpdta", package = "did")
> 
> # Inspect data:
> head(mpdta)
    year countyreal     lpop     lemp first.treat treat
866 2003       8001 5.896761 8.461469        2007     1
841 2004       8001 5.896761 8.336870        2007     1
842 2005       8001 5.896761 8.340217        2007     1
819 2006       8001 5.896761 8.378161        2007     1
827 2007       8001 5.896761 8.487352        2007     1
937 2003       8019 2.232377 4.997212        2007     1
> 
> # Estimate model:
> mod =
+   etwfe(
+     fml  = lemp ~ lpop, # outcome ~ controls
+     tvar = year,        # time variable
+     gvar = first.treat, # group variable
+     data = mpdta,       # dataset
+     vcov = ~countyreal  # vcov adjustment (here: clustered)
+   )
> 
> # Inspect model object:
> mod
OLS estimation, Dep. Var.: lemp
Observations: 2,500 
Fixed-effects: first.treat: 4,  year: 5
Varying slopes: lpop (first.treat: 4),  lpop (year: 5)
Standard-errors: Clustered (countyreal) 
                                              Estimate Std. Error   t value   Pr(>|t|)    
.Dtreat:first.treat::2004:year::2004         -0.021248   0.021728 -0.977890 3.2860e-01    
.Dtreat:first.treat::2004:year::2005         -0.081850   0.027375 -2.989963 2.9279e-03 ** 
.Dtreat:first.treat::2004:year::2006         -0.137870   0.030795 -4.477097 9.3851e-06 ***
.Dtreat:first.treat::2004:year::2007         -0.109539   0.032322 -3.389024 7.5694e-04 ***
.Dtreat:first.treat::2006:year::2006          0.002537   0.018883  0.134344 8.9318e-01    
.Dtreat:first.treat::2006:year::2007         -0.045093   0.021987 -2.050907 4.0798e-02 *  
.Dtreat:first.treat::2007:year::2007         -0.045955   0.017975 -2.556568 1.0866e-02 *  
.Dtreat:first.treat::2004:year::2004:lpop_dm  0.004628   0.017584  0.263184 7.9252e-01    
... 6 coefficients remaining (display them with summary() or use argument n)
... 10 variables were removed because of collinearity (.Dtreat:first.treat::2006:year::2004, .Dtreat:first.treat::2006:year::2005 and 8 others [full set in $collin.var])
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 0.537131     Adj. R2: 0.87167 
                 Within R2: 8.449e-4
> 
> # Get aggregated ATT:
> emfx(mod)
Error: The function supplied to the `transform_pre` argument must accept two numeric vectors of predicted probabilities of length 0, and return a single numeric value or a numeric vector
  of length 0, with no missing value.
In addition: Warning message:
In formals(fun) : argument is not a function

Traceback

Code:

# Error traceback:
traceback()

Output:

> # Error traceback:
> traceback()
11: stop(format_message(string = string, ..., line_length = line_length, 
        indent = indent), call. = call.)
10: format_alert(..., type = "error")
9: insight::format_error(msg)
8: safefun(hi = predicted_hi, lo = predicted_lo, y = predicted, 
       n = .N, term = term, cross = cross, wts = marginaleffects_wts_internal, 
       tmp_idx = tmp_idx)
7: `[.data.table`(out, , `:=`("estimate", safefun(hi = predicted_hi, 
       lo = predicted_lo, y = predicted, n = .N, term = term, cross = cross, 
       wts = marginaleffects_wts_internal, tmp_idx = tmp_idx)), 
       by = idx)
6: out[, `:=`("estimate", safefun(hi = predicted_hi, lo = predicted_lo, 
       y = predicted, n = .N, term = term, cross = cross, wts = marginaleffects_wts_internal, 
       tmp_idx = tmp_idx)), by = idx]
5: get_contrasts(structure(list(nobs = 2500L, nobs_origin = 2500L, 
       fml = lemp ~ .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003)/lpop_dm, 
       call = fixest::feols(fml = Fml, data = data, vcov = ..1, 
           notes = FALSE), call_env = <environment>, method = "feols", 
       method_type = "feols", fml_all = list(linear = lemp ~ .Dtreat:i(first.treat, 
           i.year, ref = 0, ref2 = 2003)/lpop_dm, fixef = ~first.treat + 
           first.treat[[lpop]] + year + year[[lpop]]), fml_no_xpd = lemp ~ 
           .Dtreat:i(first.treat, i.year, ref = 0, ref2 = 2003)/lpop_dm | 
               first.treat[lpop] + year[lpop], fixef.tol = 1e-06, 
       fixef.iter = 10000, nparams = 31, fixef_vars = c("first.treat", 
       "year"), fixef_terms = c("first.treat", "first.treat[[lpop]]", 
       "year", "year[[lpop]]"), slope_flag = c(1L, 1L), slope_flag_reordered = c(1L, 
       1L), slope_variables_reordered = list(lpop = c(5.8967609333053, 
       5.8967609333053, 5.8967609333053, 5.8967609333053, 5.8967609333053, 
       2.23237719795055, 2.23237719795055, 2.23237719795055, 2.23237719795055, 
       2.23237719795055, 1.29828248379668, 1.29828248379668, 1.29828248379668, 
       1.29828248379668, 1.29828248379668, 3.32625829499766, 3.32625829499766, 
       3.32625829499766, 3.32625829499766, 3.32625829499766, 6.24790553432335, 
       6.24790553432335, 6.24790553432335, 6.24790553432335, 6.24790553432335, 
       2.08081559723298, 2.08081559723298, 2.08081559723298, 2.08081559723298, 
    ...
4: do.call("get_contrasts", args)
3: comparisons(model, newdata = newdata, variables = variables, 
       vcov = vcov, conf_level = conf_level, type = type, wts = wts, 
       hypothesis = hypothesis, df = df, by = by, eps = eps, transform_pre = slope, 
       cross = FALSE, internal_call = TRUE, ...)
2: marginaleffects::slopes(object, newdata = dat, wts = "N", variables = ".Dtreat", 
       by = by_var, ...)
1: emfx(mod)

Session Information

Code:

# Session information:
sessionInfo()

Output:

> # Session information:
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 12.0.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] etwfe_0.3.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3           lattice_0.20-41        fixest_0.10.4          zoo_1.8-8              marginaleffects_0.10.0 grid_4.0.4             nlme_3.1-152           backports_1.2.1       
 [9] dreamerr_1.2.3         data.table_1.14.2      checkmate_2.0.0        generics_0.1.0         sandwich_3.0-2         Formula_1.2-4          tools_4.0.4            numDeriv_2016.8-1.1   
[17] compiler_4.0.4         insight_0.19.0  

levialtringer commented 1 year ago

SOLVED

My apologies for not checking into this before posting the issue. I reinstalled the most recent version of the etwfe package:

remotes::install_version("etwfe", version = "0.3.1", repos = "http://cran.us.r-project.org")

The root cause appears to have been outdated dependencies. After reinstalling etwfe and updating all of its dependencies, emfx() now runs without error.

Comparing the sessionInfo() output below with the one above shows which packages needed updating, though I'm not sure which one (or ones) was the culprit.
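
For anyone who wants to do that comparison programmatically, here is a minimal base R sketch. The two version vectors are copied by hand from the "loaded via a namespace" sections of the two sessionInfo() listings in this thread (base packages such as grid, tools, and compiler are left out); the object names `before`, `after`, `changed`, and `version_diff` are just illustrative.

```r
# Package versions copied from the first (failing) sessionInfo() output.
before <- c(
  Rcpp = "1.0.8.3", lattice = "0.20-41", fixest = "0.10.4", zoo = "1.8-8",
  marginaleffects = "0.10.0", nlme = "3.1-152", backports = "1.2.1",
  dreamerr = "1.2.3", data.table = "1.14.2", checkmate = "2.0.0",
  generics = "0.1.0", sandwich = "3.0-2", Formula = "1.2-4",
  numDeriv = "2016.8-1.1", insight = "0.19.0"
)

# Package versions copied from the second (working) sessionInfo() output.
after <- c(
  Rcpp = "1.0.10", lattice = "0.20-41", fixest = "0.11.1", zoo = "1.8-11",
  marginaleffects = "0.10.0", nlme = "3.1-152", backports = "1.4.1",
  dreamerr = "1.2.3", data.table = "1.14.8", checkmate = "2.1.0",
  generics = "0.1.3", sandwich = "3.0-2", Formula = "1.2-5",
  numDeriv = "2016.8-1.1", insight = "0.19.0"
)

# Keep only the packages whose version string differs between sessions.
changed <- names(before)[before != after[names(before)]]

version_diff <- data.frame(
  package = changed,
  before  = before[changed],
  after   = after[changed],
  row.names = NULL
)
print(version_diff)
```

Note that etwfe, marginaleffects, and insight are at the same version in both listings, which suggests (but does not prove) that the culprit was one of the packages that did change.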

Again, my apologies and thanks for the package!


New Session Information

> # Session information:
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 12.0.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] etwfe_0.3.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10            lattice_0.20-41        fixest_0.11.1          zoo_1.8-11             marginaleffects_0.10.0 grid_4.0.4             nlme_3.1-152           backports_1.4.1       
 [9] dreamerr_1.2.3         data.table_1.14.8      checkmate_2.1.0        generics_0.1.3         sandwich_3.0-2         Formula_1.2-5          tools_4.0.4            numDeriv_2016.8-1.1   
[17] compiler_4.0.4         insight_0.19.0 

grantmcdermott commented 1 year ago

No problem, and glad the rubber ducking helped.

FWIW, version 0.3.1 is now on CRAN, so you should be able to install it directly rather than going through remotes.