DeclareDesign / estimatr

estimatr: Fast Estimators for Design-Based Inference
https://declaredesign.org/r/estimatr

`model.frame()` gives unexpected results with `iv_robust()` #397

Open ngreifer opened 2 years ago

ngreifer commented 2 years ago

Hello,

I have found that model.frame() gives unexpected results when used with iv_robust(), possibly due to it being unable to parse the model formula. With a factor variable in the model, no dataset is produced, and without factor variables, the formula is misinterpreted, with the | interpreted as "or". See reprex below:

# Using the lalonde dataset from MatchIt
fit <- estimatr::iv_robust(re78 ~ treat + age + race | educ + age + race,
                           data = MatchIt::lalonde)

# With factors: failure
head(model.frame(fit))
#> Warning in Ops.factor(treat + age, race): '+' not meaningful for factors
#> Warning in Ops.factor(educ + age, race): '+' not meaningful for factors
#> [1] re78                                  
#> [2] treat + age + race | educ + age + race
#> <0 rows> (or 0-length row.names)

fit <- estimatr::iv_robust(re78 ~ treat + age | educ + age,
                           data = MatchIt::lalonde)

# Without factors: misinterpretation
head(model.frame(fit))
#>            re78 treat + age | educ + age
#> NSW1  9930.0460                     TRUE
#> NSW2  3595.8940                     TRUE
#> NSW3 24909.4500                     TRUE
#> NSW4  7506.1460                     TRUE
#> NSW5   289.7899                     TRUE
#> NSW6  4056.4940                     TRUE

Created on 2022-08-29 with reprex v2.0.2

This fix is needed to fully resolve #374, because as of right now there is no clear way to extract the original dataset from the model. Thanks!
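The same two symptoms can be sketched with base R alone by passing a two-part formula straight to model.frame(). The toy data frame and its column names (y, x, w, z, g) are made up for illustration, and this assumes model.frame(fit) ends up evaluating the stored formula the way model.frame.default() does:

```r
# Toy data, invented for illustration only
df <- data.frame(
  y = c(1, 2, 3),
  x = c(0, 1, 0),
  w = c(1, 1, 1),
  z = c(2, 0, 5),
  g = factor(c("a", "b", "a"))
)

# Numeric-only case: the whole right-hand side becomes a single term,
# evaluated as (x + w) | (z + w), i.e. a logical "or"
mf_numeric <- model.frame(y ~ x + w | z + w, data = df)
str(mf_numeric)

# Factor case: `+` on a factor yields NA (with a warning), so every row
# is NA on the right-hand side and is dropped by the default na.action
mf_factor <- suppressWarnings(model.frame(y ~ x + g | z + g, data = df))
nrow(mf_factor)
```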

nfultz commented 2 years ago

Apparently the method dispatches differently for iv_robust than for lm_robust. Please try calling the lm method explicitly, like:

stats:::model.frame.lm(fit)

This is another edge case for #123

ngreifer commented 2 years ago

Thank you. Unfortunately, this is for use inside insight::get_data(), which I'm aware you have no control over. That function simply calls model.frame(), so ideally the method for iv_robust objects would dispatch correctly on its own.
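As a rough idea of the behavior I would expect, here is a minimal base R sketch that flattens both parts of the formula into one before building the model frame. The helper name flat_model_frame() is hypothetical and not part of estimatr or insight, and the toy data is invented. It relies on all.vars(), so transformations such as log(x) are reduced to their underlying variables.

```r
# Hypothetical helper: build a model frame containing every variable
# that appears on either side of `|` in a two-part formula.
flat_model_frame <- function(formula, data) {
  vars <- all.vars(formula)                     # response first, then the rest
  flat <- reformulate(vars[-1], response = vars[1])
  model.frame(flat, data = data)
}

# Toy data, invented for illustration only
df <- data.frame(
  y = c(1, 2, 3),
  x = c(0, 1, 0),
  z = c(2, 0, 5),
  g = factor(c("a", "b", "a"))
)

mf <- flat_model_frame(y ~ x + g | z + g, data = df)
names(mf)
```

A real method would presumably need to keep transformations and handle weights, subsets, and clusters, which this sketch does not attempt.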

nfultz commented 2 years ago

It's weird, someone in the other thread said get_data used to work? https://github.com/DeclareDesign/estimatr/issues/123#issuecomment-492163357

Maybe it changed?

ngreifer commented 1 year ago

It actually produced incorrect output in that thread, as well (note the second column).