Open lang-benjamin opened 4 years ago
This is by design. margins()
reports effects at the scale of the original variable, not the term generated by I()
.
Thanks for the quick reply! Let me ask one more question about the general principle behind this. So, how about other transformations like log(hp)
, stats::poly(hp, 2)
or splines::ns(hp)
? The first two are not ignored by margins()
whereas the last one will again be ignored.
Thanks. I'll look into that more - the intention is for it to report based on original variable. If that's not happening, it's probably a bug.
Thanks. Sorry again, where is this intention coming from? That essentially means margins
does not allow for non-linear modelling/transformation. Are there specific obstacles when calculating marginal effects for such transformations?
Obviously when it comes to non-linear transformation marginal effect at the scale of the original variable is no longer constant across different values of the original predictor.
Thanks for the quick reply! Let me ask one more question about the general principle behind this. So, how about other transformations like
log(hp)
,stats::poly(hp, 2)
orsplines::ns(hp)
? The first two are not ignored bymargins()
whereas the last one will again be ignored.
Just adding a comment here, because I was confused by this issue. In my testing, all of these transformations are reported on the scale of the original predictor (which is also what I would have expected):
library(margins)
margins(lm(mpg ~ wt + I(wt^2), data = mtcars))
#> Average marginal effects
#> lm(formula = mpg ~ wt + I(wt^2), data = mtcars)
#> wt
#> -5.845
margins(lm(mpg ~ stats::poly(wt, 2), data = mtcars))
#> Average marginal effects
#> lm(formula = mpg ~ stats::poly(wt, 2), data = mtcars)
#> wt
#> -5.845
margins(lm(mpg ~ log(wt), data = mtcars))
#> Average marginal effects
#> lm(formula = mpg ~ log(wt), data = mtcars)
#> wt
#> -5.855
margins(lm(mpg ~ splines::ns(wt), data = mtcars))
#> Average marginal effects
#> lm(formula = mpg ~ splines::ns(wt), data = mtcars)
#> wt
#> -5.344
margins(lm(mpg ~ splines::ns(wt, 5), data = mtcars))
#> Average marginal effects
#> lm(formula = mpg ~ splines::ns(wt, 5), data = mtcars)
#> wt
#> -5.931
Created on 2021-04-22 by the reprex package (v1.0.0)
By the way - however perhaps this is not the best place to talk about it - rather unexpected consequence of margins()
reporting effects for variables on their original scale is that it is - in some situations - sensitive to change in the data, that was used to estimate the model. See:
library(margins)
# model without within-formula data transformations
m1 <- lm(mpg ~ wt, data = mtcars)
# model with some within-formula data transformation
m2 <- lm(mpg ~ I(wt^2), data = mtcars)
margins(m1)
#> Average marginal effects
#> lm(formula = mpg ~ wt, data = mtcars)
#> wt
#> -5.344
margins(m2)
#> Average marginal effects
#> lm(formula = mpg ~ I(wt^2), data = mtcars)
#> wt
#> -4.542
# now let's change the dataset
mtcars <- subset(mtcars, carb < 4)
# and let's call margins() on the (unaffected!) models once more
# for model with no within-formula data transformations we get the same results
margins(m1)
#> Average marginal effects
#> lm(formula = mpg ~ wt, data = mtcars)
#> wt
#> -5.344
# but for model with some within-formula data transformations results are different!
margins(m2)
#> Average marginal effects
#> lm(formula = mpg ~ I(wt^2), data = mtcars)
#> wt
#> -4.068
Created on 2021-04-22 by the reprex package (v2.0.0)
That's because in the latter model there is no way to get the original variable other than turn back to the dataset (there is no untransformed data in the a model object) and consequently margins()
works on the dataset in its current state. On the other hand, with no within-formula data transformation margins()
can rely on a model frame that is stored in a model object itself to perform all the computations, so these results stands unaffected.
That's an interesting find, and I think it should be included in the documentation somewhere, if it isn't yet.
For the specific case you describe (subsetting), margins could even figure out that the data was changed, because the design matrix has fewer rows than the updated mtcars
data. I think that margins should throw an error in that case. Of course, it would not be possible to detect other changes in the original data.
I think it would be worth creating a new issue for this problem.
It should be possible to detect any changes (that affect results) by comparing model frame from the model object with model frame constructed using data object in its current state and the model call. However detecting such changes perhaps should result in warning instead of error.
Yes, that would be a pretty expensive operation for large models, though, and you would have to do it for every margins()
call. I definitely think that this should be addressed in some way.
Just adding a comment here, because I was confused by this issue. In my testing, all of these transformations are reported on the scale of the original predictor (which is also what I would have expected): ...
Thanks for the answer! I realize that I misunderstood the initial answer. I thought that the intention was to ignore usage of I() or any other transformation. It is however not ignored, just the final reporting is then in terms of the original variable (which makes sense).
Using
I()
inside a formula does not work,margins()
ignores it. As an example, let's do a linear regression of mpg vs. hp from the mtcars data and re-scale hp.Created on 2020-06-21 by the reprex package (v0.3.0)
Session info
``` r sessionInfo() #> R version 4.0.0 (2020-04-24) #> Platform: x86_64-apple-darwin17.0 (64-bit) #> Running under: macOS High Sierra 10.13.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib #> #> locale: #> [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8 #> #> attached base packages: #> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages: #> [1] margins_0.3.23 #> #> loaded via a namespace (and not attached): #> [1] Rcpp_1.0.4.6 digest_0.6.25 MASS_7.3-51.6 magrittr_1.5 #> [5] evaluate_0.14 highr_0.8 rlang_0.4.6 stringi_1.4.6 #> [9] data.table_1.12.8 rmarkdown_2.1 prediction_0.3.14 tools_4.0.0 #> [13] stringr_1.4.0 xfun_0.14 yaml_2.2.1 compiler_4.0.0 #> [17] htmltools_0.4.0 knitr_1.28 ```