Deleetdk closed this issue 4 years ago.
Here's a comparison with car::Anova() results, which, as far as I can tell from Harrell's reply, should be the same for the simple cases without interactions. I updated the package to the latest GitHub version before running this (https://github.com/strengejacke/sjstats/commit/c90f49ca725a026c79042a82ebf9b185aa85273a).
> library(pacman)
> p_load(rms, sjstats)
>
> ols(Sepal.Width ~ Petal.Width, data = iris) %>%
+ anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 2 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width 0.112 0.0475 0.235
2 TOTAL 0.112 0.0475 0.235
>
> lm(Sepal.Width ~ Petal.Width, data = iris) %>%
+ car::Anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 1 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width 0.127 0.0478 0.236
>
> #two predictors
> ols(Sepal.Width ~ Petal.Width + Petal.Length, data = iris) %>%
+ anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 3 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width 0.0217 0.000514 0.110
2 Petal.Length 0.0662 0.0215 0.184
3 TOTAL 0.182 0.100 0.313
>
> lm(Sepal.Width ~ Petal.Width + Petal.Length, data = iris) %>%
+ car::Anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 2 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width 0.0269 0.000521 0.111
2 Petal.Length 0.0818 0.0218 0.186
>
> #two predictors w interaction
> ols(Sepal.Width ~ Petal.Width * Petal.Length, data = iris) %>%
+ anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 6 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width (Factor+Higher Order Factors) 0.106 0.156 0.375
2 " All Interactions" 0.0936 0.137 0.352
3 Petal.Length (Factor+Higher Order Factors) 0.129 0.193 0.413
4 " All Interactions" 0.0936 0.137 0.352
5 Petal.Width * Petal.Length (Factor+Higher Order Factors) 0.0936 0.137 0.352
6 TOTAL 0.190 0.279 0.494
>
> lm(Sepal.Width ~ Petal.Width * Petal.Length, data = iris) %>%
+ car::Anova() %>%
+ omega_sq(ci.lvl = .95)
# A tibble: 3 x 4
term omegasq conf.low conf.high
<chr> <dbl> <dbl> <dbl>
1 Petal.Width 0.0284 0.00347 0.128
2 Petal.Length 0.0834 0.0374 0.218
3 Petal.Width:Petal.Length 0.221 0.143 0.363
As can be seen, the car::Anova() results don't match exactly, even for the simple one-predictor model.
The functions in sjstats rely on the tidy output of the related anova() functions, so the differences should not be due to computations in sjstats, but rather to the results of the anova() calls from those packages...
I'll look into this issue in the next few days...
@strengejacke Any news on this? My study is stuck until this can be resolved. Perhaps I will need to switch to linear terms (no rcs) for the purpose of computing the relative variable metrics, which would be a little annoying.
I would be happy to offer a bounty for the resolution of this issue, say, 100 USD.
You'll need to do the matrix operations on the anova() output to create a suitable object.
Any suggestions on how to implement this or even a PR are welcome! :-)
The matrix operations are very easy. I routinely add and drop rows from anova output. I think there is an example you can see when you type ?anova.rms. Putting this into the tidy workflow is up to users such as yourself, since I don't use tidy.
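To make the row-dropping idea above concrete, here is a minimal sketch (an assumption, not code from this thread: the exact row labels produced by anova.rms vary by model, so the pattern used for matching is illustrative):

```r
library(rms)  # assumed installed

# Fit a model with a nonlinear term and an interaction
fit <- ols(Sepal.Width ~ rcs(Petal.Width, 3) * Petal.Length, data = iris)
a <- anova(fit)

# anova.rms returns a matrix whose rownames carry the term labels.
# Drop the indented sub-rows (" Nonlinear", " All Interactions", ...)
# and the summary rows (TOTAL*, ERROR) before further processing.
keep <- !grepl("^\\s|^TOTAL|^ERROR", rownames(a))
a_clean <- as.data.frame(unclass(a))[keep, , drop = FALSE]
a_clean
```

The remaining rows are one per predictor, which is the shape a tidy-style effect-size function would expect.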
@Deleetdk Does this help you? I don't have much experience with ANOVA methods, as I usually don't use them. I would leave it to users to propose a fix for this issue. At the moment I'm not exactly sure how to address it, and I don't know when I will find the time to investigate further.
Please don't say "fix", as nothing is broken about this in the rms package. It is an issue of developing the right interface between rms and the tidyverse, for anyone motivated to do so. I am not motivated, because your approach leaves out important information from rms::anova.
Sorry if this was confusing; by "fix" I meant this issue, i.e. revising my code so that it also works with more complex functions like anova.rms. I think this is not specifically linked to the tidyverse; it's that my function returns a data frame that is tidy in the sense that its structure is always the same for all inputs.
And, of course, I was not specifically asking you to do it.
Thanks very much for that.
@strengejacke Any progress on this? My bounty from before still stands, and I'm still waiting on this issue. I'll post a public tweet and see if someone will submit a PR.
No, sorry. This currently has lower priority for me as I'm not so familiar with ANOVA and how to properly process the more complex objects returned by rms.
closing in favor of https://github.com/easystats/effectsize/issues/67
Per https://github.com/strengejacke/sjstats/issues/21, there is now support for rms models. Hurray! Example with a linear model:
However, the output can be odd when there are nonlinear terms, because rms inserts these as extra rows:
Note the presence of missing values. And with interactions:
It gets very messy with both:
The output here seems impossible. Is it because omega_sq interprets every row as being a predictor, thus causing double counting and impossible values? A solution would be to exclude rows based on regex matching of the extra lines. In rare cases with oddly named terms (one could use a term called " Nonlinear", note the initial space), this could produce errors, but it seems unlikely to be a practical problem.

Total omega² vs. adj. R² vs. R²
I notice that the numbers do not match:
Model 1 (above): R² = .132, adj. R² = .128, omega² = .112. There are no other predictors, so the explained variance should be allocated to the only predictor there is. Note, however, that the output from the anova() call has an odd-looking row, which looks like it is just the intercept:
Second model: R² = .440, adj. R² = .408, omega² = .314. Third model: R² = .416, adj. R² = .404, omega² = .190. Fourth model: R² = .431, adj. R² = .406, omega² = .237.
I also note that the eta_sq() output matches neither R² nor adj. R².
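For reference, omega² is not expected to equal either R² figure; a hand computation with the standard formula ω² = (SS_eff − df_eff·MS_err) / (SS_tot + MS_err) on the first model (plain base R, no sjstats; the formula choice is my assumption about what omega_sq computes) shows the value sitting below R²:

```r
# Omega-squared computed by hand for the one-predictor model,
# to show how it relates to R^2 and adjusted R^2.
fit <- lm(Sepal.Width ~ Petal.Width, data = iris)
a <- anova(fit)
ss_eff <- a["Petal.Width", "Sum Sq"]
df_eff <- a["Petal.Width", "Df"]
ms_err <- a["Residuals", "Mean Sq"]
ss_tot <- sum(a[["Sum Sq"]])
omega2 <- (ss_eff - df_eff * ms_err) / (ss_tot + ms_err)
c(omega2 = omega2,
  R2     = summary(fit)$r.squared,
  adj_R2 = summary(fit)$adj.r.squared)
```

If this is the formula in use, the lm-based value should land near the car::Anova() figure reported above rather than the rms-based one, which would point the discrepancy at the sums of squares in the respective anova tables rather than at the ω² computation itself.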