easystats / report

:scroll: :tada: Automated reporting of objects in R
https://easystats.github.io/report/

report() assigning effect size to intercept in model #451

Open RenyBB opened 1 month ago

RenyBB commented 1 month ago

For example, we want to do a type III ANOVA, so we fit a linear model with categorical predictors and use the car::Anova function:

some_linear_model <- lm(mpg ~ as.factor(cyl)*as.factor(am), data=mtcars) 
some_anova <- car::Anova(some_linear_model, type = "III")

Then, we use report() and report_table() to output the results:

report::report(some_anova)
report::report_table(some_anova)

The effect sizes from report_table() are correct, but the effect sizes from report() are attached to the wrong effects:

Compare these to the results obtained with report_table(): [image]

mattansb commented 1 month ago

It seems silly, but for type 3 ANOVA tables we do get the intercept term, and it does have a meaning: It is the proportional reduction in error accounted for by the inclusion of the intercept. So in a sense, this is the "variance explained" by the intercept:


library(performance)

m0 <- lm(mpg ~ 0, data = mtcars)
m1 <- lm(mpg ~ 1, data = mtcars)

car::Anova(m1, type = 3)
#> Anova Table (Type III tests)
#> 
#> Response: mpg
#>             Sum Sq Df F value    Pr(>F)    
#> (Intercept)  12916  1  355.58 < 2.2e-16 ***
#> Residuals     1126 31                      
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

effectsize::F_to_eta2(f = 355.58, df = 1, df_error = 31, ci = NULL) |> 
  print(digits = 6)
#> Eta2 (partial)
#> --------------
#> 0.919810 

1 - (rmse(m1) ^ 2) / (rmse(m0) ^ 2)
#> [1] 0.9198104
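
The same number can be recovered by hand from sums of squares. This is a minimal base-R sketch, assuming partial eta squared is SS_effect / (SS_effect + SS_residual); the variable names are only illustrative:

```r
y <- mtcars$mpg

# Sum of squares attributed to the intercept: n * mean(y)^2
ss_intercept <- length(y) * mean(y)^2

# Residual sum of squares of the intercept-only model
ss_residual <- sum((y - mean(y))^2)

# Partial eta squared for the intercept term
eta2_intercept <- ss_intercept / (ss_intercept + ss_residual)
print(eta2_intercept)
```

This should agree with the two values above, since ss_intercept + ss_residual equals sum(y^2), the error of the model with no intercept.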

RenyBB commented 1 month ago

Sorry, I can see how the title I chose is completely uninformative. I've edited the post to describe the issue in more detail - the values for the effect sizes don't match the correct effects when using report(). For example, the effect size for the interaction term should be 0.1 but report() returns 0.41.
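
A quick hand check of which values are right, using the Sum_Squares that report_table() prints for this model in this thread (167.71, 58.43 and 25.44 for the effects, 239.06 for the residuals). It is only a sketch, assuming partial eta squared is SS_effect / (SS_effect + SS_residual):

```r
# Sum_Squares values copied from the report_table() output in this thread
ss_effects <- c(
  "as.factor(cyl)"               = 167.71,
  "as.factor(am)"                = 58.43,
  "as.factor(cyl):as.factor(am)" = 25.44
)
ss_residual <- 239.06

# Partial eta squared per term, rounded as in the reported tables
eta2_partial <- round(ss_effects / (ss_effects + ss_residual), 2)
print(eta2_partial)
```

Under that assumption the interaction comes out at about 0.10, which matches report_table() rather than the 0.41 given by report().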

mattansb commented 1 month ago

Ah, I see.

@IndrajeetPatil @rempsyc wasn't this recycling issue resolved in #198?

rempsyc commented 1 month ago

The effect sizes are misaligned (probably because it is NA for the intercept instead of an empty string). Reprex:

packageVersion("report")
#> [1] '0.5.8.5'

some_linear_model <- lm(mpg ~ as.factor(cyl)*as.factor(am), data=mtcars) 
some_anova <- car::Anova(some_linear_model, type = "III")

report::report(some_anova)
#> Type 3 ANOVAs only give sensible and informative results when covariates
#>   are mean-centered and factors are coded with orthogonal contrasts (such
#>   as those produced by `contr.sum`, `contr.poly`, or `contr.helmert`, but
#>   *not* by the default `contr.treatment`).
#> The ANOVA suggests that:
#> 
#>   - The main effect of (Intercept) is statistically significant and large (F(1,
#> 26) = 171.10, p < .001; Eta2 (partial) = 0.41, 95% CI [0.15, 1.00])
#>   - The main effect of as.factor(cyl) is statistically significant and large
#> (F(2, 26) = 9.12, p < .001; Eta2 (partial) = 0.20, 95% CI [0.02, 1.00])
#>   - The main effect of as.factor(am) is statistically significant and medium
#> (F(1, 26) = 6.35, p = 0.018; Eta2 (partial) = 0.10, 95% CI [0.00, 1.00])
#>   - The interaction between as.factor(cyl) and as.factor(am) is statistically not
#> significant and large (F(2, 26) = 1.38, p = 0.269; Eta2 (partial) = 0.41, 95%
#> CI [0.15, 1.00])
#> 
#> Effect sizes were labelled following Field's (2013) recommendations.
report::report_table(some_anova)
#> Type 3 ANOVAs only give sensible and informative results when covariates
#>   are mean-centered and factors are coded with orthogonal contrasts (such
#>   as those produced by `contr.sum`, `contr.poly`, or `contr.helmert`, but
#>   *not* by the default `contr.treatment`).
#> Parameter                    | Sum_Squares | df | Mean_Square |      F |      p | Eta2 (partial) | Eta2_partial 95% CI
#> ----------------------------------------------------------------------------------------------------------------------
#> (Intercept)                  |     1573.23 |  1 |     1573.23 | 171.10 | < .001 |                |                    
#> as.factor(cyl)               |      167.71 |  2 |       83.85 |   9.12 | < .001 |           0.41 |        [0.15, 1.00]
#> as.factor(am)                |       58.43 |  1 |       58.43 |   6.35 | 0.018  |           0.20 |        [0.02, 1.00]
#> as.factor(cyl):as.factor(am) |       25.44 |  2 |       12.72 |   1.38 | 0.269  |           0.10 |        [0.00, 1.00]
#> Residuals                    |      239.06 | 26 |        9.19 |        |        |                |

Created on 2024-07-10 with reprex v2.1.1

So yes, it seems #198 wasn't properly fixed, since we see the same issue with the old example:

packageVersion("report")
#> [1] '0.5.8.5'

m <- lm(mpg ~ factor(am) * factor(cyl), mtcars)
a <- car::Anova(m, type = 3)

report::report(a)
#> Type 3 ANOVAs only give sensible and informative results when covariates
#>   are mean-centered and factors are coded with orthogonal contrasts (such
#>   as those produced by `contr.sum`, `contr.poly`, or `contr.helmert`, but
#>   *not* by the default `contr.treatment`).
#> The ANOVA suggests that:
#> 
#>   - The main effect of (Intercept) is statistically significant and large (F(1,
#> 26) = 171.10, p < .001; Eta2 (partial) = 0.20, 95% CI [0.02, 1.00])
#>   - The main effect of factor(am) is statistically significant and large (F(1,
#> 26) = 6.35, p = 0.018; Eta2 (partial) = 0.41, 95% CI [0.15, 1.00])
#>   - The main effect of factor(cyl) is statistically significant and medium (F(2,
#> 26) = 9.12, p < .001; Eta2 (partial) = 0.10, 95% CI [0.00, 1.00])
#>   - The interaction between factor(am) and factor(cyl) is statistically not
#> significant and large (F(2, 26) = 1.38, p = 0.269; Eta2 (partial) = 0.20, 95%
#> CI [0.02, 1.00])
#> 
#> Effect sizes were labelled following Field's (2013) recommendations.
report::report_table(a)
#> Type 3 ANOVAs only give sensible and informative results when covariates
#>   are mean-centered and factors are coded with orthogonal contrasts (such
#>   as those produced by `contr.sum`, `contr.poly`, or `contr.helmert`, but
#>   *not* by the default `contr.treatment`).
#> Parameter              | Sum_Squares | df | Mean_Square |      F |      p | Eta2 (partial) | Eta2_partial 95% CI
#> ----------------------------------------------------------------------------------------------------------------
#> (Intercept)            |     1573.23 |  1 |     1573.23 | 171.10 | < .001 |                |                    
#> factor(am)             |       58.43 |  1 |       58.43 |   6.35 | 0.018  |           0.20 |        [0.02, 1.00]
#> factor(cyl)            |      167.71 |  2 |       83.85 |   9.12 | < .001 |           0.41 |        [0.15, 1.00]
#> factor(am):factor(cyl) |       25.44 |  2 |       12.72 |   1.38 | 0.269  |           0.10 |        [0.00, 1.00]
#> Residuals              |      239.06 | 26 |        9.19 |        |        |                |

Created on 2024-07-10 with reprex v2.1.1
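
One mechanism that would reproduce exactly this shift is sketched below. It is a hypothetical illustration in base R, not the actual code inside report: if the NA for the intercept is dropped before the text is built, three effect sizes are pasted against four terms and R silently recycles them.

```r
# Hypothetical illustration; not the actual implementation in report.
terms <- c("(Intercept)", "as.factor(cyl)", "as.factor(am)",
           "as.factor(cyl):as.factor(am)")
eta2 <- c(NA, 0.41, 0.20, 0.10)  # as in report_table(): none for the intercept

# Dropping the NA leaves 3 values for 4 terms; paste0() recycles them,
# so each effect size moves up one row and the first value is reused at the end.
misaligned <- paste0(terms, ": Eta2 = ", sprintf("%.2f", eta2[!is.na(eta2)]))
print(misaligned)

# Keeping the vector at full length and skipping the text only where the
# value is NA keeps every effect size next to its own term.
aligned <- ifelse(is.na(eta2), terms,
                  paste0(terms, ": Eta2 = ", sprintf("%.2f", eta2)))
print(aligned)
```

The misaligned vector pairs 0.41 with the intercept and again with the interaction, the same pattern as in the first report() output above. Applying the same recycling to the second example gives 0.20, 0.41, 0.10, 0.20, which also matches.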