ibecav / CGPfunctions

Powell Miscellaneous Functions for Teaching and Learning Statistics
Other
27 stars 11 forks

Deprecation of sjstats::anova_stats() #42

Open strengejacke opened 3 years ago

strengejacke commented 3 years ago

Hi, I maintain the sjstats package, whose anova_stats() function you use in your package. I'm writing because, in the long run, this function is going to be deprecated first and then removed at some point in the future.

The reason is that we have started a new project, easystats, where we build new packages from scratch that are focused on particular tasks. In the course of this development, we also refactored some existing packages and re-implemented functions in the easystats ecosystem. The information retrieved by anova_stats() is now available in packages like effectsize or parameters, and these functions are more robust, reliable and consistent.

Thus, could you please update your package and replace anova_stats()? It looks like parameters::model_parameters() gives you all the information you need, including effect sizes for ANOVA tables. The only catch is that parameters::model_parameters() returns different column names than anova_stats(), so you would have to adapt your code to the new names.

library(CGPfunctions)
library(parameters)
library(sjstats)

mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am)
mod <- aov(hp ~ cyl * am, data = mtcars)
a <- aovtype2(mod)

anova_stats(a)
#> term      |     sumsq |    meansq | df | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power
#> -----------------------------------------------------------------------------------------------------------------------------------------------
#> cyl       | 1.027e+05 | 51364.469 |  2 |    60.164 |  < .001 | 0.711 |         0.822 |   0.695 |           0.787 |     0.699 |    2.151 | 1.000
#> am        |  7317.893 |  7317.893 |  1 |     8.572 |   0.007 | 0.051 |         0.248 |   0.044 |           0.191 |     0.045 |    0.574 | 0.833
#> cyl:am    | 12181.313 |  6090.656 |  2 |     7.134 |   0.003 | 0.084 |         0.354 |   0.072 |           0.277 |     0.073 |    0.741 | 0.930
#> Residuals | 22197.125 |   853.736 | 26 |           |         |       |               |         |                 |           |          |

model_parameters(a, eta_squared = "partial", ci = .9)
#> Parameter | Sum_Squares | df | Mean_Square |     F |      p | Eta2 (partial) |  Eta2 90% CI
#> -------------------------------------------------------------------------------------------
#> cyl       |    1.03e+05 |  2 |    51364.47 | 60.16 | < .001 |           0.82 | [0.70, 0.88]
#> am        |     7317.89 |  1 |     7317.89 |  8.57 | 0.007  |           0.25 | [0.05, 0.46]
#> cyl:am    |    12181.31 |  2 |     6090.66 |  7.13 | 0.003  |           0.35 | [0.10, 0.54]
#> Residuals |    22197.13 | 26 |      853.74 |       |        |                |

model_parameters(a, eta_squared = "raw", ci = .9)
#> Parameter | Sum_Squares | df | Mean_Square |     F |      p | Eta2 |  Eta2 90% CI
#> ---------------------------------------------------------------------------------
#> cyl       |    1.03e+05 |  2 |    51364.47 | 60.16 | < .001 | 0.71 | [0.53, 0.80]
#> am        |     7317.89 |  1 |     7317.89 |  8.57 | 0.007  | 0.05 | [0.00, 0.24]
#> cyl:am    |    12181.31 |  2 |     6090.66 |  7.13 | 0.003  | 0.08 | [0.00, 0.25]
#> Residuals |    22197.13 | 26 |      853.74 |       |        |      |

Created on 2021-01-08 by the reprex package (v0.3.0)
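The column-name difference mentioned above could be bridged with a small renaming step. The following is only a sketch in base R: `rename_to_legacy()` and `legacy_names` are hypothetical names, and the mapping is an assumption read off the printed headers in the output above. The actual column names of the returned data frame may differ from what is printed (for example `Eta2_partial` is assumed here for the "Eta2 (partial)" header), so check `names()` on a real result before relying on it.

```r
# Hypothetical mapping from model_parameters()-style column names to the
# names anova_stats() used. Assumed from the printed tables, not from the
# package sources; verify with names() on an actual result.
legacy_names <- c(
  Parameter    = "term",
  Sum_Squares  = "sumsq",
  Mean_Square  = "meansq",
  F            = "statistic",
  p            = "p.value",
  Eta2         = "etasq",
  Eta2_partial = "partial.etasq",
  Power        = "power"
)

# Rename only the columns that appear in the mapping; leave the rest alone.
rename_to_legacy <- function(x, mapping = legacy_names) {
  x <- as.data.frame(x)  # work on a plain data frame
  hits <- names(x) %in% names(mapping)
  names(x)[hits] <- unname(mapping[names(x)[hits]])
  x
}

# Stand-in data frame with values rounded from the "am" and "cyl:am" rows
# above, used so the sketch runs without any package installed.
demo <- data.frame(
  Parameter    = c("am", "cyl:am"),
  Sum_Squares  = c(7317.89, 12181.31),
  df           = c(1, 2),
  Mean_Square  = c(7317.89, 6090.66),
  F            = c(8.57, 7.13),
  p            = c(0.007, 0.003),
  Eta2_partial = c(0.25, 0.35)
)

names(rename_to_legacy(demo))
```

With a real model, the same helper would be applied to the result of `model_parameters()` in place of `demo`. Columns without a counterpart in the mapping, such as `df` or the confidence-interval columns, are passed through unchanged.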

There is no pressure at the moment, as deprecating and removing anova_stats() will take several weeks from now. I just wanted to point out this change in good time.

Daniel

strengejacke commented 3 years ago

If model_parameters() turns out not to be a full replacement for anova_stats() and some information is missing, let me know.
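One column in the anova_stats() output above that does not appear in the model_parameters() tables shown is cohens.f. As a possible stopgap, Cohen's f can be derived from partial eta squared with the standard relation f = sqrt(eta2p / (1 - eta2p)). The helper name `cohens_f_from_eta2p()` below is hypothetical, and this is a sketch rather than how either package computes the value.

```r
# Hypothetical helper: Cohen's f from partial eta squared,
# using f = sqrt(eta2p / (1 - eta2p)).
cohens_f_from_eta2p <- function(eta2p) {
  stopifnot(is.numeric(eta2p), all(eta2p >= 0 & eta2p < 1))
  sqrt(eta2p / (1 - eta2p))
}

# partial.etasq values copied from the anova_stats() table above
# (rows cyl, am, cyl:am).
eta2p <- c(cyl = 0.822, am = 0.248, `cyl:am` = 0.354)
round(cohens_f_from_eta2p(eta2p), 3)
```

Because the inputs are already rounded to three decimals, the results should agree with the cohens.f column (2.151, 0.574, 0.741) only up to small rounding differences.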

ibecav commented 3 years ago

Thanks Daniel, understood. Thank you for being proactive! I can work with this. Just to make sure I understand: you plan on deprecating in a few weeks? How long until removal? I can manage the work; it's just all about timing.

strengejacke commented 3 years ago

Well, I'm submitting an update of sjstats to CRAN today (without any changes that affect your package). The next update, with the deprecation, would probably be in about 2-3 months, so there is no pressure. However, anova_stats() does not work properly for repeated-measures ANOVA, while the implementation in effectsize / parameters does. I don't want to recode everything from scratch to make anova_stats() work correctly for those "edge cases", so I want to "navigate" users to effectsize and deprecate anova_stats().

strengejacke commented 3 years ago

If you plan to address this issue later, deprecation and removal in sjstats can also wait longer.

ibecav commented 3 years ago

No problem at all. I just needed to know if I had days or weeks to make the adjustments. Your plan is more than reasonable and really not a lot of work for me, so it's just a matter of timing.

strengejacke commented 3 years ago

Btw, the next version of parameters (to be released by the end of January / beginning of February) will also be able to add a Power column to the output, just in case you need that information as well.

library(sjstats)
library(parameters)
data(efc)

# fit linear model
fit <- aov(
  c12hour ~ as.factor(e42dep) + as.factor(c172code) + c160age,
  data = efc
)

anova_stats(car::Anova(fit, type = 2))
#> term                |     sumsq |    meansq |  df | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power
#> ----------------------------------------------------------------------------------------------------------------------------------------------------------
#> as.factor(e42dep)   | 4.265e+05 | 1.422e+05 |   3 |    80.299 |  < .001 | 0.212 |         0.224 |   0.209 |           0.221 |     0.209 |    0.537 | 1.000
#> as.factor(c172code) |  7352.049 |  3676.025 |   2 |     2.076 |   0.126 | 0.004 |         0.005 |   0.002 |           0.003 |     0.002 |    0.071 | 0.429
#> c160age             | 1.052e+05 | 1.052e+05 |   1 |    59.408 |  < .001 | 0.052 |         0.066 |   0.051 |           0.065 |     0.051 |    0.267 | 1.000
#> Residuals           | 1.476e+06 |  1770.307 | 834 |           |         |       |               |         |                 |           |          |

model_parameters(fit, type = 2, power = TRUE)
#> Parameter           | Sum_Squares |  df | Mean_Square |     F |      p |  Power
#> -------------------------------------------------------------------------------
#> as.factor(e42dep)   |    4.26e+05 |   3 |    1.42e+05 | 80.30 | < .001 | 100.0%
#> as.factor(c172code) |     7352.05 |   2 |     3676.02 |  2.08 | 0.126  |  42.9%
#> c160age             |    1.05e+05 |   1 |    1.05e+05 | 59.41 | < .001 | 100.0%
#> Residuals           |    1.48e+06 | 834 |     1770.31 |       |        |

Created on 2021-01-08 by the reprex package (v0.3.0)