Closed Larissa-Cury closed 1 year ago
As is documented,
This function uses calls to regrid to put the estimated marginal means (EMMs) on the log scale. Then an extra element is added to this grid for the log of sigma and its standard error (where we assume that sigma is uncorrelated with the log EMMs). Then a call to contrast subtracts log{sigma} from each of the log EMMs, yielding values of log(EMM/sigma). Finally, the results are re-gridded back to the original scale and the desired contrasts are computed using method. In the log-scaling part, we actually rescale the absolute values and keep track of the signs.
Note that these results are pairwise comparisons of the form (EMM[i] - EMM[j]) / sigma -- that is, Cohen's d. See also this vignette section
Regarding the side question, he first example in the help page shows how to use it after a call to pairs()
:
fiber.lm <- lm(strength ~ diameter + machine, data = fiber)
emm <- emmeans(fiber.lm, "machine")
eff_size(emm, sigma = sigma(fiber.lm), edf = df.residual(fiber.lm))
# or equivalently:
eff_size(pairs(emm), sigma(fiber.lm), df.residual(fiber.lm), method = "identity")
Personally, I am not a fan of effect sizes, in part because they are ambiguously defined (or perhaps not defined at all) for anything other than a simple one-treatment model with no random effects. In fact, I saw a blog entry once by Lance Waller at Emory that suggests that's exactly what he recommends:
simp <- lm(CONT_Y ~ MY_GROUP * YEAR), data = dfModels)
eff_size(simp, sigma = sigma(simp), edf = df.residual(simpEMM, by = "MY_GROUP"))
eff_size(simp, sigma = sigma(simp), edf = df.residual(simpEMM, by = "YEAR"))
I'm quite sure you have both edf
and sigma
wrong if you insist on using the mixed model. The edf
should be no larger than the smallest d.f. from pairs()
using the K-R degrees of freedom. For sigma
, you should probably use the total SD, which combines the error and subject SDs. Look at VarCorr(mod1)
and then compute sigma = sqrt((error SD)^2 + (subject SD)^2)
VarCorr(mod1)
and then computesigma = sqrt((error SD)^2 + (subject SD)^2)
Thank you very much for your kind reply and suggestions, it really helps me since this is my first model to be actually published. All right, I got the sigma
from VarCorr(mod1)
, but I'm not so sure I got "The edf should be no larger than the smallest d.f. from pairs() using the K-R degrees of freedom". This is my output:
> emmeans(mod1,~MY_GROUP|YEAR) %>% pairs(adjust="Tukey")
YEAR = 2020:
contrast estimate SE df t.ratio p.value
G1 - G2 -0.915 0.53 60 -1.726 0.0894
YEAR = 2021:
contrast estimate SE df t.ratio p.value
G1 - G2 -0.312 0.53 60 -0.590 0.5577
Degrees-of-freedom method: kenward-roger
> emmeans(mod1,~YEAR|LMY_GROUP) %>% pairs(adjust="Tukey")
MY_GROUP = G1:
contrast estimate SE df t.ratio p.value
YEAR2020 - YEAR2021 -1.144 0.53 60 -2.159 0.0349
MY_GROUP = G2:
contrast estimate SE df t.ratio p.value
YEAR2020 - YEAR2021 -0.541 0.53 60 -1.022 0.3110
Degrees-of-freedom method: kenward-roger
The df equals 60. Is there a way I can get the smallest df to pass it to edf
, then?
(I have to insist on the lmer because data comes from the same students, hence I need (1|Participant)
If all the df are 60, then use 60 for the edf.
The edf should be no larger than the smallest d.f. from pairs() using the K-R degrees of freedom
I guess that now I got it! Thank you again! So by "The edf should be no larger than the smallest d.f. from pairs() using the K-R degrees of freedom" you meant that we use the smaller d.f. in the pairwise comparisons, right? Yes, in my case, all dfs are the same, but If I had a smaller one, then I should go with the smaller, right?
### the model is:
mod1 <- lmer(CONT_Y ~ MY_GROUP * YEAR + (1|ID), data = dfModels)
### estimate eemmeans:
group <- emmeans(mod1,~ MY_GROUP|YEAR)
year <- emmeans(mod1,~ YEAR|MY_GROUP)
### pairwise comparisons:
group_p <- emmeans(mod1,~ MY_GROUP|YEAR) %>% pairs(adjust="Tukey")
year_p <- emmeans(mod1,~ YEAR|MY_GROUP) %>% pairs(adjust="Tukey")
### correctiong sigma and edf
sigmaValues <- VarCorr(mod1)
sigmaValues
sigma <- sqrt((0.25743)^2 + (0.15054)^2)
### calculate Cohen's d:
# eff1 <- eff_size(emm1, sigma = sigma(mod1), edf = df.residual(mod1)) ### before:
group_p ### check the lowest df
eff1 <- eff_size(group, sigma = sigma, edf = 60) ## and for group
year_p ### check the lowest df
eff2 <- eff_size(year, sigma = sigma, edf = 60) ## and for year
Looking in the documentation, I didn't see how to adapt:
simp <- lm(CONT_Y ~ MY_GROUP * YEAR), data = dfModels)
eff_size(pairs(emm), sigma(fiber.lm), df.residual(fiber.lm), method = "identity")
to a lmer
so tha'ts why I've performed all steps above. Is there also a straightforward path with pairs
to a mixed model?
I think this is resolved, (or agree on how it isn't resolved), so am closing.
I'm wondering what is
eff_size()
actually computing when I use it to calculate effect sizes of my lmer:The documentation says "Most popular is probably Cohen’s d, which is defined as the observed difference, divided by the population SD; and obviously Cohen effect sizes are close cousins of pairwise differences. They are available via the eff_size() function, where the user must specify the emmGrid object with the means to be compared". Ok, from this I get that it's Cohen's D. However, it's not mentioned on the function's doc
I've also seen elsewhere that, using the syntax above, it would be actually calculating partial eta. So this issue is for some clarification on this topic, I'd like to thank in advance. I'm wondering if it will always return Cohen's d or if, given a different syntax such as mine, it would calculate another estimate.
side question: why won't
eff_size()
work on the output ofemmeans(mod1,~YEAR|MY_GROUP) %>% pairs(adjust="Tukey")
?