rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/

Data imputation (MICE) and joint_tests() - does it use D1, D2 or D3? #416

Closed: Generalized closed this issue 1 year ago

Generalized commented 1 year ago

Dear Professor,

I'm planning an analysis of multiply imputed data. It will test several contrasts of LS-means from a GLM (or GEE). I am also asked to assess the overall impact of the categorical covariates, similar to a Type-3 analysis. In the presence of interactions, pre-specified simple effects will be explored.

But it's not that easy to run a type-2 or type-3 analysis on imputed data in R. I either have to fit nested models, refit them on each imputed dataset, and pool via the D1 (Wald) or D3 (LRT) statistic, or run the entire car::Anova() on each imputed dataset and manually pool the F or chi-square test statistics via the D2 method.

Then I remembered that emmeans now has some support for multiple imputation. Pooling is not that difficult for contrasts, as their estimates are approximately normally distributed (which is sufficient for Rubin's rules), but what about joint_tests()?

How exactly does emmeans work with MICE? Does it run on each imputed dataset and then pool the results? Or should I pool them on my own? What about the chi-square or F statistic from joint_tests()?

I will look at the sources, but before I spend a long time reverse-engineering them, I thought I could ask the most competent person - the Author :)

Briefly: 1) I want an "anova-like" type-3 analysis via joint_tests(), and I need to describe how it works and what has to be done to obtain pooled results. 2) I want to pool the contrast analysis, and I am asking whether emmeans will pool it for me or whether I should do it on my own.

rvlenth commented 1 year ago

I don't know what you mean by D1, D2, D3. But joint_tests() does not have anything to do with nested models or any kind of model-reduction tests. It just constructs appropriate linear functions of the regression coefficients and tests them jointly using a Wald test, based on the estimates in object@bhat and the covariance matrix in object@V. You can see which linear functions were constructed via attr(jt, "est.fcns"), where jt is the object returned by joint_tests().
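
For readers who want to see the core of such a joint Wald test, here is a minimal base-R sketch. It is not the code inside joint_tests(): the function name `wald_joint_test`, the matrix `L`, and all numbers are made up for illustration, and the sketch assumes `L` has full row rank (joint_tests() itself deals with rank deficiencies and degrees of freedom).

```r
# Joint Wald test of H0: L %*% beta = 0, given coefficient estimates `bhat`
# and their covariance matrix `V`. Hypothetical helper, not part of emmeans.
wald_joint_test <- function(bhat, V, L, df2 = Inf) {
  est   <- L %*% bhat                       # estimated linear functions
  q     <- nrow(L)                          # numerator df (L assumed full row rank)
  chisq <- drop(t(est) %*% solve(L %*% V %*% t(L), est))  # Wald statistic
  Fratio <- chisq / q
  p <- if (is.finite(df2)) {
    pf(Fratio, q, df2, lower.tail = FALSE)  # F reference distribution
  } else {
    pchisq(chisq, q, lower.tail = FALSE)    # asymptotic chi-square
  }
  list(F.ratio = Fratio, df1 = q, df2 = df2, chisq = chisq, p.value = p)
}

# Illustrative inputs: test that the 2nd and 3rd coefficients are both zero
bhat <- c(10, 2, -1)
V    <- diag(c(1, 0.5, 0.5))
L    <- rbind(c(0, 1, 0),
              c(0, 0, 1))
res <- wald_joint_test(bhat, V, L)
print(res)
```

With a finite `df2` the statistic is referred to an F distribution; with `df2 = Inf` it reduces to the usual chi-square Wald test.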

According to my comments in the code for emmeans:::emm_basis.mira, the bhat slot is the average of the analyses' coefficients, and the V slot is a pooled covariance matrix, based on Rubin's rules. You can look at that code for details.
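
As a rough illustration of what averaging the coefficients and pooling the covariance matrix by Rubin's rules means, here is a base-R sketch using the textbook formulas. The helper `pool_rubin` and the inputs are hypothetical; whether emmeans:::emm_basis.mira matches this in every detail is an assumption, so check that code for the authoritative version.

```r
# Pool m sets of coefficient estimates and covariance matrices with
# Rubin's rules (textbook form). Hypothetical helper, not emmeans code.
pool_rubin <- function(coefs, vcovs) {
  m     <- length(coefs)
  B_mat <- do.call(cbind, coefs)     # p x m matrix, one column per imputation
  bhat  <- rowMeans(B_mat)           # pooled estimate: average of the coefficients
  W     <- Reduce(`+`, vcovs) / m    # within-imputation covariance (average)
  B     <- cov(t(B_mat))             # between-imputation covariance
  V     <- W + (1 + 1 / m) * B       # total covariance
  list(bhat = bhat, V = V)
}

# Made-up results from m = 3 imputed-data analyses with p = 2 coefficients
coefs <- list(c(1, 2), c(2, 2), c(3, 5))
vcovs <- list(diag(0.4, 2), diag(0.4, 2), diag(0.4, 2))
pooled <- pool_rubin(coefs, vcovs)
print(pooled$bhat)
print(pooled$V)
```

The pooled `bhat` and `V` then play the same roles that the coefficients and covariance matrix of a single fitted model would play in a Wald test.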

I think the help page for joint_tests() explains it fairly clearly.
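
Putting this together, a workflow along the following lines should give pooled joint tests and contrasts without manual pooling. This is a sketch under assumptions: it requires the mice and emmeans packages, uses the nhanes2 example data shipped with mice, and the model formula, `m`, and `seed` are arbitrary choices for illustration.

```r
# Assumes the mice and emmeans packages are installed; skipped otherwise.
if (requireNamespace("mice", quietly = TRUE) &&
    requireNamespace("emmeans", quietly = TRUE)) {
  # Illustrative imputation of the nhanes2 example data from mice
  imp <- mice::mice(mice::nhanes2, m = 5, seed = 123, printFlag = FALSE)

  # One lm fit per imputed dataset; the result is a "mira" object
  fit <- with(imp, lm(bmi ~ age + hyp))

  # Joint (type-3-like) Wald tests based on the pooled bhat and V
  jt <- emmeans::joint_tests(fit)
  print(jt)

  # Pooled marginal means and pairwise contrasts
  emm <- emmeans::emmeans(fit, ~ age)
  print(emmeans::contrast(emm, "pairwise"))
}
```

Because the pooling happens when the reference grid is built, the same `fit` object can be passed to emmeans(), contrast(), and joint_tests() alike.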