amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0
431 stars 107 forks source link

Question: Manually pooling parameters from RE model/Degrees of freedom #483

Open ajw11 opened 2 years ago

ajw11 commented 2 years ago

Does the automatic pool() function count a random effect (e.g., a random intercept) as a parameter in calculating the degrees of freedom for a mixed model? I ask because I would like to manually pool using pool.scalar but I'm not sure if this should be counted under the "k" parameters. Also, if using pool.scalar for a multiple regression problem, you should include "k=# of total parameters in model" in calculating each pooled coefficient, correct? E.g.,

For a model with 5 parameters (4 covariates plus intercept)

Pooling 1st coefficient estimate

pool.scalar(Q, U, n = 50, k = 5)

Pooling 2nd coefficient estimate

pool.scalar(Q, U, n = 50, k = 5)

etc...

hanneoberman commented 2 years ago

This doesn't answer your question, but have you tried using the mitml package to pool the coefficients of your random effects model? See e.g. https://stefvanbuuren.name/fimd/sec-mlguidelines.html.

A workflow where imp is your mids object, model is the random effects model, and fit is the resulting mira object might look like:

library(mice)
library(lme4)
library(mitml)
fit <- with(imp, lmer(model))
testEstimates(as.mitml.result(fit), extra.pars = TRUE)
ajw11 commented 2 years ago

Thank you for the reference. I’m pooling marginalized coefficients from glmmadaptive package, so I need to use pool.scalar, else that workflow would be fine. Whether the RE should still be counted as parameter after marginalization is another issue I need to look into, but I thought knowing what the standard workflow does would be good.

stefvanbuuren commented 2 years ago

Note: You will need to install the broom.mixed package to pool results from multilevel analyses.

martinschlund commented 2 years ago

Hi! I have a very related problem. I am fitting a model with the package lcmm that does not fit into the standard MICE workflow partly because it does not accept the with()-approach and partly because there is no tidy()-method for models of class lcmm. So, I will have to pool the results manually, which I try to do with the following workflow. Please notice that I use the tdc data because it replicates my dataset well and that I am aware that a raw call to mice in the tdc data will result imprecise imputations. For this I am only interested in the method for pooling the results:

library(mice)
library(lcmm)
library(tidyverse)

data("tbc")

#this is an example of the model I want to fit

m2 <- lcmm(bmi.z ~ age + sex, random = ~ age, subject = "id", data = tbc, link = "splines")  

imp <- mice(tbc, maxit = 35)

com <- complete(imp, "long")

#fits separate models to each imputed dataset, stores in list:

modellist <- list()

for(i in 1:length(unique(com$.imp))){
  data = filter(com, .imp == i)
  modellist[[i]] = lcmm(bmi.z ~ age + sex, random = ~ age, subject = "id", data = data, link = "splines")
}

#defines tidy function for lcmm:

tidy.lcmmm <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {

  result <- summary(x) %>%
    tibble::as_tibble(rownames = "term") %>%
    dplyr::rename(estimate = coef,
                  std.error = Se,
                  statistic = Wald,
                  p.value = `p-value`)

  if (conf.int) {
    ci <- confint(x, level = conf.level)
    result <- dplyr::left_join(result, ci, by = "term")
  }

  result
}

#tidies results: 

df <- tibble(n = 1:5) %>% 
  mutate(models = map(modellist, ~tidy.lcmmm(.x)), 
         models = map(models, ~as_tibble(.x)))

df <- df %>% 
  unnest(cols = c(models))

#extracts and pools results for age: 

estimates_age <- filter(df, term == "age") %>% 
  select(estimate, std.error)%>% 
  mutate(var = std.error^2)

pooled.results <- pool.scalar(Q = estimates_age$estimate, U = estimates_age$var, n = 306, k = 2)

I wonder if this seems like the correct way to use pool.scalar, and specifically, I wonder about k, as the model really seems to be fitting many more parameters "behind the scenes".

> estimates(m2)
        age         sex  cholesky 1  cholesky 2  cholesky 3  I-splines1  I-splines2  I-splines3  I-splines4  I-splines5  I-splines6 
 0.01815061 -0.01449388 -1.03820628  0.03098324  0.08742982 -4.89337146  0.61847877 -0.61525249  2.08221454  1.65755067  1.70915255 
 I-splines7 
 0.54300731 

And finally, thank you for a great package!

hanneoberman commented 11 months ago

@martinschlund the new mice function pool.table() might be what you need!