philchalmers / mirt

Multidimensional item response theory
https://philchalmers.github.io/mirt/

multipleGroup and missing data #216

Closed acircleda closed 2 years ago

acircleda commented 2 years ago

Hi Phil. I love your package and am using it for my dissertation.

How does multipleGroup handle missing data? Specifically, if a group was never administered an item, I would expect multipleGroup to drop that item from that group's analysis. However, when I run the code below, group D still gets item parameters for items whose responses are all NA. For context, I am re-estimating PISA data, and some items were not administered in some countries. I am also doing this across cycles (my groups are cycle-by-country), so some items are missing for entire groups: only trend items persist, while unique items change from cycle to cycle.

library(mirt)
library(tidyverse)

# simulate data ----
set.seed(1234)
N <- 1000

# create covariates ----
X1 <- rnorm(N); X2 <- rnorm(N)
Theta <- matrix(0.5 * X1 + -1 * X2 + rnorm(N, sd = 0.5))

# create items and response data ----
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, N, itemtype = '2PL', Theta=Theta)

# create groups ----
groups <- c(rep("A", 250),rep("B", 250),rep("C", 250),rep("D", 250))

# generate missing values in item data ----
# group A - up to 50 missing values (cells are sampled with replacement)
dat_a <- dat %>% as.data.frame() %>%
  slice(1:250)

for (i in 1:50) {
  dat_a[sample(nrow(dat_a),1),sample(ncol(dat_a),1)] <- NA
}

# group B - complete cases
dat_b <- dat %>% as.data.frame() %>%
  slice(251:500)

# group C - up to 100 missing values (cells are sampled with replacement)
dat_c <- dat %>% as.data.frame() %>%
  slice(501:750)

for (i in 1:100) {
  dat_c[sample(nrow(dat_c),1),sample(ncol(dat_c),1)] <- NA
}

# group D - 4 items (5, 10, 15, 20) entirely missing
dat_d <- dat %>% as.data.frame() %>%
  slice(751:1000) %>%
  mutate_at(vars(5,10,15,20), ~NA)

# combine data

dat2 <- rbind(dat_a, dat_b, dat_c, dat_d)

# model with missing item data
mod0 <- multipleGroup(dat2, 1, group = groups, itemtype = '2PL',
                      invariance = c("slopes", "intercepts", "free_mean", "free_var"))
summary(mod0)
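
To confirm which items were never administered in a given group before fitting, a quick base-R check along these lines may help. This is only a sketch on hypothetical toy data (`resp`, `grp`, and `all_missing` are illustrative names, not part of mirt), and it only identifies all-NA items per group; it says nothing about how multipleGroup treats them.

```r
# Hypothetical toy data: 3 items, 2 groups; group "D" never saw Item_2.
resp <- data.frame(
  Item_1 = c(1, 0, NA, 1, 0, 1),
  Item_2 = c(0, 1, 1, NA, NA, NA),
  Item_3 = c(1, 1, 0, 0, 1, NA)
)
grp <- c("A", "A", "A", "D", "D", "D")

# For each group, flag items that have no observed responses at all.
all_missing <- sapply(split(resp, grp), function(g) colSums(!is.na(g)) == 0)

# Rows are items, columns are groups; TRUE marks an item with only NAs.
print(all_missing)
```

With the simulated data above, the same call would be `sapply(split(dat2, groups), function(g) colSums(!is.na(g)) == 0)`.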

philchalmers commented 2 years ago

Answered on the mirt-package forum (definitely not an issue either).