philchalmers / mirt

Multidimensional item response theory
https://philchalmers.github.io/mirt/

fscores(): subscript out of bounds #249

Closed Deleetdk closed 6 months ago

Deleetdk commented 6 months ago

I ran into this odd bug. I have a model with 32 ordinal items, fitted with itemtype = graded. After doing DIF testing, I get this error when trying to score cases with the models fitted using anchor items:

(code is part of a longer function)

  #fit together without DIF
  if (messages) message("\nStep 4: Fit without DIF items, liberal threshold")
  mirt_fit_noDIF_liberal = rlang::exec(mirt::mirt, model = mirt::mirt.model(model_noDIF_liberal_Q), !!!mirt_args_set2)
  # mirt_fit_noDIF_liberal = mirt::mirt(items, model = mirt::mirt.model(model_noDIF_liberal_Q), method = method, technical = technical, verbose = messages, itemtype = itemtype)
  if (messages) message("\nStep 5: Fit without DIF items, conservative threshold")
  mirt_fit_noDIF_conservative = rlang::exec(mirt::mirt, model = mirt::mirt.model(model_noDIF_conservative_Q), !!!mirt_args_set2)
  # mirt_fit_noDIF_conservative = mirt::mirt(items, model = mirt::mirt.model(model_noDIF_conservative_Q), method = method, technical = technical, verbose = messages, itemtype = itemtype)

  #with anchors
  if (messages) message("\nStep 6: Fit with anchor items, liberal threshold")
  mirt_fit_anchors_liberal = rlang::exec(mirt::multipleGroup, !!!mirt_args, group = group, invariance = c(items_noDIF_liberal %>% names(), 'free_means', 'free_var'))
  # mirt_fit_anchors_liberal = mirt::multipleGroup(items, model = model, group = group, invariance = c(items_noDIF_liberal %>% names(), 'free_means', 'free_var'), method = method, technical = technical, verbose = messages, itemtype = itemtype)
  if (messages) message("\nStep 7: Fit with anchor items, conservative threshold")
  mirt_fit_anchors_conservative = rlang::exec(mirt::multipleGroup, !!!mirt_args, group = group, invariance = c(items_noDIF_conservative %>% names(), 'free_means', 'free_var'))
  # mirt_fit_anchors_conservative = mirt::multipleGroup(items, model = model, group = group, invariance = c(items_noDIF_conservative %>% names(), 'free_means', 'free_var'), method = method, technical = technical, verbose = messages, itemtype = itemtype)

  #get scores
  if (messages) message("\nStep 8: Get scores")
  browser()
  orig_scores = do.call(what = mirt::fscores, args = c(list(object = mirt_fit), fscores_pars))
  noDIF_scores_liberal = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_noDIF_liberal), fscores_pars))
  noDIF_scores_conservative = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_noDIF_conservative), fscores_pars))
  anchor_scores_liberal = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_anchors_liberal), fscores_pars))
  anchor_scores_conservative = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_anchors_conservative), fscores_pars))

Where:

Browse[1]> fscores(mirt_fit_anchors_liberal)
Error in `[<-`(`*tmp*`, completely_missing, , value = NA) : 
  subscript out of bounds
Browse[1]> fscores(mirt_fit_anchors_conservative)
Error in `[<-`(`*tmp*`, completely_missing, , value = NA) : 
  subscript out of bounds

There's nothing important in fscores_pars and the same error occurs without these additional arguments.

The difference between liberal and conservative is only whether the p-value used for excluding items was corrected for multiple testing or not.

The two models without DIF items and the two with anchor items all converge without problems. fscores() works on the no-DIF fits, but fails on the anchored fits with the error above. I traced the error into fscores.internal() but am not really sure what the deeper cause is.

Interestingly, mirt::empirical_ES() still works on the anchored fits, though I would assume it calls fscores() as well. Looking at the function, I see that it uses the argument leave_missing = T, and fscores() also works on the anchor fits when given this argument. So I think this problem has something to do with a minor error in handling missing data, probably a few rows with completely empty data:

Browse[1]> fscores(mirt_fit_anchors_liberal, leave_missing = F) %>% dim()
Error in `[<-`(`*tmp*`, completely_missing, , value = NA) : 
  subscript out of bounds
Browse[1]> fscores(mirt_fit_anchors_liberal, leave_missing = T) %>% dim()
[1] 5847    1
Browse[1]> fscores(mirt_fit_noDIF_liberal, leave_missing = T) %>% dim()
[1] 5847    1
Browse[1]> fscores(mirt_fit_noDIF_liberal, leave_missing = F) %>% dim()
[1] 5852    1
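
The row counts above (5852 versus 5847) would be consistent with five respondents having no observed responses at all. The following base-R sketch shows how that kind of mismatch can produce this exact error message. It is an illustration under that assumption, not mirt's internal code: `items`, `scored`, and `outcome` are made-up names, and `completely_missing` is borrowed from the error message only.

```r
# Toy response matrix: 6 respondents x 3 items; rows 2 and 5 are entirely NA.
items <- matrix(c(1, NA, 2, 3, NA, 1,
                  2, NA, 1, 3, NA, 2,
                  1, NA, 3, 2, NA, 1), ncol = 3)

# Row positions (in the ORIGINAL data) of respondents with no responses.
completely_missing <- which(rowSums(!is.na(items)) == 0)

# Suppose scores were computed only for the rows that have data,
# so the score matrix is shorter than the original data.
scored <- matrix(0, nrow = nrow(items) - length(completely_missing), ncol = 1)

# Indexing the shorter matrix with positions from the original data fails,
# because position 5 does not exist in a 4-row matrix.
outcome <- tryCatch({
  scored[completely_missing, ] <- NA
  "ok"
}, error = function(e) conditionMessage(e))
print(outcome)

# The same assignment is fine when the target has one row per original respondent.
full <- matrix(0, nrow = nrow(items), ncol = 1)
full[completely_missing, ] <- NA
```

Whether the anchored fits hit a mismatch of this kind is only a guess from the output above; the sketch just shows that the error is what R reports when row positions and matrix size disagree.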

I think you should be able to reproduce this error just using the fitted objects, so I have exported them here.

debug_fits.zip

philchalmers commented 6 months ago

Thanks, but the code/objects you're uploading don't allow me to reproduce the issue as there's too much missing. Could you modify the following until it works as a standalone reprex?

library(mirt)
obj <- readRDS(file.choose())
SimDesign::Attach(obj)

    #fit together without DIF
if (messages) message("\nStep 4: Fit without DIF items, liberal threshold")
mirt_fit_noDIF_liberal = rlang::exec(mirt::mirt, model = mirt::mirt.model(model_noDIF_liberal_Q), !!!mirt_args_set2)
# mirt_fit_noDIF_liberal = mirt::mirt(items, model = mirt::mirt.model(model_noDIF_liberal_Q), method = method, technical = technical, verbose = messages, itemtype = itemtype)
if (messages) message("\nStep 5: Fit without DIF items, conservative threshold")
mirt_fit_noDIF_conservative = rlang::exec(mirt::mirt, model = mirt::mirt.model(model_noDIF_conservative_Q), !!!mirt_args_set2)
# mirt_fit_noDIF_conservative = mirt::mirt(items, model = mirt::mirt.model(model_noDIF_conservative_Q), method = method, technical = technical, verbose = messages, itemtype = itemtype)

#with anchors
if (messages) message("\nStep 6: Fit with anchor items, liberal threshold")
mirt_fit_anchors_liberal = rlang::exec(mirt::multipleGroup, !!!mirt_args, group = group, invariance = c(items_noDIF_liberal %>% names(), 'free_means', 'free_var'))
# mirt_fit_anchors_liberal = mirt::multipleGroup(items, model = model, group = group, invariance = c(items_noDIF_liberal %>% names(), 'free_means', 'free_var'), method = method, technical = technical, verbose = messages, itemtype = itemtype)
if (messages) message("\nStep 7: Fit with anchor items, conservative threshold")
mirt_fit_anchors_conservative = rlang::exec(mirt::multipleGroup, !!!mirt_args, group = group, invariance = c(items_noDIF_conservative %>% names(), 'free_means', 'free_var'))
# mirt_fit_anchors_conservative = mirt::multipleGroup(items, model = model, group = group, invariance = c(items_noDIF_conservative %>% names(), 'free_means', 'free_var'), method = method, technical = technical, verbose = messages, itemtype = itemtype)

#get scores
if (messages) message("\nStep 8: Get scores")
browser()
orig_scores = do.call(what = mirt::fscores, args = c(list(object = mirt_fit), fscores_pars))
noDIF_scores_liberal = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_noDIF_liberal), fscores_pars))
noDIF_scores_conservative = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_noDIF_conservative), fscores_pars))
anchor_scores_liberal = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_anchors_liberal), fscores_pars))
anchor_scores_conservative = do.call(what = mirt::fscores, args = c(list(object = mirt_fit_anchors_conservative), fscores_pars))

Deleetdk commented 6 months ago

One could maybe make a smaller reprex by simulating data, but here I am relying on downloading the file from GitHub.

library(mirt)
#> Loading required package: stats4
#> Loading required package: lattice
library(tidyverse)

download.file("https://github.com/philchalmers/mirt/files/14645153/debug_fits.zip", destfile = "debug_fits.zip")
unzip("debug_fits.zip") # utils::unzip() avoids depending on a system unzip binary
fit_objects = read_rds("debug_fits.rds")

#try scoring
fscores(fit_objects$noDIFliberal) %>% dim()
#> [1] 5852    1
fscores(fit_objects$noDIFcons) %>% dim()
#> [1] 5852    1

#these fail
fscores(fit_objects$anchorliberal) %>% dim()
#> Error in `[<-`(`*tmp*`, completely_missing, , value = NA): subscript out of bounds
fscores(fit_objects$anchorcons) %>% dim()
#> Error in `[<-`(`*tmp*`, completely_missing, , value = NA): subscript out of bounds

#but work with added argument
fscores(fit_objects$anchorliberal, leave_missing = T) %>% dim()
#> [1] 5847    1
fscores(fit_objects$anchorcons, leave_missing = T) %>% dim()
#> [1] 5847    1

#note the differences in rows

Created on 2024-03-19 with reprex v2.0.2
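
If the scores returned with leave_missing = T are used as an interim workaround, they have fewer rows than the original data, so they cannot be column-bound to it directly. Below is a minimal base-R sketch of padding them back to full length. It assumes the dropped rows are exactly the respondents with no observed responses, which the row counts suggest but the thread does not confirm. `pad_scores` is a hypothetical helper, not part of mirt, and the toy `reduced` matrix stands in for real fscores() output.

```r
# Hypothetical helper (not part of mirt): expand a score matrix computed on the
# retained rows back to one row per original respondent, filling with NA.
pad_scores <- function(scores, completely_missing) {
  # completely_missing: logical vector, one entry per original row
  stopifnot(is.logical(completely_missing),
            nrow(scores) == sum(!completely_missing))
  out <- matrix(NA_real_,
                nrow = length(completely_missing),
                ncol = ncol(scores),
                dimnames = list(NULL, colnames(scores)))
  out[!completely_missing, ] <- scores
  out
}

# Toy data: 4 respondents x 2 items; respondent 2 answered nothing.
items <- matrix(c(1, NA, 2, 3,
                  2, NA, 1, 3), ncol = 2)
completely_missing <- rowSums(!is.na(items)) == 0

# Stand-in for scores computed on the 3 respondents that have data.
reduced <- matrix(c(-0.5, 0.1, 0.9), ncol = 1, dimnames = list(NULL, "F1"))

full <- pad_scores(reduced, completely_missing)
print(full)
```

The stopifnot() check inside the helper is deliberate: if the number of scored rows does not match the number of rows with data, the assumption above is wrong for that dataset and the padding should not be trusted.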

philchalmers commented 6 months ago

Thanks, I get the following now:

> #these fail
> fscores(fit_objects$anchorliberal) %>% dim()
[1] 5852    1
> #> Error in `[<-`(`*tmp*`, completely_missing, , value = NA): subscript out of bounds
> fscores(fit_objects$anchorcons) %>% dim()
[1] 5852    1