philchalmers / mirt

Multidimensional item response theory
https://philchalmers.github.io/mirt/
201 stars 75 forks source link

"Item re-scored so that all values are within a distance of 1" + 2PLNRM #163

Closed Deleetdk closed 5 years ago

Deleetdk commented 5 years ago

Many times item data come in response option format where the options aren't natural numbers, or may have missing values. In that case, mirt() will recode the data and throw a warning of Item re-scored so that all values are within a distance of 1. However, using most of algorithms for nominal data, this is not a problem, but when using the 2PLNRM (etc.) models, a scoring key is needed as input. However, because of mirt's recoding of the data, the scoring key will generally not fit the data. Thus, the user actually needs to recode the data himself before calling mirt().

There is no particular need for this trap. I suggest modifying the mirt() so that this issue is handled internally. In the meanwhile, here's a small function that does the recoding (which can also be used internally).

#recode items and socring key
#limitation of mirt package requires this
relevel_items = function(x, key) {
  #convert to factor, drop unused levels
  xx = map_df(x, function(xi) {
    xi %>% as.factor() %>% fct_drop()
  })

  #which factor level is the key?
  key2 = map2_int(xx, key, function(i, k) {
    #which level is the right one?
    (levels(i) == k) %>% which()
  })

  list(
    data = map_df(xx, as.numeric),
    key = key2
  )
}

#test
  ex_items = data.frame(
    i1 = c(1, 3, 5, 9, 11) %>% factor(levels = 1:11),
    i2 = c(1, 2, 5, 7, 3) %>% factor(levels = 1:11),
    i3 = c(3, 7, 1, 4, 4) %>% factor(levels = 1:11)
  )
  ex_key = c(1, 7, 3)

  relevel_items(ex_items, ex_key)

Output of test is:

>   relevel_items(ex_items, ex_key)
$data
# A tibble: 5 x 3
     i1    i2    i3
  <dbl> <dbl> <dbl>
1     1     1     2
2     2     2     4
3     3     4     1
4     4     5     3
5     5     3     3

$key
i1 i2 i3 
 1  5  2 

I have tested this solution using the vocabulary test data found here:

https://openpsychometrics.org/_rawdata/

Modified version:

vocab_example.csv.zip

Example analysis:

#load the data and packages first
#mirt binary
vocab_irt_binary = mirt(vocab %>% select(starts_with("Item_")), model = 1, itemtype = "2PL")

#score
vocab_irt_binary_scores = fscores(vocab_irt_binary)
vocab$irt_binary = vocab_irt_binary_scores[, 1] %>% as.numeric() %>% standardize()

#mirt categorical
#must recode items due to issue with mirt
vocab_cat_recoded = relevel_items(vocab %>% select(Q1:Q45),
                                  vocab_keys$true_option)
vocab_irt_cat = mirt(vocab %>% select(Q1:Q45) %>% map_df(as.numeric), model = 1, itemtype = "nominal")
vocab_irt_cat = mirt(vocab_cat_recoded$data, model = 1, itemtype = "graded")
vocab_irt_cat = mirt(vocab_cat_recoded$data, model = 1, itemtype = "gpcm")
vocab_irt_cat = mirt(vocab_cat_recoded$data, model = 1, itemtype = "gpcmIRT")
vocab_irt_cat = mirt(vocab_cat_recoded$data, model = 1, itemtype = "2PLNRM", key = vocab_cat_recoded$key,
                     method = "EM", TOL = .0005)

#score
vocab_irt_cat_scores = fscores(vocab_irt_cat)
vocab$irt_cat = vocab_irt_cat_scores[, 1] %>% as.numeric() %>% standardize()

#criterion relations
vocab %>% select(education, age, simple_score:irt_cat) %>% cor()

In this case, all the nominal scorings of the data are worse than the binary scoring (judging by correlation to criterion variables), except the 2PLNRM which is about the same (r = .99). I was inspired by this paper, but apparently no benefits in this dataset. https://www.mdpi.com/2079-3200/7/3/17

philchalmers commented 5 years ago

Agreed, the scoring key should certainly be adjusted as well whenever the data is modified to model to the package internals. Thanks for bringing this to my attention, I'll take a look at how this can be modified using the existing functions.

philchalmers commented 5 years ago

Patch should fix your observation. I used the following code to test the issue. Thanks for bringing this to my attention!

library(mirt)

data(SAT12)

#correct answer key
key <- c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5)
scoredSAT12 <- key2binary(SAT12, key)

mod <- mirt(scoredSAT12, 1)
fs <- fscores(mod)

mod2 <- mirt(SAT12, 1, rep('2PLNRM',32), key=key)
fs2 <- fscores(mod2)
cor(fs, fs2)

SAT12mod <- SAT12
SAT12mod[,1:10] <- SAT12[,1:10] + 2
key2 <- key
key2[1:10] <- key[1:10] + 2

mod3 <- mirt(SAT12mod, 1, rep('2PLNRM',32), key=key2)
anova(mod2, mod3) # should be identical

coef(mod2, simplify=TRUE)
coef(mod3, simplify=TRUE)

# ---------------------------------------------
# reorder some items

SAT12mod <- SAT12
SAT12mod$Item.1 <- ifelse(SAT12$Item.1 == key[1], 0, SAT12$Item.1)
SAT12mod$Item.2 <- ifelse(SAT12$Item.2 == key[2], 10, SAT12$Item.2)

head(SAT12)
head(SAT12mod)

key2 <- key
key2[1:2] <- c(0,10)

mod3 <- mirt(SAT12mod, 1, rep('2PLNRM',32), key=key2)

anova(mod2, mod3) # should be identical

coef(mod2, simplify=TRUE)
coef(mod3, simplify=TRUE)

fs3 <- fscores(mod3)
cor(fs3, fs)