Closed Deleetdk closed 5 years ago
Agreed, the scoring key should certainly be adjusted as well whenever the data is modified to conform to the package internals. Thanks for bringing this to my attention, I'll take a look at how this can be handled using the existing functions.
The patch should fix the behavior you observed. I used the following code to test the issue. Thanks for bringing this to my attention!
library(mirt)
data(SAT12)
#correct answer key
key <- c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5)
scoredSAT12 <- key2binary(SAT12, key)
mod <- mirt(scoredSAT12, 1)
fs <- fscores(mod)
mod2 <- mirt(SAT12, 1, rep('2PLNRM',32), key=key)
fs2 <- fscores(mod2)
cor(fs, fs2)
# shift the response codes of the first 10 items by 2, and shift the key to match
SAT12mod <- SAT12
SAT12mod[,1:10] <- SAT12[,1:10] + 2
key2 <- key
key2[1:10] <- key[1:10] + 2
mod3 <- mirt(SAT12mod, 1, rep('2PLNRM',32), key=key2)
anova(mod2, mod3) # should be identical
coef(mod2, simplify=TRUE)
coef(mod3, simplify=TRUE)
# ---------------------------------------------
# recode the correct option of the first two items to arbitrary values (0 and 10)
SAT12mod <- SAT12
SAT12mod$Item.1 <- ifelse(SAT12$Item.1 == key[1], 0, SAT12$Item.1)
SAT12mod$Item.2 <- ifelse(SAT12$Item.2 == key[2], 10, SAT12$Item.2)
head(SAT12)
head(SAT12mod)
key2 <- key
key2[1:2] <- c(0,10)
mod3 <- mirt(SAT12mod, 1, rep('2PLNRM',32), key=key2)
anova(mod2, mod3) # should be identical
coef(mod2, simplify=TRUE)
coef(mod3, simplify=TRUE)
fs3 <- fscores(mod3)
cor(fs3, fs)
Many times item data come in response option format where the options aren't natural numbers, or may have missing values. In that case, mirt() will recode the data and throw a warning of "Item re-scored so that all values are within a distance of 1". For most of the algorithms for nominal data this is not a problem, but when using the 2PLNRM (etc.) models, a scoring key is needed as input. Because of mirt's recoding of the data, the scoring key will generally no longer fit the data. Thus, the user actually needs to recode the data himself before calling mirt().
There is no particular need for this trap. I suggest modifying mirt() so that this issue is handled internally. In the meantime, here's a small function that does the recoding (which can also be used internally).
Output of test is:
I have tested this solution using the vocabulary test data found here:
https://openpsychometrics.org/_rawdata/
Modified version:
vocab_example.csv.zip
Example analysis:
In this case, all the nominal scorings of the data are worse than the binary scoring (judging by correlation to criterion variables), except the 2PLNRM, which is about the same (r = .99). I was inspired by this paper, but apparently there are no benefits in this dataset. https://www.mdpi.com/2079-3200/7/3/17
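The joint recoding of data and scoring key described in this issue could look roughly like the sketch below. This is not the function from the original post and not mirt's internal code; recode_items_and_key() is a hypothetical base-R helper, and it assumes each data column and the matching key entry share the same type (for example, both numeric). It maps each item's observed options to consecutive integers and applies the same mapping to the key, so the two stay aligned.

```r
# Hypothetical helper (not part of mirt): recode each item's response options
# to consecutive integers 1..K and apply the same mapping to the scoring key.
recode_items_and_key <- function(data, key) {
  data <- as.data.frame(data)
  stopifnot(length(key) == ncol(data))
  new_key <- rep(NA_integer_, length(key))
  for (j in seq_along(key)) {
    # Sorted distinct options for item j. The keyed option is included in case
    # no respondent chose it. sort() drops NA, so missing responses stay NA.
    opts <- sort(unique(c(data[[j]], key[j])))
    data[[j]] <- match(data[[j]], opts)
    new_key[j] <- match(key[j], opts)
  }
  list(data = data, key = new_key)
}

# Toy data with non-consecutive option codes and one missing response
toy <- data.frame(Item.1 = c(0, 2, 5, NA), Item.2 = c(10, 30, 20, 10))
toy_key <- c(5, 10)
out <- recode_items_and_key(toy, toy_key)
out$data
out$key

# With real data, the recoded objects would then be passed on together, e.g.
# mirt(out$data, 1, rep('2PLNRM', ncol(out$data)), key = out$key)
```

Because the data and the key pass through the same per-item mapping, a keyed option such as 0 or 10 ends up at whatever integer position it occupies among that item's sorted options, which is the alignment the 2PLNRM call needs.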