philchalmers / mirt

Multidimensional item response theory

fscores with `method = "plausible"` ignores seed if mirtCluster was defined #238

Closed by netique 1 year ago

netique commented 1 year ago

Hello, I've encountered a problem when obtaining plausible values for my response patterns. Naturally, I wanted to set a seed beforehand to make the analysis reproducible, but fscores seems to ignore it. The cause boiled down to mirtCluster being defined; see the reprex below:

library(mirt)
#> Loading required package: stats4
#> Loading required package: lattice

mirtCluster()

fit <- mirt(Science, itemtype = "nominal")
#> Iteration: 1, Log-Lik: -2231.749, Max-Change: 3.43133
#> ...
#> Iteration: 71, Log-Lik: -1608.455, Max-Change: 0.00007

set.seed(123)
x <- fscores(fit, method = "plausible")

set.seed(123)
y <- fscores(fit, method = "plausible")

identical(x, y)
#> [1] FALSE

Created on 2023-06-16 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21)
#>  os       macOS Ventura 13.4
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Prague
#>  date     2023-06-16
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version  date (UTC) lib source
#>  cli           3.6.1    2023-03-23 [1] CRAN (R 4.3.0)
#>  cluster       2.1.4    2022-08-22 [2] CRAN (R 4.3.0)
#>  dcurver       0.9.2    2020-11-04 [1] CRAN (R 4.3.0)
#>  Deriv         4.1.3    2021-02-24 [1] CRAN (R 4.3.0)
#>  digest        0.6.31   2022-12-11 [1] CRAN (R 4.3.0)
#>  evaluate      0.20     2023-01-17 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1    2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.2    2023-04-25 [1] CRAN (R 4.3.0)
#>  glue          1.6.2    2022-02-24 [1] CRAN (R 4.3.0)
#>  GPArotation   2023.3-1 2023-03-21 [1] CRAN (R 4.3.0)
#>  gridExtra     2.3      2017-09-09 [1] CRAN (R 4.3.0)
#>  gtable        0.3.3    2023-03-21 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.5    2023-03-23 [1] CRAN (R 4.3.0)
#>  knitr         1.42     2023-01-25 [1] CRAN (R 4.3.0)
#>  lattice     * 0.21-8   2023-04-05 [2] CRAN (R 4.3.0)
#>  lifecycle     1.0.3    2022-10-07 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3    2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS          7.3-59   2023-04-21 [1] CRAN (R 4.3.0)
#>  Matrix        1.5-4    2023-04-04 [2] CRAN (R 4.3.0)
#>  mgcv          1.8-42   2023-03-02 [2] CRAN (R 4.3.0)
#>  mirt        * 1.38.1   2023-02-28 [1] CRAN (R 4.3.0)
#>  nlme          3.1-162  2023-01-31 [2] CRAN (R 4.3.0)
#>  pbapply       1.7-0    2023-01-13 [1] CRAN (R 4.3.0)
#>  permute       0.9-7    2022-01-27 [1] CRAN (R 4.3.0)
#>  purrr         1.0.1    2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0   2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2    2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0   2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2   2022-11-11 [1] CRAN (R 4.3.0)
#>  Rcpp          1.0.10   2023-01-22 [1] CRAN (R 4.3.0)
#>  reprex        2.0.2    2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang         1.1.1    2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.21     2023-03-26 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.14     2022-08-22 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2    2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.9.1    2023-03-04 [1] CRAN (R 4.3.0)
#>  vctrs         0.6.2    2023-04-19 [1] CRAN (R 4.3.0)
#>  vegan         2.6-4    2022-10-11 [1] CRAN (R 4.3.0)
#>  withr         2.5.0    2022-03-03 [1] CRAN (R 4.3.0)
#>  xfun          0.39     2023-04-20 [1] CRAN (R 4.3.0)
#>  yaml          2.3.7    2023-01-23 [1] CRAN (R 4.3.0)
#>
#>  [1] /Users/netik/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

Maybe I am supposed to expect this behaviour, but I still wonder whether it could be fixed or prevented. The docs say that I should not use mirtCluster for any simulation study, but I cannot connect that to my issue.

Many thanks for looking in!

philchalmers commented 1 year ago

Controlling seeds with parallel computations is tricky business, and I don't see much use for it in general applications of the package (this is one exception, but the easiest solution is to just not use mirtCluster() at all). If you insist on reproducibility, then you'll have to define the cluster object yourself so that it stays within your control throughout each execution. For instance,

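library(mirt)

fit <- mirt(Science)

# create the cluster yourself so its RNG streams stay under your control
cl <- parallel::makeCluster(2)
mirtCluster(cl)

# seed the workers' RNG streams, then draw plausible values
parallel::clusterSetRNGStream(cl, iseed = 1234)
x <- fscores(fit, method = "plausible")

# reset the streams to the same seed and draw again
parallel::clusterSetRNGStream(cl, iseed = 1234)
y <- fscores(fit, method = "plausible")
identical(x, y)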

This works on the current dev version, though likely not on the latest CRAN version.
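
For readers wondering why `set.seed()` has no effect once a cluster is defined, here is a minimal sketch using only the base `parallel` package (no mirt involved). It assumes two local workers; the helper `draw()` and the variable names are hypothetical, chosen just for illustration. Each worker is a separate R process with its own RNG state, so seeding the master session does not reach it, whereas `clusterSetRNGStream()` resets every worker to a known stream.

```r
library(parallel)

# Two local workers; each runs in its own R process with its own RNG state.
cl <- makeCluster(2)

# Hypothetical helper: one random draw per task, evaluated on a worker.
draw <- function(i) rnorm(1)

# set.seed() seeds only the master session, not the workers,
# so these two results are not expected to match.
set.seed(123)
unseeded_1 <- parSapply(cl, 1:4, draw)
set.seed(123)
unseeded_2 <- parSapply(cl, 1:4, draw)
identical(unseeded_1, unseeded_2)

# clusterSetRNGStream() resets each worker to a known L'Ecuyer-CMRG stream.
clusterSetRNGStream(cl, iseed = 1234)
seeded_1 <- parSapply(cl, 1:4, draw)
clusterSetRNGStream(cl, iseed = 1234)
seeded_2 <- parSapply(cl, 1:4, draw)
identical(seeded_1, seeded_2)  # TRUE: same seed, same static task split

stopCluster(cl)
```

This is the same pattern as the fscores example above: the reproducibility comes from re-seeding the workers before each call, not from the seed in the master session.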