wlandau / crew

A distributed worker launcher
https://wlandau.github.io/crew/
Other
129 stars 4 forks source link

Test fails due to incomplete specification of random seed (RNGkind) #112

Closed shikokuchuo closed 1 year ago

shikokuchuo commented 1 year ago

Prework

Description

mirai 0.9.1.9019 would break crew tests if not for a workaround .onLoad hook which sets the RNGkind to "L'Ecuyer-CMRG" if crew is detected. This is not wrong, but also not necessary.

https://github.com/shikokuchuo/mirai/blob/d6b1cfafc0460ae57a17fdaba6ece35182d5d52f/R/mirai-package.R#L65C1-L66C34

The failing test is: https://github.com/wlandau/crew/blob/0bf1273e6ed14c334dd75ba684dabee1a4d70465/tests/testthat/test-crew_controller_local.R#L100

This is because in creating the local variable exp, you just use set.seed(0L) on line 97 without specifying the RNGkind. The worker is now using "L'Ecuyer-CMRG" hence leading to a mismatch. A statement RNGkind("L'Ecuyer-CMRG") in the testthat script would be one solution.

I think currently the plan is that mirai is released prior to crew, so it might have to go out with the workaround unless you can think of something better. Once the test is fixed I can then remove this.

Happy to provide background on the changes separately / on a call.

wlandau commented 1 year ago

Thanks for the FYI. I think this should be fixed in https://github.com/wlandau/crew/commit/5cfd9046a378aca811d46e217569dd388e852d98. However, that part of the test is already suppressed on CRAN via the NOT_CRAN envvar, so I don't think it will impact either of our release cycles.

https://github.com/wlandau/crew/blob/5cfd9046a378aca811d46e217569dd388e852d98/tests/testthat/test-crew_controller_local.R#L82-L84

Curious as to the motivation for L'Ecuyer-CMRG, given how popular and successful Mersenne-Twister is. Did you notice a performance difference?

shikokuchuo commented 1 year ago

Thanks! I didn't see that it was in a NOT_CRAN block. I have reverted the workaround in ecf7f7d given that it won't trip the revdeps check.

I've also tested your patch and doesn't quite work - I'll open a PR with a minor change. Just need to set the seed after calling RNGkind as that function also resets the RNG state.

It's true that most of the performance enhancements to nanonext have probably been taken off the table, but would be pretty amazing if the next thing I thought of to optimize was the RNG! It is rather due to L'Ecuyer having better statistical properties for parallel computing - as I noted in the changelog, all credit to Luke Tierney (R core).

shikokuchuo commented 1 year ago

FYI I have provided further explanation on the motivation in #113