Closed s3alfisc closed 1 year ago
I anyways should be much clearer about this in the documentation of the seed
argument.
I have decided to deprecated the seed
argument. Here's what I have prepared for the news.md
:
News v0.13
Prior to the changes introduced in v0.13
, boottest()
will always call set.seed()
or dqrng::dqset.seed()
internally, regardless of whether the seed
argument is specified or not (in the ladder case, it will create an internal seed by randomly drawing from a large set of integers). I considered this harmless, as setting seeds inside boottest()
in this way does not affect the reproducibility of scripts run end-to-end.
However, I have learned that is generally considered bad practice to overwrite global variables without notification - for example, the authors of numpy have deprecated their np.random.seed()
function for this reason.
Here is a quick example on what happens if a function "reseeds": it affects the future chain of random draws.
fn_reseed <- function(x){set.seed(x)}
set.seed(123)
rnorm(1)
# [1] -0.5604756
fn_reseed(1)
rnorm(1)
# [1] -0.6264538
set.seed(123)
rnorm(1); rnorm(1)
# [1] -0.5604756
# [1] -0.2301775
The two 'second' calls to rnorm(1)
are based on different global seed states.
As a result, I have decided to deprecate the seed' function argument. Random number generation must now **be** set outside of
boottest()using
set.seed()and
dqrng::dqset.seed()`.
This means that bootstrap results generated via versions < 0.13 will no longer be exactly replicable under the new version, but with a sufficiently large number of bootstrap iterations, this change should not affect any of your conclusions.
The
seed
argument ofboottest()
allows to fix a random seed for random number generation within boottest.boottest()
then calls eitherset.seed()
(for Mammen weights) ordqrng::dqset.seed()
for all other weights. Both functions manipulate the global seed state, and therefore influence what happens outside ofboottest()
.Here is a quick example:
set.seed()
is not "scoped" withinunintentionally_set_global_seed ()
, it overrides the global seed state, and in consequence,rnorm(1)
generates the same random variable.One fix is to use R's
on.exit()
function:Now we get
Calling
set.seed()
is scoped withinset_local_seed()
.Unfortunately, for the
dqrng
package, it is currently not possible to fetch the global seed state (as via.Random.seed
for the 'standard R' random seed).In consequence, the option above is not feasible. So should I drop the
seed
argument?Here is an example with
boottest()
:I actually don't think that there is a big problem here, as all results are still fully reproducible if scripts are run from top-to-bottom in a clean working environment. Nevertheless, I might drop the
seed
argument in a future release.Related: https://github.com/daqana/dqrng/issues/41