Open larsoner opened 3 years ago
The new Generator class docstring says:
No Compatibility Guarantee
Generator does not provide a version compatibility guarantee. In particular, as better algorithms evolve the bit stream may change.
I think it's worth asking what we use RNGs for.
RandomState
approach is actually better than switching to the new Generator
/ default_rng
approach, right?numerics.random_permutation
This is a utility designed to emulate MATLAB's randperm
for sanity-check comparisions. This will probably break?mxne
, ICA Here I don't think I understand well enough how the randomness is used to say what the situation is.I think for us we want everything to always be reproducible for a given int random_state
(and a given mne.set_rng_backend
). I hadn't looked at Generator / default_rng, it is indeed a problem that it does not guarantee stability.
I wonder if we should just switch EDIT: via deprecation as above to PCG64, which is guaranteed to be stable:
Compatibility Guarantee
PCG64 makes a guarantee that a fixed seed and will always produce the same random integer stream.
But is better than RandomState:
By default, Generator uses bits provided by PCG64 which has better statistical properties than the legacy MT19937 used in RandomState.
agree on the plan to switch (with deprecation cycle) to explicitly use PCG64 (which happens be be the current default for np.random.default_rng
, but that might change).
scipy will deprecate RandomState ?
scikit-learn has not switched yet so I don't feel the urgence
scipy will deprecate RandomState ?
There is a long ongoing discussion in the SciPy PR DOC: remove references to RandomState (#13778). The short version so far to me is that RandomState will continue to work at least for some time but should not (and will not) be the preferred way to do things in examples and parameters.
Hi everyone! I'm the cause of everyone's problems here. As you work through these issues, I do recommend reading NEP 19 which goes into more detail about why we had the old backwards-compatibility policy and why we moved away from it. The concerns you have brought up here were definitely in our thoughts. I'm happy to answer any questions that the NEP doesn't answer.
I can't speak definitively as to when/how/if scipy really deprecates RandomState
per se, but I have my guess as to where we're going on the aforementioned PR. Some time after we drop support for numpy 1.16, I expect that we will transition the code to use Generator
instead of RandomState
. I expect that we'll still accept RandomState
instances, but we will wrap its core uniform PRNG in Generator
and call its methods instead. So if you call scipy.stats.norm.rvs(random_state=RandomState(1234))
, you will get the results from Generator.standard_normal()
's improved algorithm rather than the older, slower RandomState.standard_normal()
. The core Mersenne Twister PRNG underneath the passed RandomState
will still be used, however. Again, we haven't made any decisions, yet, but I think that's where we are going.
Thanks for the info @rkern . After some reading and discussion I think our plan is to shelve this for now until we bump our minimum numpy requirement to 1.17+ and then switch everything over to using Generator syntax internally (probably still using MT19937 for backward compat), which will likely happen in the next year.
Sounds like a good plan.
np.random.RandomState is not a good choice for random number generation going forward:
I don't understand all the details, but the TL;DR I've gathered from monitoring some SciPy discussions is that it's not as good as the new
default_rng
/Generator
-based way of making random numbers and it should be replaced in people's code.In the long term I think we want to switch
random_state=0
to usedefault_rng(random_state)
instead ofRandomState(random_state)
internally. I propose:mne.set_rng_backend(backend)
that can be'RandomState'
or'Generator'
, maybe with ause_rng_backend
context manager. The context manager is maybe overkill, but I do think it might help people transition code from old to new, and it's only a few lines for us.random_state=<int>
they will see a DeprecationWarning stating that the default is'RandomState'
in 0.23 but will change to'Generator'
in 0.24 (or maybe this is a good case for making it 0.25 to give people more time to change over), and we will callset_rng_backend('RandomState')
.'Generator'
.I think this is a workable solution for modernizing our
random_state
params.