Originally posted by **MauroCE** May 4, 2024
I really like this package and it's very intuitive to use :) One thing that would make me love it even more would be to allow the user to provide/fix the random state so that results can be exactly reproduced. I am still getting familiar with the abstractions and inner workings of the package, but I was thinking something like this could work:
1. Defining a function in `utils.py` that generates an instance of `np.random.Generator` given either a seed or another rng
```
def setup_rng(seed, rng):
assert (seed is None) or (rng is None), "At least one of `seed` or `rng` must be None."
if rng is None:
if seed is None:
seed = np.random.randint(low=1000, high=9999)
rng = np.random.default_rng(seed=seed)
return rng
```
2. Add reproducibility for resampling schemes by modyfing `rs_doc`, `resampling_scheme`, `resampling`. Mostly this requires adding an extra argument `rng=None` to the various resampling functions (as well as `uniform_spacings` and the queue). E.g.
```
from particles.utils import setup_rng
rs_doc = """\
Parameters
----------
W : (N,) ndarray
normalized weights (>=0, sum to one)
M : int, optional (set to N if missing)
number of resampled points.
rng : np.random.Generator, optional (instantiated at random if missing)
random number generator for reproducibility
Returns
-------
(M,) ndarray
M ancestor variables, drawn from range 0, ..., N-1
"""
def resampling_scheme(func):
"""Decorator for resampling schemes."""
@functools.wraps(func)
def modif_func(W, M=None, rng=None):
M = W.shape[0] if M is None else M
return func(W, M, rng=rng)
rs_funcs[func.__name__] = modif_func
modif_func.__doc__ = func.__doc__ + rs_doc
return modif_func
def resampling(scheme, W, M=None, rng=None):
rng = setup_rng(seed=None, rng=rng)
try:
return rs_funcs[scheme](W, M=M, rng=rng)
except KeyError:
raise ValueError(f"{scheme} is not a valid resampling scheme")
@resampling_scheme
def systematic(W, M, rng=None):
"""Systematic resampling."""
su = (rng.uniform(low=0.0, high=1, size=1) + np.arange(M)) / M
return inverse_cdf(su, W)
```
3. Add `rng=None` to the arguments of `ArrayMCMC` and store it as an attribute. Then `ArrayMetropolis`, `ArrayRandomWalk` etc can use it within the `step()` and `proposal` methods respectively. When one instantiates a `MCMCSequence` one would pass an optional `rng`. In the same way, `rng` is passed down from the `FKSMCsampler`, which gets handed it down from e.g. `Tempering` or `AdaptiveTempering`.
The same thing can be done for `FeynmanKac` and `SMC`. I tried it on my machine and it seems to work smoothly. One could probably work directly with `rng` and the argument `seed` in `setup_rng` would be redundant. I find that, depending on what I am trying to do and compare results with, I sometimes need to pass a seed or an rng, so could be an option to leave both.
Let me know what you think :)
Discussed in https://github.com/nchopin/particles/discussions/86