Reproducibly ergodic random seeds

LupoA / lsdensities

Smeared spectral densities from lattice correlators

GNU General Public License v3.0


Reproducibly ergodic random seeds #43

Closed: edbennett closed this 3 months ago

edbennett commented 6 months ago

`random.seed(1994)` is definitely reproducible, but it will introduce correlations between otherwise independent analyses, and if applied to parallel code it may introduce spurious statistics. This should be adjusted so that it is still reproducible, but also more ergodic.
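To illustrate the concern, here is a minimal sketch (not code from the repository) in which two unrelated analyses share a hard-coded seed. The helper `bootstrap_indices` and its arguments are hypothetical names introduced only for this example.

```python
import random


def bootstrap_indices(n_samples, n_configs, seed):
    """Draw bootstrap resampling indices from a privately seeded generator.

    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_configs) for _ in range(n_configs)]
        for _ in range(n_samples)
    ]


# Two analyses that are meant to be independent, both seeded with 1994.
analysis_a = bootstrap_indices(n_samples=3, n_configs=5, seed=1994)
analysis_b = bootstrap_indices(n_samples=3, n_configs=5, seed=1994)

# Both analyses resample their configurations in exactly the same pattern,
# so their bootstrap fluctuations are correlated by construction.
identical = analysis_a == analysis_b
```

The same thing happens if every worker in a parallel run calls `random.seed(1994)`: each worker draws the same sequence, so the workers do not contribute independent samples.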

nickforce989 commented 5 months ago

@LupoA, @edbennett, could we have a conversation about this at some point, as it is a release blocker?

edbennett commented 5 months ago

Yes; shall we wait until we're all in adjacent time zones?

nickforce989 commented 5 months ago

Fair enough. So @LupoA, would you be available next week? Both Ed and I are going to be back from the beginning of next week (@edbennett, correct me if I am wrong, please).

LupoA commented 5 months ago

Yes, we can meet when you are back. One idea would be to use the Friday slot, but I'm open to other choices.

LupoA commented 5 months ago

sha256([datapath, prec, tmax, sigma, ...]) -> some number -> seed
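A minimal sketch of this idea, assuming the parameters are available as a dictionary of JSON-serialisable values. The function name `seed_from_parameters`, the choice of JSON for serialisation, the truncation to 32 bits, and the example values are all assumptions made for illustration, not the repository's implementation.

```python
import hashlib
import json


def seed_from_parameters(params):
    """Map a dict of run parameters to a 32-bit integer seed.

    Hypothetical helper, for illustration only.
    """
    # Serialise with sorted keys so that the same parameters always
    # produce the same byte string, whatever the insertion order.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).digest()
    # Keep the first 4 bytes so the result lies in [0, 2**32 - 1],
    # the range accepted by np.random.seed.
    return int.from_bytes(digest[:4], "big")


# Placeholder values; only the key names come from the comment above.
params = {"datapath": "corr.dat", "prec": 128, "tmax": 32, "sigma": 0.1}
seed = seed_from_parameters(params)
```

Because the seed depends only on the inputs, rerunning the same analysis reproduces it, while a different data file or smearing radius gives an unrelated seed.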

nickforce989 commented 5 months ago

I made an attempt to solve this issue and pushed it.

nickforce989 commented 5 months ago

I have also added a test showing that the seed changes when the parameters are changed, and that resetting them to their original values reproduces the original seed.
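The shape of such a test might look like the sketch below. It does not reproduce the test that was pushed; `toy_generate_seed` is a hypothetical stand-in for the real seed function, and the parameter names and values are placeholders.

```python
import hashlib


def toy_generate_seed(par):
    """Hypothetical stand-in: hash the sorted parameters to a 32-bit seed."""
    text = repr(sorted(par.items())).encode("utf-8")
    return int.from_bytes(hashlib.sha256(text).digest()[:4], "big")


def test_seed_tracks_parameters():
    par = {"tmax": 32, "sigma": 0.1}
    original = toy_generate_seed(par)

    # Changing a parameter should change the seed.
    par["sigma"] = 0.2
    assert toy_generate_seed(par) != original

    # Restoring the parameter should restore the seed.
    par["sigma"] = 0.1
    assert toy_generate_seed(par) == original


test_seed_tracks_parameters()
```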

LupoA commented 5 months ago

As a check, we can make a histogram of the integers picked by the bootstrap and verify that they follow a uniform distribution. It is probably overkill, and I am fine with the issue being closed regardless.
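One way such a check could be sketched, using only the standard library: count how often each index is drawn and compare the counts with the flat expectation through a Pearson chi-squared statistic. The number of configurations, the number of draws, and the seed are arbitrary illustrative choices, and `rng.randrange` stands in for whatever call the bootstrap actually uses.

```python
import random
from collections import Counter

n_configs = 20      # hypothetical number of configurations to resample
n_draws = 20_000    # total number of bootstrap picks to histogram

rng = random.Random(12345)  # arbitrary seed, for illustration only
counts = Counter(rng.randrange(n_configs) for _ in range(n_draws))

# Under a uniform distribution every index is expected equally often.
expected = n_draws / n_configs

# Pearson chi-squared statistic of the histogram against the flat
# expectation; indices that were never drawn count as zero.
chi2 = sum(
    (counts.get(i, 0) - expected) ** 2 / expected for i in range(n_configs)
)
```

For uniform draws, `chi2` should be of the order of the number of degrees of freedom, `n_configs - 1`; a value far above that would point to a biased selection.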

LupoA commented 4 months ago

In

```python
seed = generate_seed(par)
random.seed(seed)
np.random.seed(random.randint(0, 2 ** 32 - 1))
```

Would it make sense to move the last two lines inside `generate_seed`, which could then be called `initialise_rng` or something like that?

edbennett commented 4 months ago

Generating a seed is a separate concern from using it to seed a generator; it arguably makes sense to have a utility function to seed the generator, but how the seed is generated should be separate. (Whether initialise_rng() calls generate_seed(), or takes the seed as an argument, I have no strong opinion on.)
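A minimal sketch of the variant in which the utility takes the seed as an argument, so that how the seed is generated stays out of it. The body mirrors the two seeding lines quoted earlier in the thread; the return value and the handling of a missing NumPy installation are assumptions added so the example runs anywhere, not the repository's code.

```python
import random


def initialise_rng(seed):
    """Seed the stdlib generator and, if NumPy is installed, NumPy's too.

    Hypothetical utility: it only consumes a seed and does not care
    how that seed was produced.
    """
    random.seed(seed)
    # Derive the NumPy seed from the stdlib generator, as in the snippet
    # quoted earlier in the thread.
    numpy_seed = random.randint(0, 2**32 - 1)
    try:
        import numpy as np
    except ImportError:
        pass  # NumPy not installed: only the stdlib generator is seeded
    else:
        np.random.seed(numpy_seed)
    return numpy_seed


# The seed would come from a separate function such as generate_seed(par);
# a literal placeholder is used here to keep the example self-contained.
numpy_seed = initialise_rng(seed=2024)
first_draw = random.random()
```

Keeping the seed as an argument also makes the utility easy to test with fixed values, independently of the hashing scheme.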