This means that if they use a regularly reseeded PRNG such as thread_rng(), they're likely to come up for reseeding at about the same time and therefore block as they contend for access to /dev/urandom, which accumulates a backlog of seed-generating work.
We do not use blocking randomness sources outside of an initial check of entropy-pool initialization, so there is usually no "seed-generating work" triggered by re-seeding of ThreadRng. I highly doubt that it's possible to observe a measurable performance impact from your scenario. Even if we assume that the OS needs an exclusive lock on some inner resource (a very big assumption), threads will quickly drift out of sync with each other; even the difference in thread start times should be sufficient.
Also, if such a possibility is your concern, then you probably should not use ThreadRng in the first place. Instead, you should use a faster (possibly non-cryptographic) PRNG which is initialized only once per worker thread.
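A minimal sketch of that pattern, assuming rand 0.8 with the `small_rng` feature enabled; the thread count and workload are illustrative:

```rust
use std::thread;

use rand::rngs::SmallRng;
use rand::{Rng, SeedableRng};

fn main() {
    // Each worker seeds a fast PRNG exactly once from the OS; there is
    // no periodic re-seeding, so threads never contend for entropy later.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                let mut rng = SmallRng::from_entropy();
                // Stand-in for the actual simulation workload.
                (0..1_000_000).map(|_| rng.gen::<f64>()).sum::<f64>()
            })
        })
        .collect();
    for handle in handles {
        println!("worker sum: {}", handle.join().unwrap());
    }
}
```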
I think we are only reseeding for "key erasure", i.e. it's a security feature irrelevant to your use case. (I think there are better alternatives for key erasure, but this is not relevant to this issue.)
If you care about performance, I strongly recommend against using thread_rng(). Just use StdRng or SmallRng directly. If you care about reproducibility, something like rand_xoshiro::Xoshiro256PlusPlus might be better (assuming you don't need cryptographic unpredictability for your application). You could even use jump() to guarantee 2^128 non-overlapping sequences.
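A sketch of the jump() approach, assuming the rand_xoshiro crate; the seed value and worker count are arbitrary:

```rust
use rand::SeedableRng;
use rand_xoshiro::Xoshiro256PlusPlus;

fn main() {
    // One master generator; each worker receives a copy of its current
    // state, then the master is jumped ahead 2^128 steps so the
    // per-worker streams cannot overlap.
    let mut master = Xoshiro256PlusPlus::seed_from_u64(12345);
    let mut worker_rngs = Vec::new();
    for _ in 0..4 {
        worker_rngs.push(master.clone());
        master.jump();
    }
    // Each entry of `worker_rngs` can now be moved into its own thread.
    println!("prepared {} independent streams", worker_rngs.len());
}
```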
I'd still feel more confident in the results of a Monte Carlo simulation if I knew each thread was being reseeded before it had time to "learn" any patterns from the current seed -- after all, you never know whether your simulation will turn out to depend on subtle patterns in your PRNG and accidentally become a randomness test, especially if you're using evolutionary optimization. I was thinking more along the lines of a "mostly best-effort" approach to reseeding -- e.g. start non-blocking attempts to reseed at 32KiB of output, but don't block waiting for a seed until 128KiB of output.
start non-blocking attempts to reseed at 32KiB of output, but don't block waiting for a seed until 128KiB of output.
As I wrote above, we do not use "blocking" entropy sources outside of potential initialization checks which usually happen only once during program execution.
I don't think this issue is actionable. ThreadRng arguably should not be used in the author's use case; a custom solution should be used instead (e.g. a fast CSPRNG initialized once, or a periodically re-seeded fast non-cryptographic PRNG), so I am going to close this issue.
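For reference, a minimal sketch of the first custom solution (a fast CSPRNG with a caller-chosen reseed interval), assuming rand 0.8 and rand_chacha 0.3; the 1 MiB threshold is arbitrary:

```rust
use rand::rngs::adapter::ReseedingRng;
use rand::rngs::OsRng;
use rand::{Rng, SeedableRng};
use rand_chacha::ChaCha12Core;

fn main() {
    // Essentially the construction ThreadRng uses internally in rand 0.8,
    // but with a reseed threshold chosen by the caller (1 MiB here).
    let core = ChaCha12Core::from_rng(OsRng).expect("OS entropy unavailable");
    let mut rng = ReseedingRng::new(core, 1 << 20, OsRng);
    println!("sample: {}", rng.gen::<f64>());
}
```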
Background
This FR is motivated by Monte Carlo simulations.
Feature request
When multiple threads are running instances of the same simulation, it's likely they'll generate pseudorandom numbers at about the same rate. This means that if they use a regularly reseeded PRNG such as thread_rng(), they're likely to come up for reseeding at about the same time and therefore block as they contend for access to /dev/urandom, which accumulates a backlog of seed-generating work. To fix this, different threads' PRNGs could be set to output different, staggered numbers of bytes (I'd suggest the binary Van der Corput sequence linearly transformed to the 64KiB~128KiB range) before their respective first reseedings; a sketch of this staggering follows below. Another option would be to have one dedicated thread that reads seed bytes into a ring buffer, stopping only when the buffer is full, and have each PRNG read seeds from it (although this probably couldn't be both lockless and starvation-free).
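A minimal sketch of the proposed staggering, in Rust for concreteness; `van_der_corput` and `first_reseed_threshold` are hypothetical helpers, and the 64KiB~128KiB range is the one suggested above:

```rust
/// Binary (base-2) Van der Corput sequence: reflects the binary digits
/// of `n` about the radix point, giving a low-discrepancy value in [0, 1).
fn van_der_corput(mut n: u32) -> f64 {
    let mut result = 0.0;
    let mut denom = 2.0;
    while n > 0 {
        result += (n & 1) as f64 / denom;
        n >>= 1;
        denom *= 2.0;
    }
    result
}

/// Hypothetical helper: map the sequence linearly onto 64 KiB..128 KiB,
/// so each thread's first reseed threshold is well spread out.
fn first_reseed_threshold(thread_index: u32) -> u64 {
    const LO: f64 = 64.0 * 1024.0;
    const HI: f64 = 128.0 * 1024.0;
    (LO + (HI - LO) * van_der_corput(thread_index)) as u64
}

fn main() {
    for i in 0..8 {
        println!("thread {i}: first reseed after {} bytes", first_reseed_threshold(i));
    }
}
```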