rust-random / rand

A Rust library for random number generation.
https://crates.io/crates/rand
Other
1.67k stars 431 forks source link

ReseedingRng fork handling is ~~broken~~ not as expected #1362

Closed thomcc closed 6 months ago

thomcc commented 11 months ago

The ReseedingRng claims to protect against fork: https://github.com/rust-random/rand/blob/ef89cbefaf484270dc3936d5d32fc2a73314173c/src/rngs/adapter/reseeding.rs#L292-L302, however this doesn't seem to work (at least the way it's used in the thread_rng).

Here's an example that shows that after fork, the parent and all child processes will still have the same state for the thread rng: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f63e9a09df0ec7f67d84dacb6d39ac32, which produces output like:

parent (pid 11): random::<u64>(): 0x2fe7d602f7aba388
parent (pid 11): spawned child 0 (pid 27)
parent (pid 11): spawned child 1 (pid 28)
parent (pid 11): spawned child 2 (pid 29)
parent (pid 11): spawned child 3 (pid 30)
child 3 (pid 30): random::<u64>(): 0x2907c1d06b61f595
parent (pid 11): spawned child 4 (pid 31)
parent (pid 11): finished spawning 5 child procs
child 4 (pid 31): random::<u64>(): 0x2907c1d06b61f595
parent (pid 11): random::<u64>(): 0x2907c1d06b61f595
child 2 (pid 29): random::<u64>(): 0x2907c1d06b61f595
child 1 (pid 28): random::<u64>(): 0x2907c1d06b61f595
child 0 (pid 27): random::<u64>(): 0x2907c1d06b61f595

I haven't dug deeply into why what you do currently is wrong, but it causes very tough to track-down bugs. CC @joshlf, who filed #1169, leading to the current design of the fork handling.

Background / why I'm stuck with fork: I have code that runs as a Postgres extension (https://github.com/pgcentralfoundation/pgrx), and PG will will run some init code in the parent, but most code runs in a child process (forked per-connection). Both used the uuid crate, which (apparently -- some dep in our tree seems to have turned it on) has a fastrand feature for using the rand crate. At the moment we've avoided use of uuids in that code (it didn't actually need them), but if rand is going to try to handle this at all, it might as well be right.

thomcc commented 11 months ago

I've been told that this was a CVE in openssl https://nvd.nist.gov/vuln/detail/CVE-2019-1549, so perhaps this should have been filed using the security issue template? Sorry if that's the case.

dhardy commented 11 months ago

From the ReseedingRng docs:

When a process is forked on UNIX, the RNGs in both the parent and child processes will be reseeded just before the next call to BlockRngCore::generate, i.e. “soon”. For ChaCha and Hc128 this is a maximum of fifteen u32 values before reseeding.

It turns out that ChaCha now uses a fairly large buffer: [u32; 64]. This is I think to make good usage of SIMD for performance (the downside of general-purpose machinery is that there are often competing interests). That is 32 u64 samples.

Adjusted, your code does as expected (notice that the last two outputs differ):

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.81s
     Running `target/debug/playground`
parent (pid 65): random::<u64>(): 0xd9de1dd7d10c0cce
parent (pid 65): random::<u64>(): 0x2ce847cf78f159a2
parent (pid 65): spawned child 0 (pid 77)
parent (pid 65): spawned child 1 (pid 78)
parent (pid 65): spawned child 2 (pid 79)
parent (pid 65): finished spawning 3 child procs
parent (pid 65): random::<u64>(): 0x690f4fc57306ce72
parent (pid 65): random::<u64>(): 0x40145af20b69885c
parent (pid 65): skipping 26 entries
parent (pid 65): random::<u64>(): 0xde4f622efe029bed
parent (pid 65): random::<u64>():  0x3d4372bc1b6359c
parent (pid 65): random::<u64>(): 0xb45ca70be60f1d34
parent (pid 65): random::<u64>(): 0x28aaead29869efa4
child 0 (pid 77): random::<u64>(): 0x690f4fc57306ce72
child 0 (pid 77): random::<u64>(): 0x40145af20b69885c
child {child} (pid 77): skipping 26 entries
child 0 (pid 77): random::<u64>(): 0xde4f622efe029bed
child 0 (pid 77): random::<u64>():  0x3d4372bc1b6359c
child 0 (pid 77): random::<u64>(): 0xe12f8e5106ffd52f
child 0 (pid 77): random::<u64>(): 0x10fb7c9c4fc194c7
child 2 (pid 79): random::<u64>(): 0x690f4fc57306ce72
child 2 (pid 79): random::<u64>(): 0x40145af20b69885c
child {child} (pid 79): skipping 26 entries
child 2 (pid 79): random::<u64>(): 0xde4f622efe029bed
child 2 (pid 79): random::<u64>():  0x3d4372bc1b6359c
child 2 (pid 79): random::<u64>(): 0x62bb68d1c3db0b8e
child 2 (pid 79): random::<u64>(): 0x277b8034d2dc95bd
child 1 (pid 78): random::<u64>(): 0x690f4fc57306ce72
child 1 (pid 78): random::<u64>(): 0x40145af20b69885c
child {child} (pid 78): skipping 26 entries
child 1 (pid 78): random::<u64>(): 0xde4f622efe029bed
child 1 (pid 78): random::<u64>():  0x3d4372bc1b6359c
child 1 (pid 78): random::<u64>(): 0xa5b1692ff016ec85
child 1 (pid 78): random::<u64>(): 0xddf96801eec7a7d2

I.e. this behaves exactly as advertised.


Now, I agree that this type of design compromise (a mixture of performance and security goals where the exact guarantees are less than obvious) is not ideal. This is why this issue was opened recently. What to do about it is under discussion.

Also note: this is not a professionally funded library, just a community project (without even donation funding). Possibly we should try to change that but doing so would need support from more than just a few individuals.

thomcc commented 11 months ago

I guess. I think the code should change to regenerate the buffer (or clear the counter). There'd be no regression for the normal use, and you'd fix issues like this. You would want to only do this in the post-fork child, rather than in all 3 atfork callbacks, in order to not pessimize process spawning though.

As it is, I don't really get why you'd bother with the fork handler.

dhardy commented 11 months ago

There'd be no regression for the normal use, and you'd fix issues like this.

Given that ThreadRng uses thread-local memory and there is no way to call into this, there would be a perf. regression: we'd need a to check for fork on every single call to thread_rng.

Perhaps with atomics it wouldn't be too bad. Perhaps having an entirely thread-local RNG is just the wrong design anyway, especially considering how large a cache we use to maximise throughput benchmarks.

As it is, I don't really get why you'd bother with the fork handler.

... yes. The buffer size increased over time, but still the only purpose is to make long-running processes eventually get independent RNGs.

dhardy commented 6 months ago

Removed in #1379