rust-random / getrandom

A small cross-platform library for retrieving random data from (operating) system source
Apache License 2.0

use_file: switch to futex on Linux and to `nanosleep` on other targets #490

Open newpavlov opened 2 weeks ago

newpavlov commented 2 weeks ago

All non-Linux targets only open the file in the critical section, so we can probably remove all synchronization and replace it with a sleep-based wait loop.

On Linux, the initializing thread may spend a lot of time in the critical section because of wait_until_rng_ready. On entropy-starved (and especially virtualized) systems it may spend tens of seconds there, so we need proper synchronization between the initializing thread and the waiting threads. Luckily, the futex API is a great fit for this case and allows us to handle the synchronization efficiently.
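For readers less familiar with the API, a minimal sketch of the two futex operations such synchronization can be built on (not necessarily the PR's actual code, and assuming the raw syscall is issued through the `libc` crate) looks roughly like this:

```rust
use core::sync::atomic::AtomicU32;

/// Block the calling thread while `*state` still holds `expected`
/// (the kernel re-checks the value atomically before sleeping).
fn futex_wait(state: &AtomicU32, expected: u32) {
    unsafe {
        libc::syscall(
            libc::SYS_futex,
            state as *const AtomicU32,
            libc::FUTEX_WAIT | libc::FUTEX_PRIVATE_FLAG,
            expected,
            core::ptr::null::<libc::timespec>(), // no timeout
        );
    }
}

/// Wake every thread currently blocked in `futex_wait` on `state`.
fn futex_wake_all(state: &AtomicU32) {
    unsafe {
        libc::syscall(
            libc::SYS_futex,
            state as *const AtomicU32,
            libc::FUTEX_WAKE | libc::FUTEX_PRIVATE_FLAG,
            libc::INT_MAX, // wake all waiters
        );
    }
}
```

The initializing thread does its work, stores a "ready" value into the shared state, and issues one wake; every other thread loads the state and, if it is still "pending", parks itself with the wait operation.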

josephlr commented 2 weeks ago

I don't think we should split how we do synchronization between the different targets. It adds a bunch of complexity for little gain. No contention occurs in the common case.

Also, see https://github.com/rust-random/getrandom/issues/453#issuecomment-2182476778. If we want something better than pthread_mutex_{lock/unlock}, we should either use std::sync::Mutex or just not use a lock at all (and use libc::poll to handle sleeping/waking the threads).

newpavlov commented 2 weeks ago

All other use_file users are relatively minor targets, so I think it makes sense to have a more specialized version of this code for Linux, especially considering the differences between Linux and the rest (futex availability, /dev/random polling, #[inline(never)] gating). Moving the Linux-specific code out makes use_file much cleaner, without any target-specific cfgs.

Not only does the std mutex add unnecessary global state, its implementation also has a bunch of machinery that is not needed in our case (e.g. poisoning). Splitting out the Linux code could also help with #401, if we decide to implement it.

IIUC, the implementation for the remaining targets uses a pthread mutex under the hood, so with this PR we can continue to use the pthread mutex there, since it was primarily problematic on Linux and not on other targets. Considering that non-Linux targets only open the file in the critical section, I think we can even remove locking completely and replace it with a simple sleep-based loop similar to what I drafted in #478, without any libc::poll complications.

newpavlov commented 2 weeks ago

Hm, if we are fine with a sleep-based loop for non-Linux targets, I guess we could eliminate the split by using a sleep-based implementation of the wait function.
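A minimal sketch of what such a sleep-based wait function could look like (illustrative names and interval, not the PR's actual API; it only assumes `nanosleep` is available through libc on the targets in question):

```rust
use core::sync::atomic::{AtomicU32, Ordering};

const PENDING: u32 = 0; // another thread is still initializing

/// Spin until `state` leaves the PENDING value, sleeping between checks
/// instead of taking any lock. The 10 ms interval is an arbitrary choice.
fn wait(state: &AtomicU32) -> u32 {
    loop {
        let val = state.load(Ordering::Acquire);
        if val != PENDING {
            return val;
        }
        let mut ts: libc::timespec = unsafe { core::mem::zeroed() };
        ts.tv_nsec = 10_000_000; // 10 ms
        unsafe { libc::nanosleep(&ts, core::ptr::null_mut()) };
    }
}
```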

josephlr commented 2 weeks ago

Not only does the std mutex add unnecessary global state, its implementation also has a bunch of machinery that is not needed in our case (e.g. poisoning).

The main benefit of std's Mutex is that it's well tested/reviewed and not a custom thing we need to maintain. It doesn't really matter that Mutex supports more functionality than we need; it's already implemented as part of Rust.

newpavlov commented 2 weeks ago

The use of futex wait/wake operations in this PR is very straightforward (it's essentially a textbook-level example), so I don't think the new code requires any particular maintenance.

I removed the code split and the dependency on the pthread mutex. What do you think about the new code?

josephlr commented 2 weeks ago

The futex wait/wake operations are very straightforward (they are essentially textbook-level examples), so I don't think they require any particular maintenance.

I removed the code split and the dependency on the pthread mutex. What do you think about the new code?

I think this increases the complexity of the implementation without a concrete benefit. This implementation is longer and harder for me to understand than the pthread-based code. An implementation using std's Mutex would have nearly identical performance characteristics and would be much shorter/simpler.

If you don't want to use std's Mutex, what would be your thoughts on the second proposal in https://github.com/rust-random/getrandom/issues/453#issuecomment-2182476778 to just not have any locking at all?

josephlr commented 2 weeks ago

Hm, if we are fine with a sleep-based loop for non-Linux targets, I guess we could eliminate the split by using a sleep-based implementation of the wait function.

I think it's reasonable to get rid of locking for initializing the file descriptor. It's not clear to me whether a sleep-based approach is better or worse than just opening the file (potentially multiple times) and then closing the file if the thread "loses" the race.
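A rough sketch of the lock-free "open, then close if you lose the race" idea (illustrative names only, assuming the shared descriptor lives in an AtomicI32 initialized to -1; error handling is elided):

```rust
use core::sync::atomic::{AtomicI32, Ordering};
use std::ffi::CStr;

static FD: AtomicI32 = AtomicI32::new(-1);

fn get_fd(path: &CStr) -> i32 {
    let fd = FD.load(Ordering::Acquire);
    if fd != -1 {
        return fd;
    }
    // Several threads may reach this point and open the file concurrently.
    let new_fd = unsafe { libc::open(path.as_ptr(), libc::O_RDONLY | libc::O_CLOEXEC) };
    assert!(new_fd >= 0, "error handling is elided in this sketch");
    // Publish our fd only if nobody beat us to it; otherwise close ours
    // and use the winner's descriptor.
    match FD.compare_exchange(-1, new_fd, Ordering::AcqRel, Ordering::Acquire) {
        Ok(_) => new_fd,
        Err(winner) => {
            unsafe { libc::close(new_fd) };
            winner
        }
    }
}
```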

newpavlov commented 2 weeks ago

If we set aside the sync module for now, are you fine with the rest of the code?

If you don't want to use std's Mutex, what would be your thoughts on the second proposal in https://github.com/rust-random/getrandom/issues/453#issuecomment-2182476778 to just not have any locking at all?

It's essentially what I did in this PR. On non-Linux targets there is no locking, but on Linux we need some kind of locking because wait_until_rng_ready may block for a significant amount of time (it's one of the reasons why I thought that the code split could be a good idea).

The libc::poll-based synchronization is IMO more convoluted and hacky than the futex-based code. LazyFd will result in unnecessary open operations, which may hit the file-descriptor limit (ulimit). In the worst-case scenario, every thread will open /dev/random to poll it, and once it's ready, they will all rush to open /dev/urandom.

josephlr commented 2 weeks ago

The libc::poll-based synchronization is IMO more convoluted and hacky than the futex-based code. LazyFd will result in unnecessary open operations, which may hit the file-descriptor limit (ulimit). In the worst-case scenario, every thread will open /dev/random to poll it, and once it's ready, they will all rush to open /dev/urandom.

I mean we would be using libc::poll regardless; that part would be unchanged. As for potentially opening multiple files, I don't think it would happen in practice. Even if a bunch of threads are polling /dev/random, they will all be polling the same fd (in my proposal). The only way multiple fds get opened is if two threads race on libc::open itself, and even in that scenario the extra fd gets closed immediately.
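For reference, the polling step itself is small. A hedged sketch of blocking on an already-open /dev/random descriptor until the entropy pool is initialized (roughly what wait_until_rng_ready has to do; the function name and error handling here are illustrative) could look like:

```rust
/// Block until `random_fd` (an open /dev/random descriptor) becomes readable,
/// i.e. until the kernel's entropy pool is initialized. Retries on EINTR.
fn poll_until_ready(random_fd: libc::c_int) -> std::io::Result<()> {
    let mut pfd = libc::pollfd {
        fd: random_fd,
        events: libc::POLLIN,
        revents: 0,
    };
    loop {
        // A negative timeout means "block indefinitely".
        if unsafe { libc::poll(&mut pfd, 1, -1) } == 1 {
            return Ok(());
        }
        let err = std::io::Error::last_os_error();
        if err.raw_os_error() != Some(libc::EINTR) {
            return Err(err);
        }
    }
}
```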

josephlr commented 2 weeks ago

it's one of the reasons why I thought that the code split could be a good idea

Looking at this, I actually think that splitting the code is fine, as we can share common abstractions (if any) between use_file.rs and linux_with_fallback.rs. My main concern is with how the locking/sleeping/waiting is implemented.

josephlr commented 2 weeks ago

If we set aside the sync module for now, are you fine with the rest of the code?

I would say the contents/complexity of the sync module are the main reason I don't prefer this approach. I agree that if we were just opening shared FDs, we could solve this in multiple nice ways without using std.

newpavlov commented 2 weeks ago

I mean we would be using libc::poll regardless; that part would be unchanged.

Ah, I somewhat misunderstood your idea. I guess it could work, but I think it will be more code and (to my taste) less straightforward. Plus, as you note yourself, this approach will always result in two file descriptors being kept open instead of one.

we can share common abstractions (if any) between use_file.rs and linux_with_fallback.rs

The amount of common abstraction depends on whether we use a pthread (or std) mutex or the sleep-based loop for non-Linux targets. In the former case it is small enough to warrant keeping the modules completely independent, but in the latter case I think it makes more sense to keep the code in one module, as is done in the current version.

I would say the contents/complexity of the sync module are the main reason I don't prefer this approach.

I guess the perceived complexity depends on how familiar the reader is with futexes. Maybe I can address this by adding comments explaining exactly what wake and wait do?

josephlr commented 1 week ago

I guess the perceived complexity depends on how familiar the reader is with futexes. Maybe I can address this by adding comments explaining exactly what wake and wait do?

Looking at the man page for futex I found this:

To reiterate, bare futexes are not intended as an easy-to-use abstraction for end-users. (There is no wrapper function for this system call in glibc.)

This makes me think that futexes (while very cool) are not really designed for general use, but rather as a way to allow things like std to have fast mutexes.

newpavlov commented 1 week ago

This makes me think that futexes (while very cool) are not really designed for general use, but rather as a way to allow things like std to have fast mutexes.

The futex is certainly a very low-level API (with some esoteric corners) which is usually used as a fundamental building block for higher-level synchronization structures. But it does get used directly to implement custom synchronization primitives in production code (e.g. some people use it to implement an io-uring-like IPC API on top of shared memory). IMO it's an extremely good fit for our problem.