rust-random / getrandom

A small cross-platform library for retrieving random data from (operating) system source
Apache License 2.0
264 stars 166 forks source link

use_file: replace mutex with `nanosleep`-based loop #478

Closed newpavlov closed 2 weeks ago

newpavlov commented 2 weeks ago

Replaces libc::pthread_mutex_t with libc::nanosleep-based wait loop with bounded exponential backoff. This approach is similar to what we use in the VxWorks backend.

This eliminates the problematic dependency on pthread mutex, reduces amount of unsafe and makes code slightly more straightforward. It also resolves concerns about atomic ordering from #469 without hurting the optimistic path.

The main disadvantage of this approach is somewhat higher latency in pathological cases (e.g. spawning thousands of threads which simultaneously call getrandom). This especially could be noticeable on Linux booted without sufficient entropy, i.e. when program spends a significant amount of time in wait_until_rng_ready. But such cases should be extremely rare in practice.

On Linux we could use the futex syscall to implement a simple quasi-mutex based on the FD atomic.

newpavlov commented 2 weeks ago

cc @briansmith

briansmith commented 2 weeks ago

Replaces libc::pthread_mutex_t with libc::nanosleep-based wait loop with bounded exponential backoff. This approach is similar to what we use in the VxWorks backend.

Didn't getrandom replace its previous use of spin locks because of the blog post that pointed out issues with spin locks in getrandom, in PR #125?

My hope is that we can figure out an efficient solution that doesn't require us to implement our own synchronization primitives, or at least limit them to very basic atomics.

As I mentioned elsewhere, I filed https://github.com/rust-lang/rust/issues/126239 so that the language team can clarify the Mutexes semantics to make it clearer that the way we use the libstd Mutex is correct.

newpavlov commented 2 weeks ago

IIRC the main problem with the old spinlock implementation was priority inversion in the polling loop.

On Linux futex-based synchronization should be relatively easy to implement. The problem is with the other targets, which do not have futex-like APIs.

newpavlov commented 2 weeks ago

I will close this PR and will try a different approach based futex on Linux/Android and std::sync::Mutex on other targets in a different PR later.