Closed newAM closed 5 days ago
I think it makes sense to add such a trait to rand_core, just to make interoperability easier.
There are some open questions about how this would interact with the existing traits. For instance: Can we implement Rng for all types that implement AsyncRng?
One other question is whether rand_core or rand should have any types that implement RngAsync. The only ones I can think of would be our various Rng adapters like:
- BlockRng, BlockRng64 (if the underlying R implements RngAsync)
- ReseedingRng (if the underlying Rsdr implements RngAsync)
It might actually make more sense to have an async version of BlockRngCore. This abstraction would seem to better match the underlying hardware. It would also discourage using the (presumably slow) hardware RNG directly, and instead incentivize use of it through a SeedableRng or ReseedingRng.
> This issue is really more of a question than a feature request: Is rand_core the appropriate place to add async RNG traits?
So, will there be direct interoperability between async and synchronous RNGs? If so this may make sense; if not a dedicated crate may be preferable(?).
> Can we implement Rng for all types that implement AsyncRng?
How? Technically, yes, by spinning until poll is ready, but futures are usually waited on by an executor. But there is no standard executor and this is not a good place to be opinionated.
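A minimal sketch of the "spinning until poll is ready" approach, using only the standard library. The names spin_on and ReadyAfter are hypothetical and introduced purely for illustration; nothing here is part of rand or rand_core. It shows why this is technically possible but unattractive: the CPU busy-waits instead of letting an executor run other tasks.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: we re-poll in a loop, so wake-ups are irrelevant.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: busy-polls `fut` to completion without an executor.
pub fn spin_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // Hint to the CPU that we are in a busy-wait loop.
        std::hint::spin_loop();
    }
}

/// Toy future (an assumption for the demo): pending for `polls_left` polls,
/// then yields `value`.
pub struct ReadyAfter {
    pub polls_left: u32,
    pub value: u32,
}

impl Future for ReadyAfter {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.polls_left == 0 {
            Poll::Ready(self.value)
        } else {
            self.polls_left -= 1;
            Poll::Pending
        }
    }
}
```

A synchronous Rng built on top of an async one would have to call something like spin_on inside every method, which hard-codes a (poor) executor choice into the library.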
The reverse, implementing AsyncRng for every RngCore, would be easy, and perhaps makes more sense: users of RngCore will block until a result is yielded; users of AsyncRng can use their executor for concurrency.
Note that if we do this, a type cannot directly support both async and sync usage. But if we don't, an adapter is required to use a sync RNG in an async function; this is probably fine, thus it may be better not to have any auto impl.
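To make the trade-off concrete, here is a rough sketch of such an auto impl. RngCore below is a local single-method stand-in for rand_core::RngCore, and AsyncRngCore, Counter and fill_now are hypothetical names; the poll-based method shape is only one of several possible designs.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local stand-in for `rand_core::RngCore`, reduced to a single method.
pub trait RngCore {
    fn fill_bytes(&mut self, dest: &mut [u8]);
}

/// Hypothetical poll-based async trait; not an existing rand_core item.
pub trait AsyncRngCore {
    fn poll_fill_bytes(&mut self, cx: &mut Context<'_>, dest: &mut [u8]) -> Poll<()>;
}

/// Blanket impl: every synchronous RNG is an async RNG that is always ready.
impl<R: RngCore + ?Sized> AsyncRngCore for R {
    fn poll_fill_bytes(&mut self, _cx: &mut Context<'_>, dest: &mut [u8]) -> Poll<()> {
        self.fill_bytes(dest);
        Poll::Ready(())
    }
}

/// Toy counter "RNG" used only to exercise the impl (not random at all).
pub struct Counter(pub u8);

impl RngCore for Counter {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = self.0;
            self.0 = self.0.wrapping_add(1);
        }
    }
}

// A direct `impl AsyncRngCore for Counter` would now be rejected as a
// conflicting implementation (E0119): one type cannot provide both a
// sync impl and its own, genuinely asynchronous impl.

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls once with a no-op waker; returns the filled buffer if ready.
pub fn fill_now<R: AsyncRngCore>(rng: &mut R) -> Option<[u8; 4]> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut buf = [0u8; 4];
    match rng.poll_fill_bytes(&mut cx, &mut buf) {
        Poll::Ready(()) => Some(buf),
        Poll::Pending => None,
    }
}
```

Without the blanket impl, the same behaviour is available through an explicit wrapper type, at the cost of one extra line at the call site.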
> It might actually make more sense to have an async version of BlockRngCore
Is your point that ReseedingRng could still implement sync behaviour by requesting a fresh seed in a future which is polled on each request for bytes, only doing the actual reseeding once poll returns Ready? That might work (perhaps with some limit before it blocks, for security reasons).
Or is it simply that derived RNGs might implement both RngCore and AsyncRngCore depending on what their underlying RNG implements? Sure.
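The first reading (a synchronous reseeding RNG that polls a seed future on each request and only reseeds once it is ready) could look roughly like the sketch below. Everything here is an assumption for illustration: LazyReseedingRng, SeedAfter and the max_stale limit are invented names, and SplitMix64 is a toy stand-in for a real CSPRNG. This is not how rand's ReseedingRng works.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy generator step (SplitMix64); a real design would use a CSPRNG.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical sync RNG that reseeds from a future without blocking for up
/// to `max_stale` outputs; after that it spins until the seed arrives.
pub struct LazyReseedingRng<F> {
    state: u64,
    pending: Option<Pin<Box<F>>>,
    stale: u32,
    max_stale: u32,
}

impl<F: Future<Output = u64>> LazyReseedingRng<F> {
    pub fn new(seed: u64, max_stale: u32) -> Self {
        Self { state: seed, pending: None, stale: 0, max_stale }
    }

    /// Registers a future that will eventually yield a fresh seed.
    pub fn request_reseed(&mut self, seed_future: F) {
        self.pending = Some(Box::pin(seed_future));
        self.stale = 0;
    }

    pub fn reseed_pending(&self) -> bool {
        self.pending.is_some()
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut fresh_seed = None;
        if let Some(fut) = self.pending.as_mut() {
            let waker = Waker::from(Arc::new(NoopWake));
            let mut cx = Context::from_waker(&waker);
            loop {
                match fut.as_mut().poll(&mut cx) {
                    Poll::Ready(seed) => {
                        fresh_seed = Some(seed);
                        break;
                    }
                    // Seed not ready: keep serving the old state, within the limit.
                    Poll::Pending if self.stale < self.max_stale => {
                        self.stale += 1;
                        break;
                    }
                    // Limit reached: block (spin) until the seed is available.
                    Poll::Pending => std::hint::spin_loop(),
                }
            }
        }
        if let Some(seed) = fresh_seed {
            self.state = seed;
            self.pending = None;
        }
        splitmix64(&mut self.state)
    }
}

/// Toy seed source: pending for `polls_left` polls, then yields `seed`.
pub struct SeedAfter {
    pub polls_left: u32,
    pub seed: u64,
}

impl Future for SeedAfter {
    type Output = u64;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u64> {
        if self.polls_left == 0 {
            Poll::Ready(self.seed)
        } else {
            self.polls_left -= 1;
            Poll::Pending
        }
    }
}
```

The limit is what keeps the generator from running indefinitely on an old seed; how large it should be is a security judgement this sketch does not attempt to make.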
Another question: should getrandom support async usage? If so we can have AsyncOsRng (or OsRngAsync). But this doesn't need to be answered now.
> > Can we implement Rng for all types that implement AsyncRng?
>
> The reverse, implementing AsyncRng for every RngCore, would be easy, and perhaps makes more sense: users of RngCore will block until a result is yielded; users of AsyncRng can use their executor for concurrency.
>
> Note that if we do this, a type cannot directly support both async and sync usage. But if we don't, an adapter is required to use a sync RNG in an async function; this is probably fine, thus it may be better not to have any auto impl.
This is a design decision, but personally I would keep these separate. The choice to impl an async trait vs a sync trait should be representative of what the underlying hardware/code is doing.
> It might actually make more sense to have an async version of BlockRngCore. This abstraction would seem to better match the underlying hardware.
This would be a good fit for the hardware RNG I am currently working with. Hopefully other embedded users can comment on what the ideal trait would be.
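As a starting point for that discussion, here is one possible shape for an async block trait. It mirrors the associated types of rand_core's BlockRngCore (Item and Results), but AsyncBlockRngCore, poll_generate, MockBlockHw and generate_block are all hypothetical names, and the mock peripheral produces counter values rather than entropy.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical async counterpart of `rand_core::block::BlockRngCore`.
pub trait AsyncBlockRngCore {
    type Item;
    type Results: AsRef<[Self::Item]> + AsMut<[Self::Item]> + Default;

    /// Fills `results` with one block, or returns `Pending` while busy.
    fn poll_generate(&mut self, cx: &mut Context<'_>, results: &mut Self::Results) -> Poll<()>;
}

/// Mock peripheral (not real STM32WL code): busy for `busy_polls` polls
/// before each `[u32; 4]` block becomes available.
pub struct MockBlockHw {
    pub busy_polls: u32,
    pub remaining: u32,
    pub counter: u32,
}

impl AsyncBlockRngCore for MockBlockHw {
    type Item = u32;
    type Results = [u32; 4];

    fn poll_generate(&mut self, cx: &mut Context<'_>, results: &mut Self::Results) -> Poll<()> {
        if self.remaining > 0 {
            self.remaining -= 1;
            // A real driver would be woken by the peripheral's interrupt instead.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        for word in results.iter_mut() {
            *word = self.counter; // placeholder data, not entropy
            self.counter = self.counter.wrapping_add(1);
        }
        self.remaining = self.busy_polls;
        Poll::Ready(())
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls until one block is ready; returns the block and the polls used.
pub fn generate_block<R: AsyncBlockRngCore>(rng: &mut R) -> (R::Results, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut results: R::Results = Default::default();
    let mut polls = 0;
    loop {
        polls += 1;
        if rng.poll_generate(&mut cx, &mut results).is_ready() {
            return (results, polls);
        }
    }
}
```

Working in whole blocks matches a peripheral that delivers four words at a time, and leaves buffering and byte-level access to a wrapper, as BlockRng does for the synchronous case.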
> It would also discourage using the (presumably slow) hardware RNG directly, and instead incentivize use of it through a SeedableRng or ReseedingRng.
I should explain the cycle counts a bit more; the question asked in the embedded-rust matrix chat was if async RNG traits make sense at all. Polling is faster than async if the time to switch context is longer than polling hardware for completion. Based on the numbers I have available I do think that there are valid use-cases for async RNG traits; but they are definitely more specialized.
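The break-even argument can be written down as a small calculation. The function names are hypothetical, and the number of context switches per await is an assumption (the issue description only gives a minimum cost of 62 cycles per switch, so real overhead may be higher).

```rust
/// Cycles left over for other tasks during one awaited hardware operation.
/// `switches` is an assumption: how many context switches one await costs.
pub fn cycles_freed(wait_cycles: u32, switch_cycles: u32, switches: u32) -> i64 {
    i64::from(wait_cycles) - i64::from(switch_cycles) * i64::from(switches)
}

/// Async only helps if the hardware wait outlasts the switching overhead.
pub fn async_pays_off(wait_cycles: u32, switch_cycles: u32, switches: u32) -> bool {
    cycles_freed(wait_cycles, switch_cycles, switches) > 0
}
```

With the figures from the issue description and an assumed two switches per await (one away, one back), a steady-state block frees 412 - 2 × 62 = 288 cycles, and a cold-boot block frees 2662 - 124 = 2538 cycles. A wait of 100 cycles would not cover the same overhead, so polling would win there.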
Sidetracking for a moment; the STM32WL hardware is quite fast as compared to software algorithms.
Source | Cycles per [u32; 4] |
---|---|
ChaCha20Rng | 2,875 |
ChaCha12Rng | 1,764 |
ChaCha8Rng | 1,216 |
STM32WL HW RNG | 412 |
That being said there are still valid reasons to use software random number generation when hardware acceleration is available:
So, my current understanding from the above:
- [u8] with some Result<(), Error> indicator
- BlockRngCore would be a useful batching system, but presumably with async output

I can add a fourth reason to the above list to prefer software RNGs: verifiability. Hardware RNGs are much harder to scrutinize than software RNGs, and we have seen failures in RDRAND.
To the main topic here: async RNGs. I have seen no interest in asynchronous random number distributions or shuffling/sampling, nor would this be simple, thus the topic appears to be limited to generation and perhaps caching.
This requires no interaction with the rest of rand, hence is best left to another crate.
It would be easy enough to adapt a synchronous RngCore RNG to operate through an asynchronous AsyncRng trait; this also does not need any new code within rand or rand_core (or rand_chacha etc.).
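A sketch of such an adapter, living entirely outside rand and rand_core. RngCore here is a local single-method stand-in for rand_core::RngCore; the AsyncRng trait shape (a GAT-based future type), SyncAdapter, Counter and poll_once are assumptions made for illustration.

```rust
use std::convert::Infallible;
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local stand-in for `rand_core::RngCore`, reduced to a single method.
pub trait RngCore {
    fn fill_bytes(&mut self, dest: &mut [u8]);
}

/// Hypothetical async trait; the returned future is a generic associated type.
pub trait AsyncRng {
    type Error;
    type FillFuture<'a>: Future<Output = Result<(), Self::Error>>
    where
        Self: 'a;

    fn fill_bytes<'a>(&'a mut self, dest: &'a mut [u8]) -> Self::FillFuture<'a>;
}

/// Adapter exposing any synchronous RNG through the async trait.
pub struct SyncAdapter<R>(pub R);

impl<R: RngCore> AsyncRng for SyncAdapter<R> {
    type Error = Infallible;
    type FillFuture<'a> = Ready<Result<(), Infallible>> where Self: 'a;

    fn fill_bytes<'a>(&'a mut self, dest: &'a mut [u8]) -> Self::FillFuture<'a> {
        // The sync RNG never has to wait, so the work is done eagerly here
        // and the returned future is already complete.
        self.0.fill_bytes(dest);
        ready(Ok(()))
    }
}

/// Toy counter "RNG" used only to exercise the adapter (not random at all).
pub struct Counter(pub u8);

impl RngCore for Counter {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = self.0;
            self.0 = self.0.wrapping_add(1);
        }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once with a no-op waker (stand-in for a real executor).
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Because the adapter is a separate type, the wrapped RNG keeps its synchronous interface, and a hardware RNG remains free to implement the async trait directly.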
Presumably the interest in having an async variant of rand_core is to then implement its trait for various hardware RNGs. This would fit better under the purview of an embedded project, though I guess we could add a rust-random/rand-async repo for this (assuming someone else offers to maintain this).
In the meantime, I'm closing this issue.
Background
The embedded-hal crate provides traits for embedded hardware. In #291 embedded-hal removed their RNG trait in favor of rand_core. embedded-hal is now working on adding async traits using GATs in #285.
This issue is really more of a question than a feature request: Is rand_core the appropriate place to add async RNG traits? If there are use-cases for async RNG traits outside of embedded rust I think it would be a good idea to include this in rand_core, otherwise it would probably be more appropriate for embedded-hal.
What is your motivation?
Developing async on embedded targets has been progressing nicely, and with GAT stabilization on the way it will soon be possible to write #![no_std] friendly async traits on rust stable.
HW RNG peripherals are fast, but can still benefit from async methods that allow the CPU to yield to other tasks while waiting for the hardware to generate entropy. On the STM32WL (ARM Cortex-M4), generating a [u32; 4] block of entropy takes ~2662 CPU cycles on a cold-boot, and ~412 cycles when a steady state is reached. An async context switch takes a minimum of 62 cycles using the embassy executor.
What type of application is this?
Hardware entropy generation for embedded development on #![no_std] targets.
Feature request
Add an async RNG trait.
This would look something like this (prior art from embassy-rs):
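A minimal sketch of a GAT-based trait in this spirit. This is not copied from embassy-rs or rand_core: the trait name AsyncRng, the FillFuture associated type, and the MockHwRng peripheral are all assumptions for illustration, and the mock writes counter values rather than entropy.

```rust
use std::convert::Infallible;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical async RNG trait; the future type is a generic associated type.
pub trait AsyncRng {
    type Error;
    type FillFuture<'a>: Future<Output = Result<(), Self::Error>>
    where
        Self: 'a;

    fn fill_bytes<'a>(&'a mut self, dest: &'a mut [u8]) -> Self::FillFuture<'a>;
}

/// Mock peripheral: reports "not ready" for a fixed number of polls per request.
pub struct MockHwRng {
    pub polls_until_ready: u32,
    pub next_byte: u8,
}

/// Future returned by `MockHwRng::fill_bytes`.
pub struct MockFill<'a> {
    rng: &'a mut MockHwRng,
    dest: &'a mut [u8],
    remaining: u32,
}

impl<'a> Future for MockFill<'a> {
    type Output = Result<(), Infallible>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut(); // fine: `MockFill` is `Unpin`
        if this.remaining > 0 {
            this.remaining -= 1;
            // A real driver would be woken by the peripheral's interrupt instead.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        for byte in this.dest.iter_mut() {
            *byte = this.rng.next_byte; // placeholder data, not entropy
            this.rng.next_byte = this.rng.next_byte.wrapping_add(1);
        }
        Poll::Ready(Ok(()))
    }
}

impl AsyncRng for MockHwRng {
    type Error = Infallible;
    type FillFuture<'a> = MockFill<'a> where Self: 'a;

    fn fill_bytes<'a>(&'a mut self, dest: &'a mut [u8]) -> Self::FillFuture<'a> {
        let remaining = self.polls_until_ready;
        MockFill { rng: self, dest, remaining }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once with a no-op waker (stand-in for a real executor).
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Each Pending result is a point where an executor could run another task while the hardware finishes, which is the benefit the motivation section describes.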