Tracker: proposed RngCore changes

dhardy commented 2 years ago

I think it may be useful to have fill methods for u32/u64. In some (AFAIK relatively rare) cases they may result in a significant acceleration, especially for PRNGs like ChaCha which can use SIMD to produce several blocks in parallel. Also they should result in less memory stores if a PRNG state can fit into registers. Those methods could replace the next_* methods, by marking the fill methods #[inline] the &mut [T; 1] case should be trivially optimizable by compiler. And it also should be possible to introduce them gradually (introduce them implemented in terms of next_* first, then mark next_* methods deprecated, and eventually remove them in the next breaking release), limiting the migration churn.

It may be worth trying this.

dhardy commented 2 years ago

Further to the topic of RngCore changes:

1147: bit-level sampling (closed)
1080: fill uninitialised arrays (closed; possible revisit given good evidence but unlikely)
1031

dhardy commented 1 year ago

@newpavlov proposed using generic_const_exprs to let the RNG generate an array like [Self::Item; Self::ITEMS].

newpavlov commented 1 year ago

This is response to https://github.com/rust-random/rand/pull/1273#issuecomment-1341261828.

In the long term, do we even keep BlockRng?

I think it's worth to keep BlockRngCore since it allows to remove boilerplate for almost all PRNG implementations. But I think it's worth to somehow merge BlockRng and BlockRng64 into one wrapper type. I will try to play with the wrappers in a future PR.

The current design of BlockRngCore (and the proposed generic_const_exprs-based version) is not ideal. One issue for which I don't have a good solution is variability of block size dependent on runtime target feature detection. For example, ChaCha has different optimal block sizes dependent on whether hardware has SIMD extensions. Right now we simply use the largest supported block size, which is a bit suboptimal.

One issue with your proposed ByteRng is that it handles alignment poorly, which may reduce performance of PRNG implementations. Also, what do you plan to use for LEN in ByteRng impl for OsRng?

dhardy commented 1 year ago

ByteRng is basically a variant of [BlockRngCore]. Possibly not better. But I don't think there's an issue with alignment when the array is a return value. Your suggestion is probably a better variation on this.

OsRng

Right, this is neither a block-RNG nor a word (u32 / u64) generator. We might want to keep RngCore::try_fill_bytes as the primary implementation method of OsRng. Though there have been calls for something supporting uninitialized memory (but as said, not really required here).

dhardy commented 1 year ago

Given RFC 1672 we could do something like this:

pub trait WordRng {
    type Word;
    fn next(&mut self) -> Self::Word;
}

impl<W: WordRng<Word = u32>> RngCore for W { ... }

impl<W: WordRng<Word = u64>> RngCore for W { ... }

// keep trait BlockRngCore and
// impl<R: BlockRngCore<Item = u32>> RngCore for BlockRng<R> { ... }

Drawback: it still doesn't let algorithms depend on the RNG word size.

Variant: include fn fill_bytes in WordRng and impl<R: BlockRngCore<..>> WordRng for BlockRng<R> { .. }. Then algorithms can depend on RNG word size, at the cost of not supporting type-erased RNGs (required anyway if the algorithm variant is selected at compile-time).

dhardy commented 1 year ago

Related, but I think the traits need to stay separate: #1285 reasons that RNGs with hidden/platform-dependent internals like SmallRng should not have implementation-dependent details like SeedableRng::Seed and from_seed.

My proposed redesign moves Seed and from_seed to a new trait, FromSeed, which is not implemented by SmallRng or StdRng.

rust-random / rand

Tracker: proposed RngCore changes #1261

1147: bit-level sampling (closed)

1080: fill uninitialised arrays (closed; possible revisit given good evidence but unlikely)

1031