Open vks opened 6 years ago
Your point is that it's possible to produce very fast cryptographic generators for bulk output by using CPU-specific features? Cool. But I don't much like comparing apples to pineapples.
I suggest this would make a good external crate, but not be such a good inclusion with Rand itself since:
ReseedingRng
going to be sufficient?To be fair, we should probably consider SIMD-optimised generators at some point, though I don't know whether it would be worth making them the default generators (for many applications where randomness is consumed in small quantities the performance enhancements may not be significant).
To be fair, we should probably consider SIMD-optimised generators at some point, though I don't know whether it would be worth making them the default generators (for many applications where randomness is consumed in small quantities the performance enhancements may not be significant).
Could always pick a SIMD specialisation on when user calls a fill_buffer with a larger buffer, and otherwise use the non-SIMD implementation with less throughput but possibly better latency characteristics.
Yes, but SIMD probably makes more sense with buffer-backed RNGs like the current cryptographic ones (generate a block at once).
@dhardy
The speed comes from using SIMD instructions to produce large blocks of data. Many users of RNGs only want small amounts of random data at a time, so this is not a good general purpose RNG.
Why? Our current StdRng
is buffered as well, via BlockRng
. If I use a similar buffer (128 bytes), I get slightly worse performance than StdRng
on master (5.1 ns instead of 3.55 ns per u64
). Initialization is a lot cheaper (31 ns instead of 7.6 us) and the key is erased every 128 bytes.
Fast key erasure is likely mainly of interest to specific users with very high security requirements? It appears a "sensible precaution" but one with high cost and little expected benefit;
It gives forward secrecy. This is a nice property to have, because it means that i.e. compromising a server does not compromise previously generated secrets (if they are not stored in some other form). In that regard it is similar to ReseedingRng
, but more reliable. I'm not sure what you mean with high cost? The total cost is comparable to StdRng
. Removing key erasure improves next_u64
performance by about 40 % (3.08 ns, making it faster than StdRng
).
CPU-specific instructions mean your code only works on modern x86_64 CPUs; Rand is designed to be portable.
Rand can still be portable by falling back to something else on other platforms.
Low-level instructions are hard to review, so it would be better in its own crate so that individual projects can choose whether or not to use it IMO.
That is a fair point, it contains a bit of unsafe code. I hope this can be improved by using a safe abstraction like faster
and maybe aesni
.
@vks Nice project!
I tried to read up a bit on the design of this RNG, but I am not there yet :smile:. Is my understanding right?
The basic idea is that common RNGs for cryptography keep some of their generated values in memory, and this memory can be leaked (swapping to disk, inspecting process memory, etc.). This RNG by design has to overwrite a generated value when making the next one, so it is secure by default. It is very fast because it uses a hardware implementation of AES, and because it does not follow the clunky implementation requirements of NIST.
I sure think it is neat. It may be good to wait a while before really using it to give it time to prove itself. And also good to take lessons from. One thing I had planned for the BlockRng
wrapper was to erase used values.
But I see no reason why this RNG has to be in Rand, at least not right now. We are moving towards no, or as few as possible, RNG implementations, not many.
This is certainly not something for the near future (the algorithm, the implementation and SIMD in Rust are quite immature). I was just wondering whether it might be a better default in the long run, and the point of this issue is to discuss the pros and cons.
Is my understanding right?
Yes, I think it is.
Renamed the issue from "Faster RNG for filling buffers" to "Key erasure AES variant", as this is more about a new CSPRNG than about filling buffers.
This article offers an improvement over using AES in counter mode, in that it offers backtracking resistance at lower cost or a better API.
If at some point the internal state of the common variant is exposed, all previously generated values can be recovered. The NIST document has some requirements that the common variant should overwrite its state when the user is 'done with it', but that is not easy for an API.
This variant will, after generating a buffer with a couple of blocks of values, also use one block to generate a new internal state (or seed). This overwrites the old key, making it impossible to recover previous rounds.
Another fast backtracking-resistant RNG: https://github.com/google/randen
Note that Randen provides a Rust implementation in the official repository.
The Randen paper has been published: https://arxiv.org/abs/1810.02227
We shouldn't forget about Randen as a possible alternative to ChaCha; it is fast, fairly simple code and has backtracking resistance. Right now I see two issues:
This would provide an alternative to #872, for some platforms.
I ported an implementation of a fast-key-erasure RNG suggested by djb to Rust: https://github.com/vks/aesrng
It is about two times faster than non-SIMD xoroshiro when generating 100 MiB of random data. Only
fill_bytes
is implemented for now.Once
stdsimd
is stabilized, this might be of interest for Rand.