Closed pitdicker closed 5 years ago
Why not have one crate per RNG?
On Sun, May 6, 2018, 15:40 Paul Dicker notifications@github.com wrote:
As decided in dhardy#58 https://github.com/dhardy/rand/issues/58.
The idea is to only keep the StdRng and SmallRng wrappers in Rand, and to not expose PRNG algorithms directly.
The field of PRNG algorithms is not static, so it seems like a good idea to cut Rand partly loose from the developments to improve our backward-compatibility and usability story.
Advantages (from the original issue)
- rand provides easy-to-reason-about generator(s) to satisfy the bulk of the users
- rand provides low-friction upgrades and keeps the optimal-ish algorithms readily available
- rand will not accumulate and/or deprecate specific algorithms throughout time
- Specific rand Rng names won't bleed out to the world, like in forum posts, stackoverflow, etc..
Instead we want to provide an official companion crate, rand_rngs, with a couple of blessed implementations.
Whether Rand should depend on rand_rngs or copy the two algorithms used by StdRng and SmallRng is not fully decided. It might reduce code size if a crate depends on both StdRng and its algorithm (currently Hc128Rng) directly, but that seems rare and not worth much. It has the disadvantage that both implementations should remain in sync. Maybe the CI can check this?
If Rand depends on rand_rngs, it would restrict rand_rngs to only depend on rand_core. But that is what rand_core is for. I am for having Rand depend on rand_rngs.
rand_rngs
rand_rngs should provide a good selection of PRNGs.
Including multiple PRNGs in one crate has the advantage of having a single place to provide guidance, as it helps comparing different algorithms with each other. Also it gives one place to offer consistent benchmarks, to run PRNG test suites, to keep up a common quality level, and possibly to develop functionality that is useful for more than one PRNG (e.g. jumping).
Which PRNGs to include has been the topic of endless discussions. A few are basically decided. I fully expect the number of PRNGs to grow over time. But we also should be careful not to expose too many, as there are hundreds. Every PRNG should have one thing that gives it a clear advantage over others, otherwise it is not worth the inclusion.
Normal PRNGs
dhardy#60 https://github.com/dhardy/rand/issues/60 explored normal PRNGs. The following 5 I feel comfortable about for an initial version of rand_rngs:
- PCG-XSH-RR 64/32 (LCG)
- PCG-XSL-RR 128/64 (MCG)
- SFC (32-bit variant)
- SFC (64-bit variant)
- Xoroshiro128+
The two PCG variants offer good quality and reasonable performance. SFC provides high performance and a chaotic PRNG (not a fixed period). Xoroshiro128+ has acceptable quality but high performance.
A PRNG put together by me, XoroshiroMT, is also good quality and sits between PCG and Xoroshiro128+ in terms of performance. But now that there is a new Xoshiro PRNG, it may be worth evaluating that one first, as it is said to also be good quality.
CSPRNGs
For CSPRNGs we currently have two good implementations in Rand
- ChaCha20
- HC-128
ChaCha20 offers reasonably good performance and uses little memory, while HC-128 brings high performance at the cost of using much memory.
I would also like to see some implementation of AES-CTR DRBG eventually, as it is commonly used, and has hardware support on modern desktop processors.
Other / deprecated PRNGs
We currently have the ISAAC, ISAAC-64 and Xorshift128/32 PRNGs in rand. They have no real advantages over the PRNGs mentioned before. It is better to use HC-128 instead of ISAAC, and Xoroshiro128+ or PCG instead of Xorshift. We can include them in something like a deprecated module, but I propose to move them to stand-alone crates outside the Rand repository.
Steps to take
- move the prngs module to a separate rand_rngs crate in the Rand repository, similar to rand_core.
- move generators benchmarks over.
- lift PRNG implementations from https://github.com/pitdicker/small-rngs (I have updated it to master a while ago, but still have to push the changes)
- lift Xoroshiro128+ from https://github.com/vks/xoroshiro/
Does this sound about right?
See the first post and the linked issue.
I actually see no reason that we shouldn't have one crate per RNG (or perhaps family of RNGs), but I think it's worth having these in the Rand repo and the crates owned by the Rand team (currently me and the libs team). Or we could try organising more and have a rust-rand project and set of repositories on GH maybe.
Both the new xoshiro and xoroshiro** variants I think are worth consideration.
Vigna claims xoroshiro256** has a higher degree polynomial than PCG 128 variants https://www.reddit.com/r/programming/comments/8gx2d3/new_line_of_fast_prngs_released_by_the_author_of/dygkkng/, so xoshiro variants are probably even stronger.
Edit: poor wording
Is the separate repository merely a transitional detail? It improves code size if any code that gets used actually gets exposed somewhere, even if only in another crate. It would improve tests if all this lives in the rand repository, right?
@TheIronBorn this issue isn't about what algorithms we include, it's just about organisation and presentation.
@burdges I'm not sure what you mean about transitional; no. The easiest option is I think just to have sub-projects in this repo (like rand_core).
I'm not even sure if it does help with testing having everything in the same repo; in one way it makes it worse since CI must run all unit-tests on any change (although with API-breaking changes this may be necessary).
Once we have stable core APIs it may make more sense to use separate repos since most changes (docs, new features, tweaks to other features) will not affect other crates. I.e. once rand_core is stable then changes to things like the distribution code will have no effect on the RNG implementations, so separate repos do make some sense.
In this issue and the previous one I have given arguments that it would really benefit users to have one crate, as it helps comparing and contrasting them. And real but smaller benefits for us. Are those reasons wrong?
Maybe it is good to also collect the reasons multiple crates can be a good idea.
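Two related thoughts:
- most non-cryptographic PRNGs only need 5-10 lines for the algorithm. About 30 additional lines to implement the necessary traits. Then a handful of tests, documentation, license header, and benchmarks. Seems a bit small for a crate to me, but that is not unique.
- how do we show separate crates are similar enough to compare directly (benchmarks, quality, support)? Anyone is free to create a rand_* crate.
CI must run all unit-tests on any change.
The unit tests we have now for PRNGs run in a fraction of a second, easily 100 times less than the set-up time for Travis.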
There is precedent for having one crate per algorithm in RustCrypto. You can still have a crate reexporting them all, but I'm not sure how useful that is. It might make sense to have some comparison in Rand's documentation, where we advertise the recommended RNGs and their crates.
On Mon, May 7, 2018, 08:32 Paul Dicker notifications@github.com wrote:
In this issue and the previous one I have given arguments that it would really benefit users to have one crate, as it helps comparing and contrasting them. And real but smaller benefits for us. Are those reasons wrong?
Maybe it is good to also collect the reasons multiple crates can be a good idea.
Two related thoughts:
- most non-cryptographic PRNGs only need 5-10 lines for the algorithm. About 30 additional lines to implement the necessary traits. Then a handful of tests, documentation, license header, and benchmarks. Seems a bit small for a crate to me, but that is not unique.
- how do we show separate crates are similar enough to compare directly (benchmarks, quality, support)? Anyone is free to create a rand_* crate.
CI must run all unit-tests on any change.
The unit tests we have now for PRNGs run in a fraction of a second, easily a 100 times less than the set-up time for Travis.
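As a rough illustration of the size estimate above (a handful of lines for the algorithm, a few dozen for the trait plumbing), here is a sketch of xoroshiro128+ using the 2018 rotation constants (24, 16, 37). `SmallCore` is a hypothetical stand-in trait invented for this example so that it compiles without dependencies; a real crate would implement the `rand_core` traits instead, and the all-zero fallback state is an arbitrary choice.

```rust
/// Hypothetical stand-in for the `rand_core` traits (an assumption made so
/// that this sketch has no external dependencies).
pub trait SmallCore {
    fn next_u64(&mut self) -> u64;

    /// Default 32-bit output: use the high bits, since the lowest bits of
    /// xoroshiro128+ are its weakest.
    fn next_u32(&mut self) -> u32 {
        (self.next_u64() >> 32) as u32
    }
}

/// xoroshiro128+ state: two 64-bit words that must not both be zero.
#[derive(Clone, Debug, PartialEq)]
pub struct Xoroshiro128Plus {
    s0: u64,
    s1: u64,
}

impl Xoroshiro128Plus {
    /// Build a generator from raw state. An all-zero state would be a fixed
    /// point, so it is replaced by an arbitrary non-zero one (sketch only).
    pub fn from_state(s0: u64, s1: u64) -> Self {
        if s0 == 0 && s1 == 0 {
            Xoroshiro128Plus { s0: 1, s1: 2 }
        } else {
            Xoroshiro128Plus { s0, s1 }
        }
    }
}

impl SmallCore for Xoroshiro128Plus {
    // The whole algorithm: one addition, three xors, two rotates, one shift.
    fn next_u64(&mut self) -> u64 {
        let s0 = self.s0;
        let mut s1 = self.s1;
        let result = s0.wrapping_add(s1);
        s1 ^= s0;
        self.s0 = s0.rotate_left(24) ^ s1 ^ (s1 << 16);
        self.s1 = s1.rotate_left(37);
        result
    }
}
```

The algorithm itself is the seven-line body of `next_u64`; everything else is the boilerplate being discussed.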
So sounds like
- […] rand_core is)
- […] prng module should move
- rand depend on whichever crates are necessary for the StdRng and SmallRng algorithms
How many crates do we want? Only 1-3 (e.g. crypto, non-crypto and deprecated)? One crate per family of RNGs? One crate per RNG?
@newpavlov do you have anything to say about the many-small-crates design of RustCrypto's hashes (and other projects)?
In my opinion small crates work pretty well (but not too small, i.e. crates for a family of algorithms, not for each variation). Yes, there is some headache with coordinated upgrades to the new trait version, but if rand_core will be relatively stable, I don't think it will be a big problem.
having many crates may make comprehensive testing and benchmarking painful
cargo test --all and cargo bench --all more or less solve this problem. There is an issue with features when you work with a virtual manifest, but I don't think that RNG crates will suffer from it.
each crate has extra boilerplate (cargo.toml, readme, possibly tests and benchmarks)
RustCrypto partially solves this problem by having a feature-gated dev module in trait crates, which defines functions and macros to make it easier to write tests and benchmarks. Though I am not sure if RNG crates should follow this approach.
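I would like to suggest moving the CryptoRng crates to the RustCrypto/CSRNGs repo (of course with full access granted to @dhardy and the lang team). Pros:
- CSPRNGs are usually tightly coupled with stream ciphers, so they could share underlying implementations, which will make it easier to work on optimizations.
- I plan to have other CSRNGs in addition to the algorithms from the rand crate, e.g. rdrand (@nagisa gave his consent) and aesrng developed by @vks.
The con is of course the split of RNG crates between two organizations.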
Actually we already stopped using --all for most tests because of this issue. It's a shame it's not been solved.
It'd be nice to save code size by using stream cipher code for CSPRNGs of course, but not that pressing. At first blush, I kinda think moving CSPRNG crates into another project, especially one less blessed by Mozilla, sounds premature and distracting. At minimum, all the BlockRng business will make reading the code annoying.
With our current PRNGs and currently-proposed additions, a crate-per-family approach looks something like:
- rand_rng_xorshift
- rand_rng_xoroshiro (or could be with xorshift; also can include xoshiro)
- rand_rng_pcg
- rand_rng_sfc
- rand_rng_isaac
- rand_rng_chacha
- rand_rng_hc (could potentially also house HC-256, if we have any use for it)
Let's not worry about the generators for now (there will probably be more); this is several crates already, and some of them will be very small (pcg unless we add more algorithms, sfc, xorshift unless combined with xoroshiro; also hc and chacha are only 500 lines).
There is nothing inherently wrong with this, but it is a significant amount of extra boilerplate just to save pulling unused PRNGs into a project (the entire rand::prng module is only 2600 lines, albeit because we have avoided expanding this much until now).
Are there any alternatives worth discussing?
I would probably use shorter names and drop the rng_, but other than that I think one crate per family is reasonable. I think this is nicer for other crates, because they can depend on the specific algorithm without having to worry whether it is currently used by Rand or not, and we can keep Rand minimal at the same time.
Just to be sure, the only real argument for having multiple crates is that at some point it becomes 'cleaner' for us to deprecate an RNG?
And it adds the problem of boilerplate, making the PRNGs harder to compare, more difficult to guide users, and anyone is free to make a crate with a seemingly official name.
I would like to suggest moving the CryptoRng crates to the RustCrypto/CSRNGs repo (of course with full access granted to @dhardy and the lang team). Pros:
- CSPRNGs are usually tightly coupled with stream ciphers, so they could share underlying implementations, which will make it easier to work on optimizations.
- I plan to have other CSRNGs in addition to the algorithms from the rand crate, e.g. rdrand (@nagisa gave his consent) and aesrng developed by @vks.
The con is of course the split of RNG crates between two organizations.
I am afraid this would only cause more work, not less. While algorithms may be (mostly) the same, RustCrypto and Rand can have different goals. What happens when one decides some algorithm is fit for inclusion, but the other does not? Or that the trade-off for performance vs security should be different? We would have to have a very clear sense of direction and an idea of what both projects see in such a crate.
Now I am not saying we can't work together, and work together well. But at this point I think it is better for us if we keep this only in mind as an option, and rethink it next year or something like that. Rand has changed a lot in the past couple of months, and it seems like a good idea to me to remain flexible and get our story straight first. Especially the amount of security features to provide is something we have quite a few open issues for, and seems a bit fuzzy.
What happens when one decides some algorithm is fit for inclusion, but the other does not.
If it's an officially published algorithm (i.e. not home-brew crypto) designed as a CSRNG, I will be happy to see it in the repo, so RustCrypto/CSRNGs is intended to be more inclusive than rand.
Or that the trade-off for performance vs security should be different?
Ehm, for CryptoRng implementations the priority should be security, otherwise it should not implement this trait.
Ehm, for CryptoRng implementations the priority should be security, otherwise it should not implement this trait.
:smile: Maybe I should write a bit clearer about which trade-offs I had in mind. Nothing terrible I hope.
As an example, the BlockRng wrapper (used by ChaChaRng and Hc128Rng) could choose to zero out a value after it was read from the buffer. That brings down the performance of the RNG by 10-20%. It would make it more difficult for someone peeking at the memory to recover previous values. But in my opinion, when someone can inspect the memory of a process, you have already lost. So we currently don't zero after reading a random number from a block of buffered results.
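To make the zero-after-read trade-off concrete, here is a minimal sketch of a buffered generator with the wiping made optional. `BufferedRng`, `refill` and `consumed_words_in_memory` are hypothetical names, not Rand's `BlockRng` API, and the block source is a toy counter standing in for ChaCha or HC-128.

```rust
/// Toy block source: fills the buffer from a counter. NOT a real cipher; it
/// only stands in for ChaCha or HC-128 in this sketch.
fn refill(counter: &mut u32, buf: &mut [u32; 16]) {
    for slot in buf.iter_mut() {
        *counter = counter.wrapping_add(1);
        *slot = counter.wrapping_mul(0x9E37_79B9);
    }
}

/// Hypothetical buffered wrapper, loosely modelled on the idea of a block RNG.
pub struct BufferedRng {
    counter: u32,
    buf: [u32; 16],
    index: usize,
    zero_on_read: bool,
}

impl BufferedRng {
    pub fn new(zero_on_read: bool) -> Self {
        // index == buf.len() marks the buffer as exhausted, so the first call
        // to next_u32 triggers a refill.
        BufferedRng { counter: 0, buf: [0; 16], index: 16, zero_on_read }
    }

    pub fn next_u32(&mut self) -> u32 {
        if self.index >= self.buf.len() {
            refill(&mut self.counter, &mut self.buf);
            self.index = 0;
        }
        let value = self.buf[self.index];
        if self.zero_on_read {
            // One extra store per output: this is the cost being weighed.
            self.buf[self.index] = 0;
        }
        self.index += 1;
        value
    }

    /// Number of already-returned words that are still readable in the buffer.
    pub fn consumed_words_in_memory(&self) -> usize {
        self.buf[..self.index.min(16)].iter().filter(|w| **w != 0).count()
    }
}
```

Both modes return the same sequence; the only observable difference is what remains in the buffer after values have been handed out.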
Another one I have in mind is fork protection. I hope we can add that to the ReseedingRng wrapper soon. It means checking a global static and reseeding if it has changed. Should that happen on every iteration, or only when a new block of values has to be generated? Again that is the choice between 1-2% overhead, or 10-20%. I would say having something like 15 duplicated values after a fork (which is not supposed to happen) is an acceptable trade-off, especially because it means users have no good reason to turn off the protection.
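The per-block variant of that check could look roughly like the sketch below. All names (`FORK_COUNTER`, `note_fork`, `ForkAwareRng`) are hypothetical, the "reseed" is a deterministic placeholder rather than fresh OS entropy, and the block generator is a toy; a real implementation would bump the counter from a fork handler.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical global counter. A real implementation would increment it from
/// a fork handler; in this sketch callers do so by hand via `note_fork`.
pub static FORK_COUNTER: AtomicUsize = AtomicUsize::new(0);

pub fn note_fork() {
    FORK_COUNTER.fetch_add(1, Ordering::Relaxed);
}

pub struct ForkAwareRng {
    seed: u64,
    counter: u64,
    buf: [u64; 8],
    index: usize,
    seen_forks: usize,
    /// Number of reseeds so far, exposed only to make the behaviour visible.
    pub reseeds: usize,
}

impl ForkAwareRng {
    pub fn new(seed: u64) -> Self {
        ForkAwareRng {
            seed,
            counter: 0,
            buf: [0; 8],
            index: 8, // exhausted, so the first call generates a block
            seen_forks: FORK_COUNTER.load(Ordering::Relaxed),
            reseeds: 0,
        }
    }

    /// Placeholder reseed: real code would pull fresh entropy from the OS.
    fn reseed(&mut self) {
        self.seed = self
            .seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.counter = 0;
        self.reseeds += 1;
    }

    fn generate_block(&mut self) {
        // The fork check happens here, once per block, not once per output.
        let forks = FORK_COUNTER.load(Ordering::Relaxed);
        if forks != self.seen_forks {
            self.seen_forks = forks;
            self.reseed();
        }
        for slot in self.buf.iter_mut() {
            self.counter += 1;
            // Toy mixing function, not a real CSPRNG.
            *slot = (self.seed ^ self.counter).wrapping_mul(0x9E37_79B9_7F4A_7C15);
        }
        self.index = 0;
    }

    pub fn next_u64(&mut self) -> u64 {
        if self.index >= self.buf.len() {
            self.generate_block();
        }
        let value = self.buf[self.index];
        self.index += 1;
        value
    }
}
```

With an 8-word buffer, up to 7 already-buffered values can still be served after a fork before the check fires; that window is the cost accepted in exchange for the lower overhead.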
Zeroing ChaChaRng output does nothing. If you can hammer the offset then you can hammer the block number, well they even live in the same struct.
Just to be sure, the only real argument for having multiple crates is that at some point it becomes 'cleaner' for us to deprecate an RNG?
What's the alternative — maintain a crate that may grow to a hundred or more PRNGs, or be extremely selective about which we include? Perhaps we won't include enough PRNGs for size to be an issue, but in that case what's the motivation for using an external crate in the first place?
The only other reason I can see for separate crates is to allow an app to use rand_core + some PRNG without importing much extra code, i.e. to minimise what needs to be reviewed.
What's the alternative — maintain a crate that may grow to a hundred or more PRNGs, or be extremely selective about which we include?
PractRand contains 200+ RNGs, and that is already a selection. In my opinion having that many is not useful. On the contrary, too many choices distract and complicate, so having that many would be bad.
What I had in mind (with the first post, and the relevant issues in your repo), was to have a small, well chosen selection. Not all that different from how we work now. Users that want more than StdRng or SmallRng should just be able to go to one crate and have a clear overview of good choices. With a selection of PRNGs that come from different families, have different performance, quality, memory usage, security and features. I don't see that becoming more than 20 choices.
My idea was not to have some super-crate or organization to implement every (possibly even obsolete) PRNG design under the sun. But to have a good resource for users.
And other crates can still have a role in providing more variants or features, like the current PCG crate.
@pitdicker What you are suggesting would be possible by reexporting the smaller crates in a "curation" crate. I would prefer this over duplicating code.
Note that a small selection might become not so small as algorithms become obsolete and get replaced but have to be kept for backwards compatibility.
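A "curation" crate of the kind described could be little more than a list of re-exports. In the sketch below, modules stand in for separate crates so that it compiles as one file; `family_xorshift`, `family_lcg` and `curated` are hypothetical names, and the two generators are deliberately tiny examples rather than recommendations.

```rust
/// Stand-in for a small per-family crate (hypothetical name).
pub mod family_xorshift {
    pub struct Xorshift32 {
        state: u32,
    }

    impl Xorshift32 {
        pub fn new(seed: u32) -> Self {
            // Zero is a fixed point of xorshift, so map it to 1.
            Xorshift32 { state: if seed == 0 { 1 } else { seed } }
        }

        pub fn next_u32(&mut self) -> u32 {
            let mut x = self.state;
            x ^= x << 13;
            x ^= x >> 17;
            x ^= x << 5;
            self.state = x;
            x
        }
    }
}

/// Stand-in for a second per-family crate (hypothetical name).
pub mod family_lcg {
    pub struct Lcg64 {
        state: u64,
    }

    impl Lcg64 {
        pub fn new(seed: u64) -> Self {
            Lcg64 { state: seed }
        }

        pub fn next_u32(&mut self) -> u32 {
            self.state = self
                .state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Return the high half; the low bits of an LCG are weak.
            (self.state >> 32) as u32
        }
    }
}

/// Stand-in for the curation crate: no algorithm code of its own, only
/// re-exports (and, in a real crate, the comparison documentation).
pub mod curated {
    pub use super::family_lcg::Lcg64;
    pub use super::family_xorshift::Xorshift32;
}
```

Because `curated::Xorshift32` and `family_xorshift::Xorshift32` are the same type, no code is duplicated, and dropping an algorithm from the curated list does not remove it from its own crate.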
In this case we have three options:
If we keep the number of choices low (e.g. 20 algorithms) then the first option is a good choice, though as @vks says this may grow over time (removing algorithms is not a good idea except perhaps if a CSPRNG is found to be insecure, because it causes problems for any libraries needing reproducibility with old algorithms and requires extra maintenance).
Also, no one has answered my other question yet: if we don't have per-family crates, is there any point at all moving the algorithms out of the main Rand crate?
if we don't have per-family crates, is there any point at all moving the algorithms out of the main Rand crate?
We could make breaking changes in the RNG crate and use it in Rand without making a breaking change there (if the RNG crate is not part of Rand's public interface).
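A sketch of why hiding the algorithm matters: if the wrapper exposes only its own type, the inner generator can be swapped or changed without touching the public API. `algo_v1` is a hypothetical private module standing in for an external RNG crate, and its generator is a placeholder LCG, not what Rand uses; only the name `SmallRng` comes from the discussion.

```rust
/// Private module standing in for an external RNG crate. Nothing in here is
/// reachable from outside, so it can change freely between releases.
mod algo_v1 {
    pub struct Gen {
        state: u64,
    }

    impl Gen {
        pub fn new(seed: u64) -> Self {
            Gen { state: seed }
        }

        // Placeholder algorithm (a 64-bit LCG with an xor-shift output step).
        pub fn next_u64(&mut self) -> u64 {
            self.state = self
                .state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            self.state ^ (self.state >> 32)
        }
    }
}

/// Public wrapper. The field is private, so the inner type never appears in
/// the public interface.
pub struct SmallRng(algo_v1::Gen);

impl SmallRng {
    pub fn seed_from_u64(seed: u64) -> Self {
        SmallRng(algo_v1::Gen::new(seed))
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0.next_u64()
    }
}
```

Replacing `algo_v1` with another module changes the output stream but not any signature, which is the sense in which such a change would not be API-breaking. Users who need a reproducible stream would still have to depend on the specific algorithm directly.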
We could make breaking changes in the RNG crate
Such as:
- […] next_u32 followed by next_u64 has changed in current generators).
- […] rand_core traits, which are unlikely to have breaking changes (and would require bumping the Rand version number anyway).
Also using an external crate allows usage with rand_core but without the main crate; however I'm not sure this is a big advantage, at least unless we choose to further modularise Rand.
Lacking further argument I propose we just add new PRNGs to Rand for now. Alternatively I'm not against adding a curation crate (rand_rngs) or something, though in that case we would probably not want to add the ISAAC generators to the curation crate. Or adding family-crates as mentioned above.
Just to be sure, the only real argument for having multiple crates is that at some point it becomes 'cleaner' for us to deprecate an RNG?
It's easier to review a crate containing only a small number of similar RNGs (i.e. keep things like Xorshift and Hc128 separate), although I don't see a good reason that e.g. ChaCha and Hc128 shouldn't be in the same crate (other than lack of an obvious name).
Well, nothing has happened in the last month, yet I feel this issue is quite important.
I suggest we move forward as follows:
- […] SmallRng's implementation with something else (do not discuss which algorithm here!), keeping the implementation hidden. This gives us a better SmallRng without dependence on any other crate or public PRNG in Rand.
- […] XorshiftRng. If we can, we point them to another crate with a compatible implementation.
We should move forward with some small(er) crates implementing families of PRNGs. On a case-by-case basis we decide whether to host these externally or within the Rand repository (which may or may not be considered to imply a higher level of review). (Alternatively we could start a "Rust Rand" organisation on GitHub; not sure that it's worth it though.)
The following PRNG crates already exist:
Related, external RNGs:
Quality and maintenance:
- […] rand_core 0.2; all others use rand 0.3.x or 0.4.x if they depend on Rand at all. I don't know if this says we should be doing more to help other authors or that most of the above are basically abandoned; either way we can't really promote external RNG crates which don't support the latest Rand.
Therefore I think we should slightly prefer internal implementations of PRNGs in our documentation and be open to pull-requests to add PRNGs here, but not require it.
Promoting RNGs: I'm not convinced at this time it's worth having a "curation crate", though we could do so. A better option might be to add a document to Rand reviewing various RNG crates (there are a few other things which could go in a Rand "book").
Organisation:
Edit: we should probably just use the crate names without subdirectory.
~Suggestion: we add an rngs directory, then any sub-crates in this repo can go under here (e.g. REPO_ROOT/rngs/xoshiro)~
We currently have only two implementations, Hc128Rng and ChaChaRng. For now I suggest we leave them in Rand's main crate, though we could move to rngs/csprng (rand_csprng crate) later.
Since the ISAAC generators are no longer used by ThreadRng, let's move them out (to a rand_isaac crate/subdir).
This would be a good time to rename PRNGs, if appropriate.
Currently all internal PRNGs have a Rng suffix, e.g. XorShiftRng. Some other implementations follow this convention; some don't. To me it seems redundant to use within a crate of RNGs. Therefore we could rename e.g. IsaacRng to Isaac32 or just Isaac.
Crate names: I suggest that all crates created as part of the Rand project use the prefix rand_ (following the convention of rand_core). I'm not sure whether we should make a rename when adopting an external crate into the Rand project; possibly we should do so but also with a compatibility release under the old name.
Does this sound like a good plan?
I agree with your plan, it is almost exactly what I would suggest as well.
A better option might be to add a document to Rand reviewing various RNG crates (there are a few other things which could go in a Rand "book").
In addition, I would recommend using specific RNGs and their crates for reproducibility. At the same time, we could explicitly weaken the reproducibility guarantees for SmallRng and StdRng, so we can replace the algorithms in a feature release.
About the pcg_rand crate: I recently tried to port it to rand_core 0.2, which was surprisingly tricky due to the heavy use of generics (similar to the reference C++ implementation). I ran into a compiler bug and gave up, but it might be possible to redesign the library to work around the bug (see https://github.com/robojeb/pcg_rand/issues/8).
The reproducibility limitations already apply to SmallRng and StdRng (maybe the docs could be improved).
I also thought the use of generics in that crate is a bit over the top (the same criticisms apply to @imneme's libraries — most users want one relatively simple PRNG; not metacode for building hundreds of variants of generators). So I'd rather start from @pitdicker's code.
I added a tracker at the top of this issue.
@vks do you want to keep your xoshiro crate external or move it into this repo? I really don't care either way; externally you get more freedom but also the expectation of maintenance :wink:
Also, what should we do with Xorshift? It's not especially good, but we should probably keep it around somewhere in case anyone wants to reproduce results with it. @astocko has an xorshift crate but it needs updating, and doesn't have the same generator.
On another point: do we want to continue with only one repo or have multiple? Already we have some confusion of branch & tag names with 0.5 referring to the rand version but also hosting rand_core 0.2.x.
Separate repos would give us more git and CI overhead but smaller history and less CI work to do on each change.
@dhardy In principle, I'm fine with either way. It depends on what we decide about the fate of PRNGs in Rand: If Rand somehow depends on a PRNG, I think the crate should be in the Rand repository. If Rand just recommends the crate, either way is probably fine. Or should we be worried that a crate we recommend gets compromised? After all, Rand is the third-most popular crate. However, I anticipate the recommended crates will be used by few users, making them a smaller target.
If we decide to give the recommended crates a rand_ prefix, they should probably live in the Rand repository. This would probably be the easiest way to move xorshift and isaac out of Rand, for which crates already exist. Similarly, we could add rand_pcg, rand_xoshiro etc. (Unfortunately there already is a pcg_rand crate, which might be confusing.)
About splitting repos: I'm not sure. It worked well for num, but there the crates are more orthogonal. I would expect there are few issues that only apply to rand_core.
For non-crypto PRNGs we only need one in Rand (SmallRng) and there isn't a lot of code involved, so we could just copy the code. For CSPRNGs, yes, we should only depend on Rand crates (or perhaps a RustCrypto crate).
I think we should use the rand_ prefix for Rand sub-projects but not for external crates, and if we adopt an external crate into Rand we can rename the crate then (it's not hard for users to switch this, and avoids stepping on the toes of inactive authors).
True, at the moment most issues are strongly linked to the main Rand project even if about some other part such as a rand_core trait or another distribution implementation. Maybe we should reconsider later, but for now continue with one repo.
Would a pull request for a Xorshift1024 crate be accepted at this stage? I'm wanting to have xorshift1024* for a monte carlo simulation I'm writing, and it looks like moving forward having it hosted in this repository would be ideal (and match with the plans outlined here). I'd open a separate issue, but it seems slightly subsumed by this one.
You mean this xorshift1024*φ algorithm?
Given that there are a couple generations of Vigna's generators which more or less obsolete the algorithm, I'd say it's unlikely it would be included.
We are considering adding this xorshift crate which has xorshift1024* generators.
xoshiro512 or xoroshiro1024 would be more likely. The xoshiro crate has xoshiro512, and xoroshiro1024 generators could be added. (xoroshiro1024 is described only in the Scrambled Linear Pseudorandom Number Generators paper)
It's also possible that other generators with similarly large periods might be added.
Note: xorshift4096* also exists. Probably impractical but it's curious. (not sure why it's no longer listed)
Well, I'll try again: @astocko are you interested in maintaining your Xorshift crate?
But given the lack of commits in the last year it seems unlikely. The code is CC0 so it may be sensible at this point to pull out what we want and make a new rand_xorshift crate.
@vks what do you think should be the divide between Xorshift, Xoroshiro and Xoshiro generators, since you already have the latter two in your crate?
I see you mentioned me here as the author of the randomorg crate. I am, and I just wanted to remind you that this crate unfortunately is not going to have Rng trait implementations, because its data source is remote.
@dhardy My crates are based on the recommendations by Vigna. The xoroshiro crate includes xorshift1024*phi, because that was recommended together with the xoroshiro generators. The xoshiro crate implements the new generation of Vigna's generators, which includes some xoroshiro variants incompatible with the xoroshiro crate (!). So yes, the crate names are a bit misleading, but I think the current separation makes the most sense. A more accurate name would be rngs_vigna2018, but I'm not sure it would be a better name.
I should probably clarify that in the docs.
On Sat, Jul 14, 2018, 09:52 Diggory Hardy notifications@github.com wrote:
Well, I'll try again: @astocko https://github.com/astocko are you interested in maintaining your Xorshift crate?
But given the lack of commits in the last year it seems unlikely. The code is CC0 so it may be sensible at this point to pull out what we want and make a new rand_xorshift crate.
@vks https://github.com/vks what do you think should be the divide between Xorshift, Xoroshiro and Xoshiro generators, since you already have the latter two in your crate?
We could just throw it all under rand_vigna, but a separate crate for Xorshift is also fine. His recommendations have changed over time, so putting this year's recommendations under vigna2018 doesn't seem sensible for what we hope will become a stable crate.
Well, stability is why I decided to create a new crate, so the old one can still be used for reproducibility. One of the advantages of moving the algorithms out of Rand is that we can just add and drop crates from the recommendations without breaking anyone's code.
Hey, author of the Pcg32 crate here. The other PcgRand crate seems much more full-fledged than mine, but I'd be happy to continue maintaining Pcg32.
I personally would advocate for some sort of vetting/benchmarking with the rand crates. Even something like dieharder scores would be nice to see.
We currently have quality comparisons here: https://docs.rs/rand/0.5.4/rand/prng/index.html#basic-pseudo-random-number-generators-prngs
I don't know if we've thought about listing actual test data though (probably TestU01 or PractRand if anything)
@droundy if you're still interested, we'd accept Xorshift1024 in the new rand_xorshift sub-crate (see #557).
@pitdicker I think it's time to turn your small-rngs repo into a real crate :smile:
As you say, it is more useful having a small crate with a few high-quality PRNGs than to have a huge collection, so how about we create rand_small_rngs with the following guidelines:
Potentially we can add a separate rand_rngs sub-project for comparison purposes (benchmarks, quality testing). These should be separate: the former is a small collection of high-quality RNGs not hosted elsewhere (and should remain small); the latter is a comparison across many RNGs (and could host the generators.rs benchmarks later).
I've just gone ahead and implemented a xoroshiro256+ in my own code, which ended up seeming simplest, after I decided that compatibility with my old C code didn't matter. Maybe I'll use a "standard" xoroshiro implementation at some later point, but for now it seemed easiest to just translate the C code myself. That way I didn't have to mess with serde feature flags, and I could easily get proper seeding with a u64 as recommended by Vigna.
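Seeding a larger state from a single u64 is commonly done by running SplitMix64 and taking successive outputs as state words, which is presumably what is meant here. The sketch below assumes a 256-bit (four-word) state; `seed_state_from_u64` is a hypothetical helper name, not an API of any crate mentioned in this thread.

```rust
/// One step of SplitMix64: advance the state by the golden-ratio increment
/// and return a mixed output word.
pub fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: expand a single u64 seed into four state words.
/// The output step is a bijection and the four internal states are distinct,
/// so the four words are distinct and cannot all be zero.
pub fn seed_state_from_u64(seed: u64) -> [u64; 4] {
    let mut sm = seed;
    let mut state = [0u64; 4];
    for word in state.iter_mut() {
        *word = splitmix64(&mut sm);
    }
    state
}
```

The point of the extra mixing step is that similar seeds (0, 1, 2, ...) yield unrelated-looking states, and the all-zero state that xorshift-style generators cannot recover from is never produced.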
If it helps you decide which PRNGs to include where, I have written an article on random number generators and categorized "good" RNGs into two categories. Essentially:
Any RNG other than a cryptographic or statistical RNG, as defined in my article, should be considered low-quality, I think.
I updated my pcg crate to implement the RngCore/Rng traits
Edit by dhardy: adding a tracker for the planned tasks:
- […] SmallRng algorithm (see https://github.com/dhardy/rand/issues/60 and https://github.com/dhardy/rand/issues/52)
- […] XorshiftRng (required: that the RNG is available in another crate)
- […] (rand_rngs, rand_pcg or something)
Also: we should consider whether we want to rename PRNGs when moving or adding them.