Replace `convert_slice_{32,64}` with `read_u{32,64}_into`

pitdicker commented 6 years ago

And a little cleanup around the init functions

pitdicker commented 6 years ago

I am not sure what you mean with feature creep.

I know from_rng is often used with fresh entropy, but it can be used to seed from a deterministic master PRNG, so I think it should be portable. Agree?

I would argue that we should be able to rely on from_rng to get actual random numbers that are not reproducible, especially for cryptographic RNG's like here. And if someone wants reproducible results from_seed should be used, from_rng is the wrong API.

But making from_seed endian-independed should not be difficult, and we can just as well do it.

I played with from_rng calling from_seed. It would do some unnecessary (unaligned) copying, but that should not really have measurable impact. For ISAAC we use much more bytes as seed with from_rng than with from_seed, so there it would not work. Iterating over the seed with to_le in from_rng is maybe the easiest option.

dhardy commented 6 years ago

I suppose we can just document that from_rng is not portable, so SubRng::from_seed(master_rng.gen()) or similar should be used.

Of course both from_seed and from_rng can be used with fresh entropy to get non-reproducible numbers. As @burdges has already pointed out, the two are redundant, apart from portable reproducibility and optimisation, so I guess you are right.

pitdicker commented 6 years ago

Thinking some more about it, a function that just works without pitfalls is worth more than a bit simpler implementation, or optimization on uncommon architectures (which are probably unmeasurable small for from_rng).

I will update the from_seedimplementations this evening or tomorrow, and add tests for it.

What did you mean with feature creep?

dhardy commented 6 years ago

My comments about requiring from_rng to be portable is feature creep; we didn't require this before.

If we make from_rng endian-fixed then I wonder why we should have it at all (it should be possible to write MyRng::from_seed(rng.gen()) eventually I think). The other point is something I mentioned before: making non-portable RNGs like StdRng implement SeedFromRng but not SeedableRng; if both methods are portable then this decision no longer makes sense. So maybe we should just drop from_rng entirely?

Currently PRNGs implement Rand. We could potentially implement this (or a distribution which then implements Rand) automatically for anything implementing SeedableRng (though this is a breaking change since existing Rand implementations must be removed).

pitdicker commented 6 years ago

from_rng has a few advantages:

Xorshift and similar RNG's that don't work with some seeds can just ask for a new one.
Being portable does not make StdRng and other wrappers any more deterministic, it is still possible to swap the wrapped RNG. Not implementing from_seed helps make this clear a little more.
In try_fill we often have to do a copy from the output of an RNG to the potentially unaligned byte buffer. From this buffer it is copied (unaligned) into a seed in from_seed. from_rng avoids the extra copies. With a little effort LLVM may be able to optimize this out though.
from_rng can use a different seed length like it does now in ISAAC. And some optimizations are possible, like skipping an init function and filling the state directly. Although that may not be a good idea.

But I agree this are minor advantages.

I leave the call to you. Shall I leave from_rng as it is? Make it endianness-independend? Or merge it into SeedableRng and give it a default implementation?

dhardy commented 6 years ago

No, your arguments make sense. Aside from one thing: surely the seed buffer passed to try_fill will be aligned? If need be you could enforce the alignment yourself by creating it with word-sized elements then casting to a byte array.

By portable I mean produces the same results everywhere, including in future versions.

pitdicker commented 6 years ago

surely the seed buffer passed to try_fill will be aligned

But is LLVM able to figure that out? If the functions are not inlined it has no choice but use slower unaligned copy functions. But that is something to try and play with, probably we can figure something out. Or just measure the impact, which may be tiny.

pitdicker commented 6 years ago

I have added a commit that folds from_rng into the SeedableRng trait.

With some extra bounds on SeedRestriction it was possible to give from_seed a default implementation that uses from_seed. Performance optimizations for it is something I have not yet looked into, maybe not even worth it.

In Xorshift from_seed no longer panics, but replaces an all zero seed with 0xBAD5EED as discussed in https://github.com/dhardy/rand/issues/67. It still has a custom from_rng method to request new values if they are all zero (as unlikely as that is). That also now uses try_fill instead of next_u32 so it can pass on errors.

In Isaac{64} I kept the custom try_fill that fills the whole state, and made it endian-independent.

dhardy commented 6 years ago

Hang on, weren't you just arguing that from_rng should be kept separate?

pitdicker commented 6 years ago

I though you were thinking in the opposite direction, and I saw these advantages only as 'minor' 😄. How do you think it turned out? We do not seem to lose much, because it is possible to override the default implementation.

dhardy commented 6 years ago

The most significant change is that StdRng must now implement SeedableRng which means there's no longer that "documentation" that the PRNG shouldn't be relied on to be reproducible. But both ways it's a minor thing really.

With specialisation we should be able to do a default implementation of from_rng even if the traits are separate, if we want to do that. I'm undecided.

pitdicker commented 6 years ago

Yes, true. I don't think it will cause big problems though. There is good documentation, requiring deterministic numbers is (I think) not that common, and the fix is to just import the right RNG. And now we have one less trait for users of rand to think about.

dhardy commented 6 years ago

In that case, is there even much point in having from_rng at all? I guess it depends whether we can make X::from_seed(rng.gen()) work. I removed the code for generating fixed-size arrays with Rand early in my branch, but there's no reason it can't be added back (for distributions::Default), maybe for all array sizes up to 32 plus some more power-of-two sizes.

This brings up another point, the rationale for from_rng was initially to replace Rand implementations to prevent naive users from writing rng.gen() where rng is XorShiftRng or another weak RNG with potential accidental-copy issues. Assuming we actually replace XorShiftRng we don't need to worry so much about this. On the other hand, I suppose from_rng can now require a CryptoRng source while leaving it easy for users to seed from a non-crypto RNG if desired.

Note: almost all of this still needs merging into upstream, and needs some rationale.

pitdicker commented 6 years ago

almost all of this still needs merging into upstream, and needs some rationale.

And from_rng is a new method, so be must be able to 'sell' it :smile:.

I think we can say from_rng is not absolutely needed. But that you added it feels like the right call to me. It already gives minor advantages for Xorshift and ISAAC.

I am not sure if this is the right way to think about it, but to me the available methods in an API tell me how it is supposed to be used.

How do I initialize this PRNG? Seed from another RNG with from_rng.
How do I seed this PRNG in a reproducible way? Use from_seed.
Are reseeding frequently or custom reseeding algorithms important? We don't really consider it to be, so reseed is not part of the core API, but a wrapper.
How do I use this ergonomically? NewSeeded provides a new method around from_rng. A from_hash method helps to make from_seed easier to use.

Not checked yet, but X::from_seed(rng.gen()) does not really work well with our new error handling, right? Not unsolvable of course, just add a try_gen method. Is it fair to say gen is more suited or even intended for PRNG's, not really as an interface for things like OsRng?

dhardy commented 6 years ago

Good point about the error handling. All the distributions and Rand code assumes infallible RNGs; i.e. panic or best-effort on error, so X::from_seed(rng.gen()) should be workable (with a few changes) but can't do proper handling like XorShiftRng::from_rng(rng) might retry on bad seed.

Adding rng.try_gen()? would require fallible distributions — implementing it for byte-array types would be easy enough, but e.g. generating floating-points fallibly would require duplicating a lot of code (or abstracting over error handling), and generating integers would really need try_next_u32 (we never quite agreed we don't need that... grimace).

Ok, so we do want from_rng after all.

dhardy / rand

Replace `convert_slice_{32,64}` with `read_u{32,64}_into` #77