Restrict `BlockRngCore::Item` type?

dhardy commented 6 years ago

@burdges had an idea, looking something like:

pub trait BlockRngCore {
    /// Results element type, e.g. `u32`.
    type Item: details::BlockRngItem;

    ...
}

mod details {
    /// Item types supported by `BlockRng`.
    pub trait BlockRngItem {}
    impl BlockRngItem for u32 {}
    // later also impl for u64
}

This should allow the impl of RngCore for ReseedingRng to be properly generic:

impl<R, Rsdr: RngCore> RngCore for ReseedingRng<R, Rsdr>
where R: BlockRngCore + SeedableRng { ... }

But there's a catch I haven't found a solution for: if we do

impl<R: BlockRngCore> RngCore for BlockRng<R> { ... }

then Rustc doesn't understand that there are any bounds on the Item type, making the implementation impossible; if instead we do

impl<R: BlockRngCore<Item=u32>> RngCore for BlockRng<R>
where <R as BlockRngCore>::Results: AsRef<[u32]> { ... }

then the generic impl for ReseedingRng doesn't work.

It seems to me that Rustc needs more flexible handling of generics to give us a solution here?

dhardy commented 6 years ago

This doesn't work:

impl<C: BlockRngCore> From<<C as BlockRngCore>::Error> for Error {
    fn from(e: <C as BlockRngCore>::Error) -> Error {
        e.into()
    }
}

yields:

error[E0207]: the type parameter `C` is not constrained by the impl trait, self type, or predicates
   --> rand-core/src/lib.rs:225:6
    |
225 | impl<C: BlockRngCore> From<<C as BlockRngCore>::Error> for Error {
    |      ^ unconstrained type parameter

The error surprises me, but it's not really want we want anyway.

dhardy commented 6 years ago

What am I missing? The above impl I tried is stupid; instead one should use impl<E: Into<Error>> From<E> for Error {...} but of course std already provides this (in fact the compiler tells me so if I try implementing it). Yet when I use ? on generate I am told:

error[E0277]: the trait bound `error::Error: std::convert::From<<R as BlockRngCore>::Error>` is not satisfied
   --> rand-core/src/impls.rs:293:13
    |
293 |             self.core.generate(&mut self.results)?;
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `std::convert::From<<R as BlockRngCore>::Error>` is not implemented for `error::Error`

Is this perhaps just the compiler's type-constaint prover giving up?

burdges commented 6 years ago

Interesting problem, but.. I suggested BlockRngItem because I thought you needed methods in BlockRngItem, like to_le. If not, then I do not understand the issue with

impl<R, Rsdr: RngCore> RngCore for ReseedingRng<R, Rsdr>
where R: BlockRngCore + SeedableRng { ... }

Also I'd expect impl<R: BlockRngCore> RngCore for BlockRng<R> { ... } needs similar methods for the Item, no?

burdges commented 6 years ago

At least two more options: Just use macros for any generic impls that require the Item type. Add the helper trait on the opposite "ugly" side of BlockRngCore:

pub trait BlockRngCore {
    type Item: details::BlockRngItem; 
    ...
}

trait BlockRngItem : BlockRngCore {
    const SIZE : usize;
    fn to_le(item: BlockRngCore::Item) -> [u8; SIZE];
    ...
}
impl<R: BlockRngCore<Item=u32>> BlockRngItem for R {
    ...
}

dhardy commented 6 years ago

The impl for ReseedingRng isn't a problem except that the compiler doesn't recognise that BlockRng implements RngCore in general if the impl has restrictions on the item types (even if it covers all possible item types).

The impl for BlockRng is currently written with a restriction on the item type; removing that is the problem. next_u32 extracts a u32; this is easy to implement for any Item: Into<u32>. next_u64 of course needs two items for u32 arrays but only one for u64 arrays. Maybe it's possible to write a good implementation, but I don't think it's easy.

burdges commented 6 years ago

Is there a branch I can play with that's fairly close to this already?

Alsi I think my "ugly" version does not really help much.

dhardy commented 6 years ago

The current master, look in the impls module.

burdges commented 6 years ago

I'm suspicious about all these manual optimizations, like maybe some lighter weight optimizations achieve the same. I cannot point to anything concrete right now though.

I worried slightly about what the correct endianness behavior should be here. We've some PRNG whose implementation is defined internally in terms of a [u32] but whose interface is defined in terms of a [u8]. Anyone implementing this PRNG should obey the the endianness conversion defined by the PRNG of course. We can however simply say BlockRng is only for code that internally used LE conversions.

It appears index counts u32s? I'm not overly worried about next_u32 skipping a few bytes after a fill_bytes, but if a PRNG uses u64s then presumably next_u32 should consume only half of one, which gets tricky.

burdges commented 6 years ago

Is ReseedingRng restricted to to BlockRng only to skip one if check for reseeding? Ugh.

In my opinion, you should strip out this ReseedingCore business entirely. It adds complexity, reduces utility, and makes it harder to program generically. We have specialization for exactly this scenario. If specialization is not stable enough, then you could punt on the BlockRng optimization until it stabilizes, and maybe play with it behind a nightly feature.

burdges commented 6 years ago

Anyways..

I'd say drop the BockRngItem bound from BlockRng::Item but include the bound on any impls it simplifies. We must export BockRngItem from rand-core anyways because it must be used inside rand, which simplifies things.

I suspect BlockRngItem should replace all your endianness macros in rand-core too, and this support your manual optimizations, but not sure exactly how right now, but maybe starting with

pub trait BlockRngItem {
    #[cfg(endian = "little")]
    fn as_byte_slice(items: &[Self]) -> &[u8];
    #[cfg(endian = "big")]
    fn copy_bytes_le(items: &[Self], &mut [u8]);
}

dhardy commented 6 years ago

Good point about Endianness; we say in the doc that PRNGs can define this in their impls, but all the helper functions and BlockRng are strictly LE. It would be nicer not to have to parameterise them all over Endianness.

I'm not sure how next_u32 should be implemented for u64 slices. In some instances @pitdicker used a half_used boolean flag.

Yes, ReseedingCore is to save checking whether reseeding is required on every draw from the RNG; @pitdicker found significant performance improvements because of this. Implementing this via specialisation makes sense except that ReseedingRng is mostly only useful for crypto-RNGs anyway which are often block-based. Also I'm not sure we can implement it with specialisation — the method which needs the specialised implementation is fn generate which is a method of BlockRngCore not RngCore (which is why ReseedingRng has a complicated structure now).

dhardy commented 6 years ago

Talking about specialisation, it might be possible to get very radical and redefine RngCore:

pub trait RngCore {
    type Results: details::ResultRestriction;
    type Error: Display + Into<Error>;
    type Buffer: details::BufferRestriction;
    fn generate() -> Result<Self::Results, Self::Error>;
}

(where Buffer may be zero-sized to disable buffering completely, and buffering is managed by a wrapper like BlockRng).

It might be possible to implement all generators using this interface with similar optimisation to what we have today — except that OsRng reads the requested result size directly from the OS, instead of reading a fixed number of bytes. It would be a huge change however, not one I really want to explore now.

burdges commented 6 years ago

In fact, we've over constrained ourselves here because we imagine the constraints go roughly: We should generate directly into the user supplied buffer, not allocate our own, whenever the buffer is long enough. We cannot however trust the user supplied buffer has length a multiple of 4 or 8, thus preventing us from running to_le across the entire user supplied buffer.

At first blush, these sound contradictory but actually they present no problem because Results itself will always be long enough to run to_le across the buffer. We could maybe eliminate Item and provide directly the functions required for Results:

pub fn reserve_mut<'heap, T>(heap: &mut &'heap mut [T], len: usize) -> &'heap mut [T] {
    let tmp: &'heap mut [T] = ::std::mem::replace(&mut *heap, &mut []);
    let (reserved, tmp) = tmp.split_at_mut(len);
    *heap = tmp;
    reserved
}

pub trait BlockRngResults : Sized+Default {
    fn reserve_results_from_byte_slice(heap: &mut &'heap mut [u8]) -> Option<&'heap mut Self> {
        let len = mem::size_of::<Self>();
        if heap.len() < len { return None }
        let r = reserve_mut(heap,len);
        unsafe fn as_results<T>(slice: &mut [u8]) -> &mut Self {
            &mut *(slice.as_mut_ptr() as *mut Self)
        }
        Some(unsafe { as_results(r) })
    }
    fn set_endianness(&mut Self);
    fn copy_results(&self, &mut [u8]) -> usize;  // Or whatever
}
impl<RR: Deref<[u32]>+DerefMut> BlockRngResults for RR {
    // We provide a default implementation that assumes the PRNG favors little endian systems,
    // but PRNGs optimized for big endian systems can specialize this.
    // Any calls to `set_endianness` should be optimized away on systems of matching endianness. 
    default fn set_endianness(&mut self) {
        for x in self.deref_mut() { *x = x.lo_le(); }
    }
}

We could even fold BlockRngResults into BlockRngCore of course. We could also demand that generate does set_endianness itself.

burdges commented 6 years ago

I think generate will always fill the supplied Results, so actually it does not require specialization per se. It just always does the right thing.

It's ReseedingRng whose fill methods must split_at_mut the user supplied buffer at its next reseediing point.. or more likely reserve_mut at the next reseediing point in a loop.

burdges commented 6 years ago

How about this?

pub fn reserve_mut<'heap, T>(heap: &mut &'heap mut [T], len: usize) -> &'heap mut [T] {
    let tmp: &'heap mut [T] = ::std::mem::replace(&mut *heap, &mut []);
    let (reserved, tmp) = tmp.split_at_mut(len);
    *heap = tmp;
    reserved
}

pub trait BlockRng {
    type Results: Sized+Default+Deref+DerefMut;
    fn reserve_results_from_byte_slice(heap: &mut &'heap mut [u8]) -> Option<&'heap mut Self::Results> {
        let len = mem::size_of::<Self::Results>();
        if heap.len() < len { return None }
        let r = reserve_mut(heap,len);
        unsafe fn as_results<T>(slice: &mut [u8]) -> &mut Self::Results {
            &mut *(slice.as_mut_ptr() as *mut Self::Results)
        }
        Some(unsafe { as_results(r) })
    }

    /// ...
    fn generate(&mut self, &mut Results);

    /// Call after `generate` 
    fn set_endianness(&mut Self::Results);
}
impl<RR: BlockRngCore> BlockRngCore for RR {
    // We provide a default implementation that assumes the PRNG favors little endian systems,
    // but PRNGs optimized for big endian systems can specialize this.
    // Any calls to `set_endianness` should be optimized away on systems of matching endianness. 
    default fn set_endianness(&mut self) {
        for x in self.deref_mut() { *x = x.lo_le(); }
    }
}

dhardy commented 6 years ago

I don't want to make big changes now if it's not necessary. Is there anything you think we should definitely do before the next release? See https://github.com/rust-lang-nursery/rand/issues/232#issuecomment-373934991

dhardy commented 5 years ago

IMO it makes little sense to further investigate this before Specialization is available. Of course by then we will probably have hit 1.0 and not want major changes (though we don't really want them now either: we have been accused of excessive API churn multiple times.

Therefore, and since this has seen little interest in over a year, I think it unlikely we would ever land these suggestions (at least before 2.0) and best to close the issue, though of course further contributions are welcome.

rust-random / rand

Restrict `BlockRngCore::Item` type? #305