Distribution sample prototype and sized-ness of `Rng`

dhardy commented 6 years ago

Currently we have:

pub trait RngCore { ... }
pub trait Rng: RngCore { ... }

impl<'a, R: RngCore + ?Sized> RngCore for &'a mut R { ... }

pub trait Distribution<T> {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
}

impl Distribution<f64> for Exp {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> f64 {
        let n: f64 = rng.sample(Exp1);
        n * self.lambda_inverse
    }
}

In my ~master~ experimental branch the Distribution trait is slightly different:

pub trait Distribution<T> {
    fn sample<R: RngCore + ?Sized>(&self, rng: &mut R) -> T;
}

impl Distribution<f64> for Exp {
    fn sample<R: RngCore + ?Sized>(&self, rng: &mut R) -> f64 {
        let n: f64 = rng.sample(Exp1);
        n * self.lambda_inverse
    }
}

Are they equivalent? Almost, but:

~What does R: Rng + ?Sized mean when Rng: Sized and why does this work?~ The constraint is relaxed.
~Why does Uniform.sample(&mut r) where r: &mut RngCore work in the latter case? This does work in both cases which is the main thing.~ Rng is implemented for every RngCore so effectively r: &mut Rng.
It is necessary to import both Rng and RngCore to implement Distribution in the latter
~It is often necessary to write mut rng when implementing Distribution in the latter~

Are there any subtleties I'm missing? The former is slightly more convenient, but I don't understand it as well as the latter. @burdges do you understand this better?

burdges commented 6 years ago

You want pub trait Rng: RngCore + ?Sized { .. }, no? Afaik, there is never a reason to write + Sized because rust assumes + Sized unless you say + ?Sized. I have not looked up the rules recently though. We should retain support unsized Rngs like the current crate if possible.
Are you actually using the trait object &mut RngCore? If so, that sounds fishy. I'm not too surprised if &mut rng works when rng: &mut R and R: RngCore but see 4 below.
I'm missing some context to answer this one perhaps, but presumably it's bouncing some call through Rng that you imagine should go through RngCore.
I think your mut rng: &mut R arises because you call rng.sample(Exp1) which takes a &mut self. It'll likely disappear if you call Exp1::sample(rng) or if you do (*rng).sample(Exp1).

burdges commented 6 years ago

I think 4 sounds like a rustc bug for which we should minimize and raise an issue. It's clear &mut R provides a &mut self but maybe it's picking the wrong impl for whatever reason.

dhardy commented 6 years ago

All methods on Rng require Self: Sized — I thought putting this restriction on the trait itself made more sense, but maybe not
Yes. Why is that fishy?
Actually I expected everything to go through Rng; we discourage using any of the RngCore methods directly
That might explain why I was confused about needing mut rng.

Thanks. But do you recommend R: Rng + ?Sized or R: RngCore + ?Sized in methods, or doesn't it matter? In the case of SeedableRng::from_rng we have to use RngCore (crate separation) but for the distributions we could use either.

burdges commented 6 years ago

I donno.. At this point, you've written more distributions than anyone, so did you prefer writing them with the higher or lower level method? :) It might not matter much actually if distributions will mostly build directly on other distributions.

I've no idea why Sized appears everywhere in both the original rand and your redesign, but rand still passes all tests if you remove them : https://github.com/burdges/rand/tree/unsized

It's possible the Sized restriction merely forces using the impls for &mut R: Rng, which costs nothing if the compiler removes all the double references. If so, I've no idea why that helps. I did not compare benchmarks or compile times with the Sized bounds removed.

It's also likely almost all Rngs are all Sized so the Sized restriction never actually comes into play.

pitdicker commented 6 years ago

But do you recommend R: Rng + ?Sized or R: RngCore + ?Sized in methods, or doesn't it matter?

It works, but using Rng in the distributions code bugs me a bit. It creates a kind of circular dependency between Rng and the distributions, although one distribution is always implemented using another so it works out.

dhardy commented 6 years ago

True; conceptually fn sample doesn't belong in Rng, but it is convenient. I'm not quite sure about it; also fn shuffle (in my master branch you write v[..].shuffle(&mut rng); instead).

dhardy commented 6 years ago

Contrast (trait from my branch but similar to Distribution::sample case):

pub trait Choose<T> {
    fn choose<R: RngCore>(self, rng: R) -> Option<T>;
}

    #[test]
    fn dyn_dispatch() {
        let mut r: &mut RngCore = &mut ::test::rng(813);
        // here we take a second reference to r only so that we don't consume:
        assert_eq!([7, 7][..].choose(&mut r), Some(&7));
        // ... (including more uses of r)
    }

with:

pub trait Choose<T> {
    fn choose<R: RngCore + ?Sized>(self, rng: &mut R) -> Option<T>;
}

    #[test]
    fn dyn_dispatch() {
        let r: &mut RngCore = &mut ::test::rng(813);

        assert_eq!([7, 7][..].choose(r), Some(&7));
        // the above does not consume r ...
   }

Both support usage of type-erased RngCore. The first requires a second level of reference to r formally (I don't know whether it gets optimised away; see https://github.com/rust-lang/rfcs/issues/1403 for why we can't just copy the reference), the second presumably doesn't.

I guess the conclusion is that the first version is nicer to write the impls for, but the second is nicer to use, and avoids the &mut reborrow problem, so should be our choice (which is a shame; if &mut R were copy but with proper lifetime tracking we would get nicer code, i.e. eventually the first option should be the best choice).

dhardy commented 6 years ago

I'm hoping to get the reborrow thing resolved in which case we can use the more concise syntax, but realistically this will take months to get into stable Rust, even if implemented and merged soon.

However, we could still use the <R: RngCore>(self, rng: R) version; it just means users have to explicitly reference until the reborrow thing gets fixed. Perhaps this is the way to go.

As for R: Rng vs. R: RngCore, it appears to make no difference; the compiler even accepts implementations differing from the prototype here (either way around). We must use RngCore within the core crate; I suggest we use Rng for distribution code just to save importing RngCore.

pitdicker commented 6 years ago

Distributions can be used outside of Rand also, and we want them to depend on the 'front-end' Rng trait, not the 'back-end' RngCore?

dhardy commented 6 years ago

I don't think it makes any difference at all. The way I see it you build distributions on top of other distributions as much as possible, thus for many of them there's no need to use the back-end directly at all. I suppose we could use fn sample<R: RngCore>(&self, rng: R) -> T, but use Rng instead for many of the implementations, but it's a bit weird.

dhardy commented 6 years ago

Something else occurs to me: fn foo<R: Rng>(rng: R) allows an RNG to be consumed. I thought this wasn't a problem because you have to write foo(&mut rng) either way — but there is a problem, since within foo we have no proof that R is a reference type, so calling something like my_range.sample(rng); within foo cannot in general use reborrowing (if the compiler were fixed to do so automatically), thus foo will not compile without an extra reference: my_range.sample(&mut foo);.

Annoying though this is, I think it means we cannot use sample<R: Rng>(&self, rng: R) -> T since sample impls commonly use the RNG multiple times. The same is true for any other generic consumption of an RNG, except possibly where it is only used once (e.g. from_rng), though it's probably better to use a single pattern throughout.

burdges commented 6 years ago

Is there a problem with doing fn sample<R: Rng>(&self, rng: &mut R) -> T everywhere? I'd think an extra &mut should be removed by optimization, no?

dhardy commented 6 years ago

But why do that? I think now the best option is fn sample<R: Rng+?Sized>(&self, rng: &mut R) -> T (or RngCore + ?Sized). Sure, it's a little more to write, but it applies directly to r: &mut Rng (doesn't require user to double reference) and supports correct reborrowing today.

Given appropriate language support I might advocate fn sample <R: Rng + Reborrow>(&self, rng: R) -> T, but the only actual advantage is that it would support user-reference types in addition to &mut R.

dhardy commented 6 years ago

I suppose an argument for pursuing the latter is that ThreadRng could eventually become &mut ReseedingRng<..> internally, thus things like distr.sample(thread_rng()) could be fully optimised, but it feels like we're going way out on a limb if we implement (as close as we can get to) this approach now.

Adapting the method signature to that later should have no effect at call sites, but would require all impls of Distribution & similar function signatures to be updated.

burdges commented 6 years ago

We need &mut R : Rng when R : Rng anyways, so we do not break user code if we go from fn sample<R: Rng>(&self, rng: &mut R) -> T to fn sample <R: Rng + Reborrow>(&self, rng: R) -> T in future. We might break PRNGs depending upon how Reborrow works, but that's acceptable.

dhardy commented 6 years ago

It looks like the sole remaining question here is Rng vs RngCore in sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T.

What I still don't understand is why Rng is object-safe (allows unsized usage); it shouldn't be. Should we use RngCore just because we understand that better (it has no generic methods, therefore is object-safe), or use Rng because it seems to work and requires one less import?

pitdicker commented 6 years ago

~I am starting the like that users implementing distribution::Uniform for their types (to replace Rand) can now use Rng::gen() to do so. We should not have them figure out RngCore in my opinion.~

pitdicker commented 6 years ago

If we make the distributions take an unsized RngCore, would that mean that the are not able to use the methods from Rng, because that trait requires Rng to be sized? (except for the part we don't understand...)

pitdicker commented 6 years ago

Maybe ask around somewhere else, like the rust repro or irc?

dhardy commented 6 years ago

I'm thinking actually maybe the compiler simply proves that Rng == RngCore internally then doesn't care which gets used. Rng: RngCore and every RngCore impls Rng.

The impl rules in rand_core take care of the implementation for unsized types, i.e. the next_u32 etc. RngCore methods are wrapped, then used to implement Rng methods so rng.gen::<u32>() for rng: &mut RngCore probably is implemented via four levels of function calls: Rng::gen → Distribution::sample → RngCore::next_u32 →(dyn) RngCore::next_u32 with the last call using dynamic dispatch and all the others static (or optimised out).

The weird bit is when using Rng instead as the unsized type, I guess what happens is that all of the Rng methods get stripped out (generic methods don't support dynamic dispatch), then because Rng: RngCore the impl methods in rand_core construct a fresh (sized) RngCore type around that, which then implements Rng (same as above). So not really much different.

The bit I still don't get is whether this equivalence is by language design or just happens to be so, and might not easily be reproduced in another hypothetical Rust compiler (I've seen people complaining about the lack of a proper standard for how the typing system works).

dhardy commented 6 years ago

@nikomatsakis could I ask for your thoughts on this? (Basically first post, plus the reborrow version we both decided to leave for now.) See:

RngCore trait: rand_core/lib.rs:140
RngCore impls for &mut R / Box: 344
Rng trait: lib.rs:330 — this was previously trait Rng: RngCore + Sized, but restriction is unnecessary
Distribution::sample: distributions/mod.rs:135

pitdicker commented 6 years ago

What do you think of this test?

    #[test]
    fn test_rng_trait_object_2() {
        use distributions::Normal;

        fn get_normal<R: Rng + ?Sized>(rng: &mut R) -> f64 {
            let normal = Normal::new(2.0, 3.0);
            rng.sample(normal)
        }

        let rng = rng(110);
        let mut r = Box::new(rng) as Box<RngCore>;
        println!("{}", get_normal(&mut r));
    }

Edit: updated

We make the rng into a trait object with Box::new(rng) as Box<RngCore>. So all the layers in-between all know r is Sized, because it is only a trait object with RngCore trait. Only at the final destination, sample in distributions::normal, it starts to use the Rng trait, which extends RngCore.

dhardy commented 6 years ago

rng: R requires that R: Sized, try &mut R instead

pitdicker commented 6 years ago

Updated the comment above...

pitdicker commented 6 years ago

The weird bit is when using Rng instead as the unsized type, I guess what happens is that all of the Rng methods get stripped out (generic methods don't support dynamic dispatch), then because Rng: RngCore the impl methods in rand_core construct a fresh (sized) RngCore type around that, which then implements Rng (same as above). So not really much different.

I think you are mostly right, but nothing magical is happening. By using Box<RngCore> it is us saying this is just the RngCore methods. At any point in some function in the chain it can be extended to also have the traits from Rng, but it knows that in the basis it is RngCore.

burdges commented 6 years ago

Right now Rng is object safe because no methods use Self. All methods use only &self or &mut self.

If you wanted a method to take Self then you might survive with where Self: Sized on that method, according to https://doc.rust-lang.org/book/first-edition/trait-objects.html I expect doing this simply prevents the method from being available on the trait object because even if Self: Sized the trait object sees Self: !Sized.

I've do not quite understand trait objects with methods having polymorphic arguments, like

pub trait Distribution<T> {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
}

or

pub trait Rng: RngCore {
    ...
    fn sample<T, D: Distribution<T>>(&mut self, distr: D) -> T {
        distr.sample(self)
    }
    ...
}

At first blush, I suppose a Distribution<T> vtable must contain monomorpized functions for all R: Rng used as a trait object the program, while a Rng vtable must contain monomorpized functions for all T and all D: Distribution<T> used. In principle, rustc could might be clever enough to type erase the D or R respectively, give both traits are object safe. In future, if we get unsized return values then maybe even T could be type erased under the hood.

I doubt these hypothetical type erasures would improve execution speed here, or ever, because the vtable gets built at compile time, but they should reduce code size dramatically because the vtable for Distribution<T> contains many unused functions. Also, I'd suspect using a Rng trait object anywhere in the rand crate itself might expand the Distribution<T> vtable!

I do not know if trait object methods can improve code size here, or maybe they make code size worse, and they only function to replace non-existent where Self: Sized methods.

impl<T> Distribution<T> {
    fn sample(&self, rng: &mut Rng) -> T { ??? }
}
impl Rng {
    fn sample<T>(&mut self, distr: &Distribution<T>) -> T {
        distr.sample(self)
    }
}

We could maybe avoid expanding the Rng vtable by replacing Rng::sample with an inherent method defined by the trait. I donno if those work quite right, like maybe their invisible outside rand, or maybe they do not exist for the trait object, but maybe a trait object method fixes that.

impl<R: Rng> R {
    pub fn sample<T, D: Distribution<T>>(&mut self, distr: D) -> T {
        distr.sample(self)
    }
}
impl Rng {
    pub fn sample<T>(&mut self, distr: &Distribution<T>) -> T {
        distr.sample(self)
    }
}

dhardy commented 6 years ago

I don't think the vtable will contain monomorphized functions; I'm pretty sure generic functions are simply not available when Self: !Sized. But remember that we have impl rules in rand_core/src/lib.rs which implement RngCore methods on a Sized type for an unsized type; the Rng methods are then implemented on top of that sized type as I understand it.

burdges commented 6 years ago

I'm way outside my knowledge here but actually all existing R: Rng satisfy R: Sized, so I doubt &mut R: Sized matters. I'm thinking Rng: !Sized because all trait objects are unsized, so even &mut R as a trait object cannot be considered sized. Yes, you may extract the &mut R where R: Rng from the Rng trait object, but afaik only inside a function called through the vtable.

In general, I'd expect vtables necessarily contain monomorphized functions. In particular, Bar satisfies the object safe rules, so its vtable should cover every type the program passes to Bar:foo. Yet, I'd expect the optimizer cannot eliminate all these static boo buffers which each track some distinct F: Foo.

trait Foo {}
trait Bar { 
    fn foo<F: foo>(&self, f: F) -> u64 {
        static mut boo = [0u8; 16];
        manipulate(&mut boo)
    }
}
impl<F> Foo for F {}

In out case, I wonder if a type cannot be promoted to a trait object, so maybe the trait object method suffices to keep the vtable for Rng small.

impl Rng {
    #[inline]
    pub fn sample<T, D: Distribution<T>>(&mut self, distr: D) -> T {
        distr.sample(self)
    }
}

Anyways I think rustc cannot optimize Rng::sample out of the Rng vtable so long as an Rng might implement Rng::sample differently. I'm suggesting tricks to prevent any Rng from doing Rng::sample differently. I doubt this creates much code bloat since Rng::samplemerely flips arguments, but hey.

dhardy commented 6 years ago

Going to close since I'm reasonably happy with the way this works now.

rust-random / rand

Distribution sample prototype and sized-ness of `Rng` #287