rust-lang / datafrog

A lightweight Datalog engine in Rust
Apache License 2.0
796 stars 43 forks source link

Unify owned and ref values [WIP] #19

Closed frankmcsherry closed 5 years ago

frankmcsherry commented 5 years ago

This PR attempts to unify treefrog implementations to work both with owned and reference value types.

Previously, the signatures of the Leaper trait involved Leaper<'a, Tuple, Val: 'a> and spoke of vectors of &'a Val elements. These references are potentially unnecessarily complicated if Val is simple and supports clone, so we change this to be

/// Methods to support treefrog leapjoin.
pub trait Leaper<Tuple, Val> {
    /// Estimates the number of proposed values.
    fn count(&mut self, prefix: &Tuple) -> usize;
    /// Populates `values` with proposed values.
    fn propose(&mut self, prefix: &Tuple, values: &mut Vec<Val>);
    /// Restricts `values` to proposed values.
    fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<Val>);
}

Where the Val type can be arbitrary. Now, we have multiple implementations of Leaper where appropriate, both

    impl<'a, Key, Val, Tuple, Func> Leaper<Tuple, &'a Val> for ...
    impl<'a, Key, Val, Tuple, Func> Leaper<Tuple, Val> for ... 

The FilterWith and FilterAnti variants had no constraint on their Val type, and so we just removed the requirement that it be a reference and got a more general implementation (yay). Neither looks at the value field, and just makes their decision based on the key parts of the record.

The potential fall-out is that there are now multiple implementations of Leaper, and it may be less obvious which is intended when you casually try to leapjoin_into.

The from_leapjoin method on Variable now has the signature

    pub fn from_leapjoin<SourceTuple: Ord, Val: Ord>(
        &self,
        source: &Variable<SourceTuple>,
        leapers: &mut [&mut dyn Leaper<SourceTuple, Val>],
        logic: impl FnMut(&SourceTuple, Val) -> Tuple,
    )

which allows Val to be &'a Val as appropriate, or Val, and it is an open question whether this will allow anyone to actually use this elegantly.

frankmcsherry commented 5 years ago

Looking at the code a bit more, there is the potential for further clean-up. Specifically, the ExtendAnti implementations are identical for both &'a Val and Val: Clone, as neither proposes a value and instead only seems to require that the type implement Deref<Target=Val>. It might be over-engineering things a bit to fit that it, and the copy-paste second implementation isn't horrible yet. But, fyi.

frankmcsherry commented 5 years ago

Playing around with this a bit, I'm not sure I would recommend it. This doesn't play especially well with the intent of providing treefrog implementations for Variable, as .. the traits one needs to generalize for that implementation conflict a bit with the generality here. I'm sure it is possible to do both (a new generic parameter ValRep for either &'a Val or Val, independent from the actual Val type, which .. ) but it has been pretty gross so far.

Alternately, stick with the "they implement Clone like all integers, and that makes everything easier. It doesn't showcase cool Rust lifetime stuff (actually cool), but it leaves the code in a much clearer state if you want Variable treefrog impls.

nikomatsakis commented 5 years ago

I didn't see any performance difference, testing locally. If I'm not mistaken, the only real difference here in terms of the end-code is that the Val is passed by value, right?

i.e., my "logic" closures change from |&source_tuple, &val| to |&source_tuple, val|?

nikomatsakis commented 5 years ago

I think i'm inclined in this case not to merge, just because the current system uses refs uniformly and that seems ultimately easier to remember.

nikomatsakis commented 5 years ago

Per prior comment, closing (at least for now).