rust-lang / datafrog

A lightweight Datalog engine in Rust
Apache License 2.0
796 stars 43 forks source link

Allow `Relation<Key>` to join with `Variable<(Key, Value)>` directly #36

Closed ecstatic-morse closed 3 years ago

ecstatic-morse commented 3 years ago

Polonius has to create new intermediate variables with types like Variable<(Key, ())> from Relation<Key> so that they can be joined with variables like Variable<(Key, Value)>. This happens once in the naive variant (see below) and twice in the optimized one. This isn't actually necessary, since we can legally transmute a Relation<Key> to a Relation<(Key, ())> according to the unsafe-code guidelines. This saves a bit of memory and some time copying tuples around, although these relations aren't very large on any of the inputs bundled with Polonius. Mostly, it makes the code clearer by removing the number of variables used solely for re-indexing.

To do this safely, I had to change how Relation implements JoinInput. This actually fixes a bug in Relation::from_antijoin (which we don't use), but it's not obviously correct. See the second commit for details.

I changed the signature of JoinInput so its semantics could remain the same. Relation::from_antijoin is still broken, however.

ecstatic-morse commented 3 years ago

Whoops, 25c7774 is actually a terrible idea. Now when a Relation is used as a JoinInput, its tuples are joined with both the stable and the recent tuples of the variable (instead of just the recent ones). Unfortunately, StableTuples cannot (soundly) be transmuted in the necessary fashion. We'll have to change the API to make this work.

lqd commented 3 years ago

This looks good to me. And with it, https://github.com/rust-lang/polonius/pull/178 should actually improve on clap now, right ?

I do wonder what to do about Relation::from_antijoin. Should we remove it until it's fixed ? We can at the very least create an issue describing the problem so that we don't forget.

Thankfully these Relation helpers are not really important compared to doing the actual joins on the Variables.

ecstatic-morse commented 3 years ago

Yep! I'm glad I tested that PR first XD.

I think we can just refactor the helper function so it works on a Relation instead of a JoinInput. I'll open an issue with mentoring instructions, since this isn't too urgent.

lqd commented 3 years ago

@ecstatic-morse btw do you know whether miri complained with the previous transmute ?

ecstatic-morse commented 3 years ago

It won't complain on either of them. From the Miri docs:

If the program relies on unspecified details of how data is laid out, it will still run fine in Miri -- but might break (including causing UB) on different compiler versions or different platforms.

&[T] and &[(T, ())] are not AFAIK guaranteed by the UCG to have the same layout, even though T and (T, ()) are. However it's very unlikely that we would change the order or padding of the pointer/length pair, so they happen to be layout-compatible in practice and will probably remain as such.

lqd commented 3 years ago

Let's merge this, then.