rust-lang / chalk

An implementation and definition of the Rust trait system using a PROLOG-like logic solver
https://rust-lang.github.io/chalk/book/

Specialization #9

Open nikomatsakis opened 7 years ago

nikomatsakis commented 7 years ago

Introduce support for specialization between impls. Probably best to try and implement the current RFC before considering variations.

withoutboats commented 7 years ago

@aturon & I talked a bit about this today. We talked about two basically different ways of encoding specialization:

Aaron was in favor of the first, and presented a compelling philosophical argument. But I am just ignorant enough to want to disagree with the idea. :-)

I don't know much about the current implementation of rustc, nor about implementing logic programming, but because of the properties the language has, the second seems like it could have significant performance benefits over the first:

In the first, we have to regenerate the facts proven by every impl in every crate in your DAG, because they could need new 'specialization holes.' However, the orphan rules mean that if that weren't necessary, we could literally cache the facts from each crate in its metadata, and load that info without walking their ASTs again.

The algorithm to generate those impls is, itself, inherently n^2ish. Aaron mentioned that we already have to perform an n^2ish coherence check, but that's not precisely true because of the orphan rules. If an impl passes the orphan rules, we know that it doesn't have a coherence issue with any impls outside of this crate. So we only need to check coherence against impls in this crate.

Of course the philosophical arguments Aaron made about the difficulty of determining 'the most specific proof' & the goal of making the internal logic language as simple as possible are also really compelling.

nikomatsakis commented 7 years ago

First, a nit-pick:

If an impl passes the orphan rules, we know that it doesn't have a coherence issue with any impls outside of this crate. So we only need to check coherence against impls in this crate.

This isn't quite true. We still have to check for overlap against the impls in ancestor crates, because of the possibility that one of them defined a blanket impl. For example, we might have:

// crate A
impl<T: Copy> Foo for T { }

// crate B
struct B<T> { t: T }
impl<T: Copy> Foo for B<T> { }

// crate C
struct C { }
impl Foo for B<C> { }

Now crate C has to check for overlap against both the impl in crate A and the impl in crate B. (It may be that we could get away with just checking crate B in this case, actually, but in other cases we would certainly have to consider crate A.)

nikomatsakis commented 7 years ago

Of course the philosophical arguments Aaron made about the difficulty of determining 'the most specific proof' & the goal of making the internal logic language as simple as possible are also really compelling.

I think this is a case where it might (not sure) make sense for chalk and rustc to differ. That is, what we're basically talking about is whether the handling of specialization should move into the logic engine or into the rules themselves. The advantage of the latter is that the logic engine remains simpler, which is definitely a prime goal of chalk, and a fairly high priority for rustc too, just because we're probably going to want to add enough complexity as it is.

I suspect there is a certain amount of "O(n^2)"-ness anyway. For example, the way that rustc currently handles specialization is that (a) it constructs a hierarchy of impls (which involves comparing each impl against the others, and is probably O(n^2)). Then (b) it tests which impls may match and, if two impls match, checks them against this hierarchy. This isn't exactly O(n^2), but it's effectively doing the negative fact check (by first testing the positive facts and ensuring that they are disqualified). In other words, we don't test that !(T = u32), but we do test that T = u32 and, if so, disqualify the impl that applies to any T.
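That disqualification step can be sketched roughly as follows (illustrative Rust only; `Impl` and `most_specific` are invented names, not rustc or chalk data structures): among the impls that matched a type, an impl is ruled out if some other matching impl specializes it, i.e. has it as an ancestor in the hierarchy.

```rust
// Invented shape: each impl records the impl it directly specializes, if any.
#[derive(Clone)]
struct Impl {
    id: usize,
    parent: Option<usize>, // the impl this one specializes
}

// Among the impls that matched a type, disqualify any impl that some other
// matching impl specializes; dispatch is unambiguous only if exactly one
// impl survives.
fn most_specific(matching: &[Impl]) -> Option<usize> {
    let survivors: Vec<&Impl> = matching
        .iter()
        .filter(|imp| {
            !matching.iter().any(|other| {
                // walk `other`'s ancestor chain looking for `imp`
                let mut anc = other.parent;
                while let Some(a) = anc {
                    if a == imp.id {
                        return true;
                    }
                    anc = matching
                        .iter()
                        .find(|i| i.id == a)
                        .and_then(|i| i.parent);
                }
                false
            })
        })
        .collect();
    if survivors.len() == 1 {
        Some(survivors[0].id)
    } else {
        None
    }
}
```

For `impl<T> Foo for T` (id 0) and `impl Foo for u32` (id 1, specializing id 0), a query on `u32` matches both, and the blanket impl is disqualified because the second impl descends from it.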

OTOH, what I had hoped for rustc is that we will build up a "decision tree" that will help take some of this into account already. In other words, if we can show that T = u32, we would never even look at the second impl because we'd know that it is better. This sort of suggests that the specialization logic may indeed be moving "into the engine". This decision tree stuff is also kind of a grey area: I can see an advantage to modeling it in chalk, but also an advantage to keeping it out of chalk so that chalk can be used to show that it hasn't changed any results.

In general, the O(n^2) thing doesn't overly concern me because the number of impls is not so high, but even so I wonder if we can structure things in a clearer way somehow using "indirection". For example, instead of generating a fact like !(T = u32), I wonder if we can generate a kind of "not specialized later" condition, and then generate the rules for that. Something like this:

// impl A: `impl<T> Foo for T`
Foo(T) :-
    \+ ImplAOverridden(T),
    Sized(T).

// impl B: `impl Foo for u32`.
Foo(u32) :-
    \+ ImplBOverridden(u32).

// Because impl B specializes impl A (how do we test this?)
// we generate this fact:
ImplAOverridden(u32).

There is no doubt still an O(n^2) character in generating these facts, but that doesn't really bother me (and I think it's inevitable; somebody is going to have to compare the various rules!). It might just prove a clearer way to organize things though. (It might also obviate the need to support "or" rules directly in the prover.)
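As a toy model of that encoding (all names here are invented for illustration; this is not chalk code), negation-as-failure on a generated override fact decides which impl's clause fires:

```rust
// Generated because `impl Foo for u32` specializes `impl<T> Foo for T`:
// the override fact holds exactly for the specialized self type.
fn impl_a_overridden(ty: &str) -> bool {
    ty == "u32"
}

// Which impl provides `Foo` for `ty`? Each impl's clause fires only if no
// specializing impl has registered an override fact, mirroring the negated
// override condition in the rules above.
fn foo_impl(ty: &str) -> Option<&'static str> {
    if ty == "u32" {
        // impl B: `impl Foo for u32` (nothing specializes it further)
        return Some("impl B");
    }
    if !impl_a_overridden(ty) {
        // impl A: `impl<T> Foo for T`
        return Some("impl A");
    }
    None
}
```

The point is that the engine itself only needs ordinary negation; all knowledge of the specialization relationship lives in the generated override facts.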

aturon commented 7 years ago

@nikomatsakis This may be what you were getting at, but it doesn't seem like the decision tree strategy and the negative fact check strategy are at odds. In other words, there is likely a way to avoid changing the logic engine itself, while quickly guiding it to the correct proof strategy by carefully generating the program clauses.

Can you elaborate a bit on what you had in mind with using "or" rules?

nikomatsakis commented 7 years ago

@aturon

Can you elaborate a bit on what you had in mind with using "or" rules?

I was just saying that, instead of supporting goals like A; B, we can always desugar by introducing some fresh query A_OR_B and writing rules like:

A_OR_B :- A.
A_OR_B :- B.

This is kind of silly, of course, but if it so happens that it falls out naturally from the way that we want to do lowering anyway, it may be sufficient.
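For concreteness, that desugaring might look like this sketch (the `Goal` and `Clause` types are invented for illustration): each `Or` goal is replaced by a fresh atom with one clause per disjunct.

```rust
// Minimal goal language: atoms and disjunctions.
#[derive(Clone, Debug, PartialEq)]
enum Goal {
    Atom(String),
    Or(Box<Goal>, Box<Goal>),
}

// A Horn clause `head :- body`.
struct Clause {
    head: String,
    body: Goal,
}

// Replace every `Or` with a fresh atom, emitting one clause per disjunct,
// so the resulting program contains no disjunctive goals.
fn desugar_or(goal: Goal, fresh: &mut usize, out: &mut Vec<Clause>) -> Goal {
    match goal {
        Goal::Atom(_) => goal,
        Goal::Or(a, b) => {
            let a = desugar_or(*a, fresh, out);
            let b = desugar_or(*b, fresh, out);
            let name = format!("or_{}", *fresh);
            *fresh += 1;
            out.push(Clause { head: name.clone(), body: a });
            out.push(Clause { head: name.clone(), body: b });
            Goal::Atom(name)
        }
    }
}
```

Running it on `A; B` yields the goal `or_0` plus the two clauses `or_0 :- A.` and `or_0 :- B.`, exactly the shape described above.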

nikomatsakis commented 7 years ago

This may be what you were getting at, but it doesn't seem like the decision tree strategy and the negative fact check strategy are at odds.

Agreed, although it may be more straightforward to create the decision tree if we understand the prioritization "natively".

withoutboats commented 7 years ago

Our plan for specialization as of now (or, my understanding of a conversation with @aturon):

First, #39 will land, giving us the overlap check. I'm going to modify it so that, when impls overlap, instead of erroring we follow up with queries to determine whether one specializes the other. If there is a specialization relationship, we'll add it to a forest of ImplDatums.

Once that forest has been constructed by the overlap check, that represents the specialization hierarchy of all the impls available. We'll then iterate over those impl datums, mutating them to add a priority ranking, representing their distance from the root of their tree in the forest.
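Computing those rankings is just a depth calculation over the forest; a minimal sketch, assuming (hypothetically) that the overlap check has recorded each impl's direct parent in a map:

```rust
use std::collections::HashMap;

// `parent` maps an impl id to the impl it directly specializes; roots of
// the forest are absent from the map. Priority = distance from the root.
fn priorities(parent: &HashMap<u32, u32>, impls: &[u32]) -> HashMap<u32, u32> {
    let mut out = HashMap::new();
    for &id in impls {
        let mut depth = 0;
        let mut cur = id;
        // walk up to the root, counting edges
        while let Some(&p) = parent.get(&cur) {
            depth += 1;
            cur = p;
        }
        out.insert(id, depth);
    }
    out
}
```

For the two impls below, the blanket `Vec<T>` impl is a root (priority 0) and the `Vec<u32>` impl, which specializes it, gets priority 1.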

These priority rankings will be used to dispatch associated items to the correct impl. Consider these two impls:

trait Foo {
    type Assoc;
}

impl<T> Foo for Vec<T> {
    default type Assoc = bool;
}

impl Foo for Vec<u32> {
    type Assoc = ();
}

The first impl will have priority 0, whereas the second will have priority 1. These impls will correspond to facts like these:

// First impl
Vec<?T>: Foo.
Vec<?T>: Foo<Assoc = bool> :- Priority(Vec<?T>: Foo) >= 0 /\ not { Priority(Vec<?T>: Foo) >= 1 }.
forall<T> { Priority(Vec<T>: Foo) >= 0. }

// Second impl
Vec<u32>: Foo<Assoc = ()> :- Priority(Vec<u32>: Foo) >= 1 /\ not { Priority(Vec<u32>: Foo) >= 2 }.
Priority(Vec<u32>: Foo) >= 1.

From these facts the solver can dispatch associated items correctly.

One thing that's nice about this system is that impls from upstream crates do not need to have their lowering modified in the presence of new impls. We could literally cache the lowered representation in the .rlib or whatever, and then just load it from disk before lowering the currently compiled crate. And the logic engine only needs to gain an understanding of a "greater than or equal to" relationship, not of specialization proper.

nikomatsakis commented 7 years ago

@withoutboats

First, #39 will land giving us the overlap check. I'm going to modify this to, when impls overlap, instead of erroring, follow up with queries to determine if one specializes the other. If there is a specialization relationship, we'll add it to a forest of ImplDatums.

One complication here: we're going to have to adjust lowering, since, in order to do those checks, we have to make sure that we have lowered enough traits to do the queries we need. In the compiler, we recently achieved this by making the lowering of a particular trait (and its impls) a single work item. We then execute the work items on demand, watching out for cycles (which are an error).

I've not yet managed to construct an example where this applies though.

In other words, something like this would be illegal (and indeed it results in a cycle error on rustc today):

use std::fmt::Debug;

trait Foo {
    type Test;
}
trait Bar { }

impl<T> Foo for T { default type Test = (); }
impl<T: Bar> Foo for T { type Test = i32; }

impl<T: Foo<Test = ()>> Bar for T { }

fn main() { }

In order to determine the hierarchy for Foo, we have to evaluate whether T: Bar, which requires checking whether T: Foo<Test = ()>, which requires having the hierarchy for Foo so we can decide which specialization applies.
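The cycle check itself can be sketched as on-demand work items with an "in progress" state (illustrative only; this is not rustc's actual implementation, and `lower`/`deps` are invented names): lowering a trait first demands the traits it depends on, and re-entering an item that is still in progress is reported as a cycle.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq)]
enum State {
    InProgress,
    Done,
}

// Lower trait `t` on demand: recurse into its dependencies first, and
// report an error if we re-enter a work item that hasn't finished.
fn lower(
    t: &str,
    deps: &HashMap<&str, Vec<&str>>,
    state: &mut HashMap<String, State>,
) -> Result<(), String> {
    match state.get(t) {
        Some(State::Done) => return Ok(()),
        Some(State::InProgress) => {
            return Err(format!("cycle while lowering `{}`", t));
        }
        None => {}
    }
    state.insert(t.to_string(), State::InProgress);
    for dep in deps.get(t).cloned().unwrap_or_default() {
        lower(dep, deps, state)?;
    }
    state.insert(t.to_string(), State::Done);
    Ok(())
}
```

In the example above, lowering `Foo` demands `Bar`, which demands `Foo` again while it is still in progress, so the cycle is detected rather than looping forever.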

withoutboats commented 7 years ago

@nikomatsakis Yeah, @aturon and I talked about this a little bit. Our idea was to just not take this into account in the first pass and then fix it later.

The current PR performs all of the specialization checks against the program prior to overlap checking (meaning it will presumably pick arbitrary associated types without respecting specialization). This is bad and it needs to change, but I'm not quite sure how.

nikomatsakis commented 7 years ago

@withoutboats Yeah, I have to think about this -- I am trying to figure out whether this implies that lowering cannot be separated cleanly from proving. I... well, I suspect it can, actually. After all, the only time that the "specialization tree" matters is when you are projecting, right? I suspect we can do something in lowering to detect these cycles. Let me see if I can write something down that makes sense...

withoutboats commented 7 years ago

After all, the only time that the "specialization tree" matters is when you are projecting, right?

Yes AFAIK

jackh726 commented 4 years ago

Closed PR that needs followup: https://github.com/rust-lang/chalk/pull/230