rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.68k forks source link

incr.comp.: Clarify compatibility of red/green evaluation and recovering from cycle errors #42633

Closed michaelwoerister closed 7 years ago

michaelwoerister commented 7 years ago

I originally planned to describe the differences between the current query system and the future version that supports red/green dependency tracking but while thinking about it, there's not too much of a difference when it comes to cycle errors:

The main difference I see between the current query system and the red/green one is dependency tracking (and there the current system behaves a bit strangely, actually): When a query is started, a new DepNode is allocated and reads to that DepNode are registered as they occur. E.g., we let's say we have the following "query trace":

Queries execute in the order of their label:

            1 <------------+
            |              |
    +-------+-------+      |
    |       |       |      |
    v       v       v      |
    2       4   +-- 5 --+  |
    |           |       |  |
    v           v       v  |
    3           6       7--+

The dependency graph would look exactly like that, cycle and all.

In the red/green system, a DepNode would only be allocated after the successful evaluation of a query. Consequently, we would not end up with the dep-graph shown above. But we still want to record dependencies to those sub-queries because their result influenced what 1 evaluated to after recovering from the cycle. I propose to just merge all reads into the parent query when encountering a cycle error, effectively merging all reads into the root of the cycle. The dep-graph from above would then look like this:

            1
            | 
    +-------+-------+
    |       |       |
    v       v       v
    2       4       6
    |
    v
    3

There are no DepNodes for 5 and 7 (they never completed) but we don't lose the information that 6 has been read, which is important, because the result of 6 might also have caused a trace without cycle.

I still think that recovering from cycle errors is dubious but at least the situation should not get worse with the new system.

Comments? @eddyb @nikomatsakis @rust-lang/compiler

eddyb commented 7 years ago

If a query A starts a sub-query B and B starts a sub-sub-queryC, which would invoke A again, then we would have to make sure that neither the results of B or C are cached. But since we are still in the middle of computing B and C, we don't have any result to cache anyway.

Yes, but on the way back up to A, both B and C get cached today and they saw the cycle.

michaelwoerister commented 7 years ago

@eddyb Oh, an Err(CycleError) seems to get cached, yeah. We could just... not do that, right?

eddyb commented 7 years ago

@michaelwoerister The actual error is uninteresting, but B and C may change based on it.

michaelwoerister commented 7 years ago

Looking at the code again, no Err(CycleError) actually seems to be cached. The semantics of this are so subtle that I keep wanting to get rid of it. Imagine having to explain this to a potential contributor :(

eddyb commented 7 years ago

@michaelwoerister AFAIK we'd regress in a few places if we removed cycle recovery. I'd rather have it be enforced correct-by-default and have a warning that if you use try_get your queries may be triggered more than once, so it's a performance/side-effect hazard, but nothing worse.

nikomatsakis commented 7 years ago

Sorry I've been slow to weigh in on this thread. I've been thinking about cycles a lot. At this point, my opinion is that -- at least for now -- we should remove cycle recovery from the query system (that is, cycles should always be considered fatal to compilation -- they don't necessarily have to panic). Things that legitimately want to recover from cycles (which I think is basically just the trait system) should handle it themselves for now.

We don't actually use cycle recovery that much right now, and I think all of the cases can make do with some form of hard error (after a bit of refactoring). We can survey them quite quickly.

Unless I missed something, those are all the uses of ::try_get in the code-base, and hence (I think) all the uses of cycle recovery that I am aware of.

I was thinking of making trait selection into a query, but I've since thought better of it. I think it will be something "query-like" -- in particular, it will be a function that starts with just a tcx and a key -- but it will want to manage its own caches and it will also need to manage its own dependencies (using anonymous nodes). There is a lot of stuff very particular to the trait system that needs to happen (e.g., for a cycle, we need to check all the traits involved in the cycle and check whether they are auto traits, etc), and I don't see that generalizing very well. Also, I think that ultimately trait matching will not want to be a "DFS" sort of procedure, but rather something more like the ObligationForest, where we build up a dependency graph and kind of "iterate over it" until we are done.

Ultimately my view is that we should aim to keep the query system fairly simple, at least for now. Maybe eventually we will want to expand it into a full-on attribute grammar system that can handle all sorts of data propagation or whatever, but I think that the current setup -- where a query is basically just a memoized function call -- seems about right.

michaelwoerister commented 7 years ago

As a side-note regarding inlining: With https://github.com/rust-lang/rust/pull/43183, the trans-collector builds up a fairly accurate, global "call graph" between translation items. This could be useful for inlining too. It is post-monomorphization though, which is good for the quality of information it can provide, but which also means that it is a bit late in the pipeline.