Infinite recursion during impl selection

carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)

http://docs.carbon-lang.dev/

Other

32.28k stars 1.48k forks source link

Infinite recursion during impl selection #2458

Closed geoffromer closed 1 year ago

geoffromer commented 1 year ago

Description of the bug:

Circular impl declarations can cause Explorer to recurse infinitely, leading to a stack overflow.

What did you do, or what's a simple way to reproduce the bug?

Run this program in Explorer:

package Foo api;
interface Foo {}
impl forall [T:! Foo] T as Foo {}
fn F[T:! Foo](x: T) {}
fn Main() -> i32 { F(0); return 0; }

What did you expect to happen?

I would expect Explorer to diagnose the circularity of the blanket impl in some way, during the typechecking of F. Alternatively, it would be surprising but self-consistent for Explorer to decide that i32 implements Foo, and therefore run the program and return 0, or for it to decide that i32 does not implement Foo, and therefore diagnose the call F(0) as a type error.

What actually happened?

Explorer crashes due to a stack overflow.

Any other information, logs, or outputs that you want to share?

No response

zygoloid commented 1 year ago

@josh11b I recall we intended to have a restriction that an impl can only have constraints that are simpler in some sense, to ensure termination. Do we have a concrete rule documented somewhere that we could implement in explorer?

josh11b commented 1 year ago

@zygoloid We talked about this on 2022-04-13. It is on my short list to write this up as a proposal, but there are still a lot of specifics to nail down. I think there is room to do something easier to implement to start, with the option to increase what is accepted as the need is demonstrated.

A possible rule:

It is an error for an impl declaration to transitively depend on itself in a way that either "repeats" or is "more complex"
Repeating is defined as using the same types.
More complex is defined as:
- having the same root types for every parameter of the impl declaration, and
- some parameter having greater depth in its type tree
Some statement that non-type parameters either are non-increasing in absolute value (for integer types), have a fixed set of possible values (for enums), or don't change in value (everything else).

The argument that this would terminate rests on there being:

a finite number of impl declarations
a finite number of root types
therefore a finite number of ways to parameterize a specific impl declaration with root types
each root type having a fixed number of parameters
therefore there are a finite number of types with the same root type and a specified max depth in its type tree

When we add support for variadics, we would have to also specify a non-increasing number of parameters in variadics (or possibly (root-type-with-variadic-parameters, number) pairs).

What do you think?

zygoloid commented 1 year ago

A couple of potential issues strike me with this overall strategy, without getting into the details of the "more complex" rule:

These would be checks that we do as part of impl selection (we can't do them by just looking at the impl declaration in isolation without considering a query), and thus a failure here can cause impl selection and hence monomorphization to fail.
If we have multiple candidate impls, and any one of them hits a cycle or apparent non-termination, then we reject the program, even if another one of them works and doesn't hit a cycle. This seems like it would be necessary to ensure coherence given that the "more complex" check will have false-positives when detecting non-termination. (The scenario I'm thinking of is that when looking for impl A as B, we might succeed when performing the query at the top level but fail when performing the query from a context where another query is already in progress. If this led us to succeed but pick a different impl in the latter case, that would result in incoherence.)

I think we need to try this kind of approach and see how well it works in practice. The combination of those two concerns seems to mean that adding a "broken" impl can cause monomorphization failures in distant code that previously worked, and for which that impl was actually not applicable at all. I think I'd be more confident if we had guidelines of the form "write your impls like this and you will never hit the non-termination error cases" so that people could be confident they're not setting a trap for themselves (but I suppose if we had such guidelines and were confident they weren't too restrictive, we could use them as our rules).

Another potential issue with this kind of approach is that while it ensures eventual termination, the runtime doesn't have a reasonable upper bound. Again I think we'll need to see how this plays out in practice -- if the very long runtime case needs a contrived input, I think I'm OK with that, but if it happens on superficially reasonable code then that would be a concern.

For the detailed "more complex" rules -- the concrete partial order we're putting on queries above -- the "no increase in tree depth" rule would disallow things like impl UniquePtr(T) as Printable depending on impl Nullable(T*) as Printable -- "simpler" types may have a greater tree depth. And it'd mean that redefining a class String as an alias to BasicString(char, CharTraits(char)) might invalidate working code by increasing the tree depth. That might be a concern that we should look into addressing somehow. Perhaps we can define a "more complex" partial order in a similar way to the way we order impls -- for example, we could say that if library A imports B, then types defined in A are more complex than those defined in A for the purpose of impl selection partial ordering, and that types defined in the same library are ordered in lexical definition order -- but it's not obvious to me how to extend that to cover type trees in a way that lets us treat Nullable(T*) as simpler than UniquePtr(T) in general (assuming that Nullable is simpler than UniquePtr).

josh11b commented 1 year ago

@zygoloid Absolutely this is a check that is done during impl selection and can lead to monomorphization failure, that matches what we said in the previous discussion. I don't think the problems in general can be detected locally.

If we have multiple candidates, I agree we need a coherent way of choosing an answer. It isn't obvious to me that means doing work that wouldn't have to be done otherwise. Is it possible for two different modules to see a different prefix of candidates when they are evaluated in decreasing priority order? My expectation is that our rule for coherence mean that the set of candidates can be pruned to a consistent and guaranteed sufficient set across modules, and that can be used to exclude irrelevant far-away impls.

I agree this produces a very large upper bound, but part of the goal was to avoid an artificial recursion limit which users experience in practice, creating a cliff.

My "more complex" rule definitely allows impl UniquePtr(T) as Printable depending on impl Nullable(T*) as Printable. Either those are different impl declarations, or they are part of a single blanket implementation (impl [U:! ...] U as Printable) with different root types for the parameter (U), and tree depth is only considered when the root types are the same.

I do agree that my rule can make code stop compiling after redefining a class String as an alias to BasicString(char, CharTraits(char)), though it would only make a difference when the String was not in a root type position in a rule. Personally I'd be interested to see if and how it causes trouble in practice before trying to address the issue.

zygoloid commented 1 year ago

My "more complex" rule definitely allows impl UniquePtr(T) as Printable depending on impl Nullable(T*) as Printable. Either those are different impl declarations, or they are part of a single blanket implementation (impl [U:! ...] U as Printable) with different root types for the parameter (U), and tree depth is only considered when the root types are the same.

Hm, I agree. The issue I was thinking of does seem like it could arise in more complex situations, though. Consider impl Vector(String) as Dump depending on impl Vector(CharAdapter(i8)) as Dump (because String is internally implemented by using Vector(CharAdapter(i8))), and both of those being handled by impl forall [T:! Dump] Vector(T) as Dump. This seems like it's going to come up in a lot of the use cases we have for Data, for example.

The intuition I have here is that tree depth is not a good approximation for what are reasonable transitions from "more complex" to "simpler" types that can arise through composition, and instead if U is implemented in terms of T it should be possible for an impl lookup involving U to perform an impl lookup involving T, even when T has a higher tree depth. Tree depth does capture the transition from Parameterized(T) to T, but not from T to FieldInsideT; perhaps we can find a rule that handles both cases somehow. I also think it's important that implementation details such as whether a type is an alias to a parameterized type or whether it is defined as a class are not visible outside the library (or at least package) defining the type, which makes me nervous about using tree depth.

That said, I think it'd be fine to go ahead and implement this rule anyway, as an exploratory measure. But I don't think we should get too attached to it and should keep looking for a better "more complex" rule to replace it.

josh11b commented 1 year ago

@zygoloid Here is an idea, though it is not fully formed: what if we look at how a base type is parameterized in practice rather than how it is declared? That way if CharAdapter(...) is only ever CharAdapter(i8) in a particular context, it is considered equally complex as an unparameterized type.

josh11b commented 1 year ago

After some in-person discussion, we ended up with this idea:

A type expression is "more complex" than another type expression if the multiset of base types of the latter is dominated by the multiset of base types of the former. This means that to be more complex, the set of base types must be a superset (or equal), and the counts of how often base types appear is equal or greater in every case, and in at least one case the count is greater.

Let's call a (finite or infinite) sequence of type expressions "good" if no later element is more complex than an earlier element. @zygoloid has a proof that any good sequence of type expressions with a finite set of base types is finite. I can sketch it, but it might be better if I leave the statement of the proof to him.

josh11b commented 1 year ago

My sketch of @zygoloid 's proof:

It is a proof by contradiction.

If there is an infinite good sequence, then it has an infinite subsequence where the count of some base type strictly increases. Otherwise the count of every base type is bounded and we have a finite sequence or a repetition of some type.
Looking at this infinite subsequence, removing the base type that strictly increases leaves us with another infinite good sequence. This is because a base type whose count increases never helps avoid the type expressions becoming more complex.
We can then repeat this procedure, giving an infinite good subsequence with one fewer base type each time, until we have removed every base type but one. At this point the sequence must have decreasing counts for the remaining base type, and so must be finite. A contradiction.

josh11b commented 1 year ago

Note: the above includes an assumption that any sequence repeating a type is rejected. Observe we could achieve that by requiring that no multiset of base types repeats, but that isn't necessary since there are only a finite number of types with a given multiset of base types. Proof: If none of the base types have a variadic parameter list, then there is at most one type for every distinct permutation of base types. If some base types are variadic, then we can get a conservative finite upper bound by multiplying the number of distinct permutations by the number of different possible arity combinations. The number of arity combinations is finite since, ignoring non-type arguments, the total arity must equal the number of base types in the type minus 1.

josh11b commented 1 year ago

I came up with a constructive proof. It proceeds by induction on the number N of distinct base types. By the previous comment, we can restrict to good sequences that don't repeat any multiset of base types.

If N == 1, then types map to a multiset of a single base type, which can be represented by the count of the number of times that base type appears. That number must be non-negative and decreasing in the sequence, and so the length of the sequence is bounded by the value of the first element. So good sequences with N == 1 must be finite.
Assuming that good sequences with N distinct base types must be finite, consider a good sequence with N+1 distinct base types. Its first element will be represented by a non-negative integer (N+1)-tuple, (i_0, i_1, ..., i_N). Every element after that will either be in one of the i_0 + i_1 + ... + i_N hyperplanes (co-dimension 1) given by these equations:
- x_0 = 0, x_0 = 1, ..., x_0 = i_0 - 1 (i_0 different equations, each defining a separate hyperplane)
- x_1 = 0, x_1 = 1, ..., x_1 = i_1 - 1
- ...
- x_N = 0, x_N = 1, ..., x_N = i_N - 1
Any point not in one of those hyperplanes has components all >= the first element, and so can't be in the sequence if it is good.
The restriction of the sequence to the subsequence in each of those hyperplanes is finite, by the induction hypothesis.
The sequence visits points in this finite union of finite sets without repetition, and so must be finite.
Conclusion: Any good sequence with N+1 distinct base types is finite, completing the induction.

This bound given by this construction is not completely tight, since there is overlap between the hyperplanes. It is tight once that overlap is taken into account, though. We can construct sequences that reach the upper bound by visiting the points in the union of the hyperplanes in descending order of their L1-norm (sum of the components).

zygoloid commented 1 year ago

@josh11b How should match_first interact with this rule? For example:

interface Foo {}

match_first {
  impl i32* as Foo {}
  impl forall [T:! type where .Self* is Foo] T as Foo {}
}

fn F[T:! Foo](x: T) {}

fn G() {
  // OK?
  F(0);
}

If the match_first causes us to not even consider later impls in the same block, then this is presumably valid. But if match_first is considered after we find a set of candidate impls, when we're trying to pick one (as in the current explorer implementation), then this will be rejected.

josh11b commented 1 year ago

@josh11b How should match_first interact with this rule? For example:
interface Foo {}

match_first {
  impl i32* as Foo {}
  impl forall [T:! type where .Self* is Foo] T as Foo {}
}

fn F[T:! Foo](x: T) {}

fn G() {
  // OK?
  F(0);
}
If the match_first causes us to not even consider later impls in the same block, then this is presumably valid. But if match_first is considered after we find a set of candidate impls, when we're trying to pick one (as in the current explorer implementation), then this will be rejected.

I'm fine with either option. I'm inclined to favor implementation simplicity and compile speed, which I guess means the first option?

josh11b commented 1 year ago

@chandlerc points out that this would allow users use match_first to manually break cycles.

zygoloid commented 1 year ago

Fixed in explorer in 0e41c56. We'll also need a design proposal.