when are facet types equal?

carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)

http://docs.carbon-lang.dev/

Other

32.31k stars 1.48k forks source link

when are facet types equal? #2409

Open zygoloid opened 1 year ago

zygoloid commented 1 year ago

Summary of issue:

As described in the generics design, there is a subtyping relationship between facet types when one has a subset of the constraints of another. But it's not clear when facet types are equal: should two facet types with the same set of constraints and the same lookup behavior be "the same", or merely be two different types that have the same behavior?

Details:

One place where this comes up is with the definition of the type type. For example:

constraint IsThisTheSameAsType {}
class X {
  impl X as ImplicitAs(IsThisTheSameAsType) { ... }
}
var v: ({} as X);

The use of an expression of type X after a : causes us to look for ImplicitAs(type), which raises the question of whether type is the same value as IsThisTheSameAsType, or a different value that has the same behavior.

Any other information that you want to share?

We could use a structural rule that says that two facet types are the same value if they have the same set of constraints and the same lookup behavior. This may be challenging to specify in a way that implementations agree on, for example because the facet type may have an unresolved set of rewrites, or because it would matter whether vacuous constraints like .Self is type are retained or discarded, or because this would require us to precisely define what it means for two facet types to have the same lookup behavior. This is the approach that explorer currently takes, with a fairly arbitrary set of decisions for the implementation questions.

We could use a lexical nominal rule that says that two facet types are the same value if they were created in the same way (naming the same interface, combining constraints with & in the same order, etc). That rule may also be challenging to support, because it would likely require implementations to track the provenance of each facet type.

We could use a more semantic nominal rule that says that each & or where evaluation creates a unique new facet type, as does each constraint or interface declaration. That approach seems to provide both a reasonable level of simplicity for implementations and an easily understandable user model. This would mean that IsThisTheSameAsType is not the same as type, and nor is type & type or type where .Self is type. That's the approach that seems most promising to me.

chandlerc commented 1 year ago

We could use a more semantic nominal rule that says that each & or where evaluation creates a unique new facet type, as does each constraint or interface declaration. That approach seems to provide both a reasonable level of simplicity for implementations and an easily understandable user model. This would mean that IsThisTheSameAsType is not the same as type, and nor is type & type or type where .Self is type. That's the approach that seems most promising to me.

LGTM FWIW... Any concerns @josh11b or others looking at generics?

geoffromer commented 1 year ago

It seems like it would be simplest to say that two facet types are equal if and only if each is a (non-strict) subtype of the other. Is this issue also asking whether to use a structural, lexical-nominal, or semantic-nominal rule for the subtyping relation, or has that question already been settled? Are the specification and implementation challenges different for the subtyping relation than for the equality relation?

josh11b commented 1 year ago

I think it would be useful to consider when this issue will come up.

If something is parameterized by a facet type, how many instantiations do you end up with? If the quality of the resulting binary is our top priority, that would favor "as few as possible", which I think is delivered by structural rules. I can imagine implementation concerns ("Fast and scalable development") might favor other options.
If the user performs an explicit comparison between facet types, I think the "more semantic nominal rule" is a disaster for this case: facet types will frequently not compare equal even if they were written the same (A & B != A & B). Not sure how this would come up, maybe if you wrote something parameterized by a facet type and you wanted to debug something that was only happening in specific instantiations?
The example listed https://github.com/carbon-language/carbon-lang/issues/2409#issue-1452494667 doesn't seem to favor one model over another to me, since I don't expect it to come up in user code, just in the standard library and prelude which will only ever use type.
Other examples?

It seems much more common to need to worry about subsumption / subtyping of facet types than equality, since that comes up when calling one generic function from another. For this reason, https://github.com/carbon-language/carbon-lang/issues/2409#issuecomment-1319375162 makes a lot of sense to me: define the edge case thing in terms of the common thing we are going to care a lot more about. This leads to implementation simplicity and less for users to learn. I think this leads to using a structural rule.

My expectation when writing up the rules for generics was that structural things like named constraints would be compared structurally, because that seems significantly more natural to me. Interfaces create new "atoms" that are nominal. I talked about this in #syntax on 2022-10-27.

chandlerc commented 1 year ago

Just want to check back in on this issue as a bunch of the terminology and other design questions in the space resolved -- are we ready to pin this down?

josh11b commented 1 year ago

Just want to check back in on this issue as a bunch of the terminology and other design questions in the space resolved -- are we ready to pin this down?

Terminology-wise, this is talking about "facet types" in agreement with the definition given in #2360 . From https://github.com/carbon-language/carbon-lang/blob/trunk/proposals/p2360.md#terminology :

A type is a value of type type.

A facet type is a type whose values are some subset of the values of type, determined by a set of constraints.

Interface types and named constraint types are facet types whose constraints are that the interface or named constraint is satisfied by the type.

The values produced by & operations between facet types and by where expressions are facet types, whose set of constraints are determined by the & or where expression.

type is a facet type whose set of constraints is empty.

chandlerc commented 1 year ago

I wonder if https://github.com/carbon-language/carbon-lang/issues/2409#issuecomment-1319375162 provides a way to get the implementation simplicity of the nominal semantic rule without its downsides. Specifically, saying that type and IsThisTheSameAsType could each be a subtype of each other, but not the same facet?

I do think that the implementation challenges raised for the structural approach are concerning. Structural equality rules have been very difficult to come up with for complex structures, and I'm worried about exactly that case here.

One nice thing about the subtyping aspect is that it may make it easy to have it both ways: regardless of whether A & B or B & A are written (or canonicalized in the implementation), it seems easy to ensure that implementations reliably conclude they are each non-strict subtypes of each other.

josh11b commented 1 year ago

Here is an example that reflects my understanding of the question.

interface I { }
interface J { }
interface K(T:! type) { }
fn F[T:! K(I & J)](x: T);
class C {
  impl as K(I & J);
}
var x : C = {};
// structural rule: allowed
// lexical nominal rule: allowed
// semantic nominal rule: forbidden
F(x);

class D {
  impl as K(J & I);
}
var y : D = {};
// structural rule: allowed
// lexical nominal rule: forbidden
// semantic nominal rule: forbidden
F(y);

Is this correct?

josh11b commented 1 year ago

Discussed in open discussion today

zygoloid commented 1 year ago

The conclusion from the open discussion that @josh11b linked to is that we want a structural rule despite the implementation costs. That seems to give the least surprising behavior across the examples we considered.

The rules we preferred are as follows:

A facet type A is a subtype of another facet type B if it has a superset of constraints, after constraint resolution is performed. That is, A is a subtype of B if and only if, with T:! A in scope, we can determine that T impls B.
- Note that we already need to do this check in various places, in the form of asking "can T be passed to a function taking T:! B, given the facts about T that are in scope?"
Two facet types are the same if both are subtypes of each other and they have the same lookup behavior. That is:
- Name lookup into the facet types looks in the same places (they extend the same interfaces).
- All associated constants are rewritten the same way (they have the same rewrite constraints).
- If we support aliases in named constraints, the set of aliases in the two facet types are the same.

This means that:

Aliases are expanded, and we recurse into the definition of a named constraint.
If interface Y extends interface X, then Y is a subtype of X, and Y and X & Y are both subtypes of each other. Moreover, X & Y and Y are the same facet type, because they have the same lookup behavior: because Y extends X, lookups into X & Y and lookups into Y both look in the same places (X and Y).
Rewrite constraints and equality constraints are not distinguished when checking for subtyping. X where .C = i32, X where .C == i32, and X where i32 == .C are all subtypes of each other. But X where .C = i32 is not the same constraint as X where .C == i32, because they have different lookup behavior -- a lookup for C behaves differently in the two cases.
Equality constraints are not compared transitively. X where A == B and B == C is not a subtype of A == B and B == C and C == A, because if A == B and B == C are both in scope, single-step equality does not allow us to conclude that A == C.
Order and repetition of equality constraints are ignored by both subtyping and constraint type equality. X where A == B and A == B, X where A == B and B == A, X where A == B, and X where B == A are all the same, because if any of them is in scope, we can conclude all of the others hold, and they have the same lookup behavior.

We had an open question as to whether we should allow aliases in named constraints, given the complexity they would add to determining equality for facet types, but it doesn't seem necessary to decide that in order to answer the question in this issue.

chandlerc commented 1 year ago

Marking as decided with this comment: https://github.com/carbon-language/carbon-lang/issues/2409#issuecomment-1469048677

zygoloid commented 19 hours ago

I'm reopening this based on recent discussion. Because we support incomplete interface types and, more seriously, incomplete named constraint types, we cannot provide the rule we described in the previous comment in general because we cannot tell whether an incomplete facet type is a subtype or supertype of another facet type -- and we need to be able to tell whether any two given type values are the same.

One option that we might want to consider is to distinguish between nominal facet types (interfaces and named constraints, which may be complete or incomplete) and structural facet types (formed by where or &, never incomplete), and use a nominal type equality rule for the former and a structural type equality rule for the latter. This has some potentially surprising and possibly undesirable effects:

A and A & A are different, but A & A and A & A & A are the same, where A is a nominal facet type.
A named constraint is not indistinguishable from a structural facet type with the same constraints (but this is fundamental -- the named constraint can be incomplete and later completed and such has a nominal identity that the structural facet type lacks).
A & B requires A and B to both be complete. We might prefer to allow this for incomplete constraint types. However, it seems less surprising that A where ... requires A to be complete.
A & type is not the same as A -- the former is a structural facet type and the latter is a nominal one.

Another option might be to use one of the lexical rules described in the first comment of this issue.

josh11b commented 17 hours ago

We then discussed relaxing when facet types would be required to be complete further in order to allow things like A & B and A where .Self impls B in contexts where A and B were allowed even when they were incomplete. The idea is that we would apply a set of canonicalization rules, and two facet types are equal if they canonicalize to the same thing. Since these canonicalization rules would treat complete interfaces and named constraints the same as if they were incomplete, there will be cases where two structurally equal facet types will compare unequal (but realizing they are equal would require looking at definitions of interfaces or named constraints).

For each facet type we can eagerly form a canonical type, which is normalized as much as possible while treating named facet types as opaque. And then separately we lazily form a resolved facet type, as described by the Rewrite constraint resolution section, which requires all of those named facet types to be defined, and expands them to their corresponding structural facet type. And we do the lazy step when processing an impl, or when using a type constrained to implement the facet, or when doing impl lookup (or at the end of the file if we didn't do it before that as that section of the design suggests).