carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
http://docs.carbon-lang.dev/
Other
32.34k stars 1.48k forks source link

when are symbolic values resolved under the one-step equality rule? #1384

Open zygoloid opened 2 years ago

zygoloid commented 2 years ago

Consider:

interface Foo {
  fn FooFunc[me: Self]();
}
external impl i32 as Foo;
interface Constraint {
  let T:! Foo;
  let N:! i32;
  fn MakeOne[me: Self]() -> T;
  fn MakeArray[me: Self]() -> Array(T, N);
}
fn F[X:! Constraint where .T == i32 and .N == 2](x: X) {
  let a: Array(X.T, X.N) = x.Make();
  a[0].FooFunc();
  let b: Array(X.T, X.N + 1) = (x.MakeOne(), x.MakeOne(), x.MakeOne());
  b[0].FooFunc();
}

Generally when we see an expression like X.T or X.N, we want to preserve it symbolically, even if we can resolve it to a constant -- X.T and i32 are different symbolic type values, even though we know they refer to the same type. For example, the type of a[0] in F should be X.T, not i32, so that a[0].FooFunc finds the member of interface Foo.

But what should happen when we see the declaration of b, and in particular X.N + 1? Do we at some point switch from viewing X.N as a symbolic reference to the associated constant N and replace it with the corresponding known value of 2, and if so, when does that happen? Is the type of b exactly equivalent to Array(X.T, 3) so that the initializer type-checks, or is it simply an error to attempt to perform an addition on the symbolic value X.N?

zygoloid commented 2 years ago

One possible answer is that the + operator expects a : (runtime) operand, and is given a :! (generic) operand, resulting in a conversion analogous to an lvalue-to-rvalue conversion, which in this case produces an actual value from a symbolic operand. This would presuppose that we have distinct value categories for generic / symbolic values and runtime values, as in this discord discussion.

This may only be part of an answer, however: if we have class Array(T:! Type, Size:! i32) then presumably we cannot pass X.N + 1 as an argument for parameter Size, because X.N + 1 is a : value not a :! value. And the same would seem to be true even if Size is a template parameter.

josh11b commented 2 years ago

I do want the code in the example to be legal.

How much can we keep track of the value and type separately? Treating X.N like its value 2 seems to be what we want here, the only problem is X.T has additional places to do name lookup than i32 because of its type (of type). It seems like if among the equalities we have for a symbol, one of them is a constant known at type checking time, we should treat it more like a template and actually use the value. I further feel like generic values decay to dynamic values when you perform operations on them, but template-like values produce more template-like values.

x + y
x: T y: T dynamic
x: T y:! T dynamic
x: T template y:! T dynamic
x:! T y:! T dynamic
x:! T template y:! T dynamic
template x:! T template y:! T template
zygoloid commented 2 years ago

It's not clear to me that we want to treat X.N like its value in general. Given:

fn F(X:! Constraint where .N == 2, Y:! Constraint where .N == 2) {
  var a: Array(X.T, X.N) = ...;
  var b: Array(X.T, Y.N) = a;
}

... I think we previously had said we wanted a type-checking failure, because X.N and Y.N are not single-step equal, and that this could be resolved with something like observe X.N == 2 == Y.N;. We had at least said something similar for the corresponding case where a type is changing rather than a non-type value:

fn F(X:! Constraint where .T  == i32, Y:! Constraint where .T == i32) {
  var a: Array(X.T, X.N) = ...;
  var b: Array(Y.T, X.N) = a;
}
mconst commented 2 years ago

(In your second example, was the first Y.T supposed to be X.T?)

I think it's okay to require observe in some or all of these cases if that makes things easier for us, but requiring observe doesn't feel desirable to me. If we can make this code just work, that seems more intuitive and more convenient! I certainly wouldn't want to add complexity in order to reject code like this.

josh11b commented 2 years ago

This is definitely intertwined with the question from #1369 . When you say "we previously had said we wanted a type-checking failure, because X.N and Y.N are not single-step equal", I think the situation could more accurately be described as: we are performing an experiment with the current design and seeing if the results are desirable, and so we are trying "single-step equal" which has the consequence that code has a type-checking failure, but not that this specific result was desirable.

The observe experiment is aiming for a simple rule to achieve these goals:

But it is an experiment since there are definite downsides that may outweigh the benefits. In particular, observe declarations themselves add code detracting from readability, writability, and compilation speed, so we really want them to be rare. They are particularly bad if which declarations are needed is difficult to predict in advance (have to round trip through a compile cycle), and if they only answer a question for the compiler not the reader.

I feel like your recent work has been calling into question whether this rule is easy to predict and has been coming up with a variety of situations where observe declarations are needed, raising concerns that they are going to be too frequent. So I think it is definitely worth considering alternatives to the current "one-step equality" rule, and considering rules that make your example compile is a part of that.

zygoloid commented 2 years ago

So I think it is definitely worth considering alternatives to the current "one-step equality" rule, and considering rules that make your example compile is a part of that.

That makes sense to me, but that's going beyond what I want to ask in the context of this question, and I think is a significant part of what #1369 is exploring. The context that I'd intended for this question is: assuming that we stick with the one-step equality rule, how do we answer this question? I've updated the summary of this issue to reflect that.

If the answer is that we don't have a good answer, that's definitely strongly pushing towards a different set of rules, but I'd like to explore whether we can make the current ruleset work before we try something else that will very likely present a different set of tricky questions -- that is, I want to complete the one-step equality experiment so we have a concrete comparison point, and this question seems to be on the critical path to that goal.

How much can we keep track of the value and type separately? Treating X.N like its value 2 seems to be what we want here, the only problem is X.T has additional places to do name lookup than i32 because of its type (of type).

I think we might want to somehow track the type, the symbolic value, and perhaps the concrete value separately. The fact that X.N or X.T was written as X.N / X.T is not part of its type, and the one-step equality rule requires that we track how such an expression was written. So we must preserve X.N as the symbolic value X.N when evaluating the Array(X.T, X.N) expression, somehow, whether in the value result of the evaluation of X.N or as some kind of side-channel that behaves like it's the value for symbolic purposes.

Some options to consider:

github-actions[bot] commented 2 years ago

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.

josh11b commented 1 year ago

@zygoloid #2173 changes the calculus here: We expect to use = instead of == with concrete type values, and so we don't need to worry about the ergonomics of that case. Using == instead makes it clear that we are not going to try and resolve it to a specific value. Does that give a simple answer to this question?

zygoloid commented 1 year ago

I think that gives us the possibility of having a simple answer: we can refuse to perform any calculation on a symbolic value such as X.N that is only single-step-equal to a constant, rather than being rewritten to a constant. I don't know whether that will be sufficient for the use cases that we will want to model or not.

A somewhat related case is:

interface Sized { let Size:! i32; }
impl i32 as Sized where .Size = 4 {}
fn GetSize(T:! Sized) -> i32 { return T.Size + 1; }
class C(template N:! i32) {}
var v: C(GetSize(i32));

Does that last line compile? (Does the third line even compile?)

I guess the question here is, if we have a concrete impl in scope, can we perform member access into that impl in a compile-time computation? I think that probably has to work. But then, what about:

fn Generic[T:! Sized where .Size = 4]() {
  var v: C(GetSize(T));
}

No rewrite happens here, because we have no member access naming T.Size in a scope where the rewrite is visible. Does the compile-time call to GetSize fail, or does it succeed by looking into the constraint Sized where .Size = 4? Because the parameter of C is a template parameter and T is a generic parameter, I think we need to know what GetSize(T) returns before we know the value of T in order for this to compile.

Some options: 1) A symbolic expression such as T.Size can only be evaluated at runtime. At compile time, if it's not rewritten to a constant, then it's not a constant and can't be used in a situation where a template constant or compile-time value is needed. 2) A symbolic expression such as T.Size always means "look up Size in the witness table. This can always be done at runtime, but can only be done during compilation if a concrete witness table is known. So the first example is accepted, because we will build and pass into the call to GetSize a witness table for i32 as Sized. But the second example is rejected, because we only have a symbolic witness for T as Sized. 3) A symbolic expression such as T.Size, when evaluated with a symbolic witness, asks the witness for the value of the constant as specified in a rewrite constraint. That would accept Generic above, but not the equivalent example with a == constraint. 4) A symbolic expression such as T.Size, when evaluated with a symbolic witness, asks the witness for the value of the constant as specified in a rewrite or equality constraint. We only look through one step of equality constraints. If we don't reduce to a concrete value, then the expression is non-constant. 5) A symbolic expression such as T.Size, when evaluated with a symbolic witness, asks the witness for the value of the constant as specified in a rewrite or equality constraint. Equality constraints are recursively searched until we find a constant value. If we don't find a constant value, the expression is non-constant.

Option (5) is probably easy to eliminate because I don't think it's computable in full generality, for the usual reasons that transitive equality is hard. Option (1) seems substantially too restrictive to me, and option (4) seems to be granting == constraints a little too much strength. So I think we should go with (2) or (3) (or something not listed above), and I think I'm inclined to side with (2), at least until we see evidence that it doesn't work well enough in practice. We can move to (3) or even (4) later if we find a need.

It seems like there are two variants to consider:

a) Any compile-time evaluation of T.Size always performs a witness lookup. b) Only compile-time evaluations that need an actual value perform a witness lookup; any other evaluation preserves the symbolic value.

Option (b) seems appealing, but still has some oddness:

class CTemplate(template N:! i32) {}
class CGeneric(N:! i32) {}
fn GetSize(T:! Sized) -> i32  { return T.Size; }
fn Generic[T:! Sized where .Size = 4]() {
  // OK, rewritten to CTemplate(4)
  var v1: CTemplate(T.Size);
  // OK, rewritten to CGeneric(4)
  var v2: CTemplate(T.Size);
  // error, symbolic argument `T.Size` given to template parameter,
  // don't have a witness so can't convert to a concrete value
  var v3: CTemplate(GetSize(T));
  // OK, but type is `CGeneric(T.Size)`, not `CGeneric(4)`
  var v4: CGeneric(GetSize(T));
  // error, symbolic argument `T.Size` in operand of `+`.
  var v5: CGeneric(GetSize(T) + 1);
  // OK via single-step conversion, but this is fragile and
  // more distant conversions can fail by requiring more than
  // one step
  v2 = v4;
}

I wonder if we can get away with (2a). That seems like the cleanest and simplest model, but it's restrictive. Alternatively, we could try also performing rewrites on any symbolic values produced by constant evaluation, but that would still reject v5.

We should also look at examples of type expressions, where I think there will be strong pressure for something like (b) to work.

I think we need some more discussion and exploration of options here.

josh11b commented 1 year ago
interface Sized { let Size:! i32; }
impl i32 as Sized where .Size = 4 {}
fn GetSize(T:! Sized) -> i32 { return T.Size + 1; }
class C(template N:! i32) {}
var v: C(GetSize(i32));

Does that last line compile? (Does the third line even compile?)

It seems to me that the third line compiles, I'm not even sure what the question is. T.Size has type i32 and so + 1 is allowed, but may overflow at runtime. The last line depends on #2153 . According to my understanding of our most likely resolution of #2153 , the C(GetSize(i32)) in the last line type checks since GetSize returns an i32 which matches the type of N, and triggers the evaluation of GetSize(i32) at type-checking time in order to evaluate the value of a template parameter. This evaluation looks to me like it should succeed, since the body of GetSize is visible at the call site, and the compile-time execution of that body relies on impl lookup for a concrete type, which seems like something we have to allow (and our coherence rules should make possible and consistent), and does not result in overflow.

fn Generic[T:! Sized where .Size = 4]() {
  var v: C(GetSize(T));
}

This again depends on the resolution of #2153 . My understanding of our resolution of #2153 is that the checked-to-template conversion that happens when instantiating C makes C(GetSize(T)) template dependent (or at least template validity dependent) and only fully type checked when the value of T is known.

Does the compile-time call to GetSize fail, or does it succeed by looking into the constraint Sized where .Size = 4?

I believe the answer is neither -- it succeeds without using the constraint. That is to say, it would still succeed even if that constraint were not present.

Because the parameter of C is a template parameter and T is a generic parameter, I think we need to know what GetSize(T) returns before we know the value of T in order for this to compile.

This statement disagrees with the way I understand we are leaning on #2153 .