carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
http://docs.carbon-lang.dev/

Checked generics calling templates #2153

Closed josh11b closed 1 year ago

josh11b commented 2 years ago

Question: Given a function TemplateFunc with a template parameter, defined either in Carbon or C++, when is it legal for a generic function GenericFunc to call it?

fn TemplateFunc[template TT:! TC](x: TT);
fn GenericFunc[GT:! GC](x: GT) {
  TemplateFunc(x);
}

Since TemplateFunc has a template parameter, it may have a monomorphization error, where the error is not detected when TemplateFunc is defined, but when it is called with a concrete type value for TT. This is expected, since templates are specifically for the cases when the definition of TemplateFunc cannot be checked on its own (or C++ interop). Constraints (like TC) are not required on template functions, but having them, assuming they match the actual conditions for template instantiation to succeed, "shifts errors left / earlier in the call stack." This narrows down where the error could be, improving the user experience. Named constraints may be reused, avoiding repeating constraints and maybe getting out of sync.

Background: Previously this question was investigated in #136 .

Note: I am assuming that there are two kinds of constraints: symbolic constraints (like `TC` and `GC` above) and value constraints (like `where .Self >= 0` in the examples below).

Options from most permissive to least:

  1. Always allowed, but may generate a monomorphization error in GenericFunc if the concrete GT used to instantiate GenericFunc does not satisfy the constraints on TT. "Calling a template exposes you to monomorphization errors, even from generics."
  2. Always allowed, but the constraint GC may optionally include value constraints, including TC. This would give GenericFunc the option of including TC in the constraints on GT. Non-generic callers of GenericFunc would validate the value constraints like TC. Generic callers of GenericFunc would either ignore value constraints like TC, or evaluate them at monomorphization time, potentially triggering monomorphization errors in the caller. Value constraints don't give additional capabilities on GT in the body of GenericFunc. "Generic functions may optionally include template constraints to shift monomorphization errors left/earlier in the call stack."
  3. Allowed only if GC, the constraints on the type of x passed to TemplateFunc, includes all constraints in TC. Value constraints would have to be propagated by generic types. This means generic functions would not have monomorphization errors from calling templates, but template constraints would have to be added to callers transitively. "Generic functions include constraints of every template it calls." Option: We could introduce an escape hatch where a value constraint may be asserted in a generic function, allowing a template function to be called at the cost of the assertion possibly failing at monomorphization time.
  4. Never allowed.

Discussion about option 3:

Some care will be needed to handle cases where the condition on the generic is not the same as on the template, but I currently believe these cases are solvable. One case of this: the type parameters of the calling generics might be transformed before being passed to the template. For example, GenericFunc could pass a value of type GT* or Vector(GT) to TemplateFunc, which affects what is written in the constraint GC.

Out of scope: There is some need to define an interface if there are multiple possible definitions with the same name that could be called, due to overloading or specialization. That is something to consider separately.

geoffromer commented 2 years ago

...a lot of existing C++ templates don't have constraints yet.

Since our interop/migration target is C++17, don't we need to assume that all C++ templates lack constraints?

zygoloid commented 2 years ago

Regardless of what GC and TC are, and regardless of which option we select, I think we can always rewrite the example as:

fn TemplateFunc[template TT:! TC](x: TT);
fn Wrapper[template T:! Type](x: T) {
  TemplateFunc(x);
}
fn GenericFunc[GT:! GC](x: GT) {
  Wrapper(x);
}

I think GenericFunc must successfully type-check here, if we want templates to be usable from generics at all. So any requirement to propagate constraints upwards can always be broken by inserting an unconstrained template to "localize" the monomorphization errors. I think this means we're a little freer to make the choice that we think is right, because there's an escape hatch.

Philosophically: when I write a generic function with constraint [GT:! GC], I mean for the function to work for every GT that satisfies GC; this is a major difference from [template GT:! GC], which means the function should work for some GTs that satisfy GC but not necessarily all such types. So if the generic contains a call to a template that is known to not work for some GT that satisfies GC, then I think we're justified in rejecting that.

That to me suggests that (3) is the right option: we should only allow a generic to call a template if all of the template's constraints are satisfied by the generic's constraints. The reason is: if there is a constraint C on the template that is not required by the generic, then (we assume that) there can exist types which satisfy the constraints on the generic but do not satisfy C, and so (we assume that) there can exist types that satisfy the generic's constraints but for which the generic would not "work". This is true whether C is a value constraint or a symbolic constraint.

I think this would mean that a generic cannot call a template that has value constraints unless those value constraints are known to be satisfied regardless of the generic parameters.

That said -- it's not clear to me that C++20 concepts can't be modeled symbolically in Carbon. Specifically, given a C++20 concept X<T>, it seems to me that we could model that in Carbon as a symbolic T:! X constraint, but that constraint would have an empty interface and would not be implementable by Carbon programs, so you'd still only be able to use functionality that the concept promises to exist from within a template.

josh11b commented 2 years ago

This was discussed today in open discussion

There were questions about what happens if an operation is performed on a generic value with value constraints on it, as in:

class Array(T:! Type, template N:! i64 where .Self >= 0);

fn PassesThrough(N:! i64 where .Self >= 0) {
  // Okay, N has the necessary constraint
  var a: Array(i32, N);
}
fn OperatesOnN(N:! i64 where .Self >= 1) {
  // ???
  var a1: Array(i32, N - 1);
  // ???
  let M:! (i64 where .Self >= 0) = N - 1;
  var a2: Array(i32, M);
  // Will do template checking, will succeed at generic type checking time, could theoretically fail during monomorphization
  let template P:! i64 where .Self >= 0 = N - 1;
  var a3: Array(i32, P);
}
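A C++ sketch of the pass-through vs. transformed-value distinction above, using a non-type template parameter (names illustrative; the `static_assert` fires at instantiation time, C++'s equivalent of monomorphization):

```cpp
#include <cassert>
#include <cstdint>

// Loosely analogous to `class Array(T:! Type, template N:! i64
// where .Self >= 0)`: the value constraint is checked when the
// template is instantiated with a concrete N.
template <std::int64_t N>
struct Array {
  static_assert(N >= 0, "Array size must be non-negative");
  static constexpr std::int64_t size = N;
};

// Passing N through unchanged would clearly be safe given N >= 1;
// the interesting case from the discussion is the transformed value
// N - 1, whose constraint the compiler cannot track symbolically.
template <std::int64_t N>
std::int64_t SizeOfSmaller() {
  return Array<N - 1>::size;  // well-formed whenever N >= 1
}
```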

This argument was found to be suspicious:

The reason is: if there is a constraint C on the template that is not required by the generic, then (we assume that) there can exist types which satisfy the constraints on the generic but do not satisfy C, and so (we assume that) there can exist types that satisfy the generic's constraints but for which the generic would not "work". This is true whether C is a value constraint or a symbolic constraint.

Reasoning: It seems like in general it is hard to determine whether conditions A and B imply C, whether you are talking about types or values.

Options 1 and 2 were found to be very similar. In option 2, constraints would appear in the signature of a function. In option 1, those same constraints could be enforced in the first line of the function by calling a templated StaticAssert function. Option 1 might be more desirable if the constraints aren't really enforced at type-checking time when a generic calls a generic.

josh11b commented 2 years ago

Recently option 3 has been discussed with the idea that value constraints would be named. A named value constraint would be called a "predicate." The only reasoning about constraints would be "type has the right set of predicates". Example:

// This is just placeholder syntax

// NonNegativeInt64 is an `i64` type that must be >= 0.
predicate NonNegativeInt64 = (x: i64) { return x >= 0; }

// Template version that works for any type
predicate NonNegative = (x: auto) { return x >= 0; }

// `NonNegativeInt64` equivalent to `i64 & NonNegative`,
// but couldn't use one to substitute for the other.

// NotTooBig(T) is an `i64` with a limit that depends on the size of `T`.
predicate NotTooBig(T:! Sized) = (x: i64) { return x * T.Size < (1 << 48); }

// `N` parameter constrained by two predicates
class Array(T:! Sized, template N:! NonNegative & NotTooBig(T));

// Generic function with predicate constraints
fn CreatesAnArray(N:! NonNegative & NotTooBig(i32)) {
  // Allowed, `N` has the required predicates
  var a: Array(i32, N);

  // Not allowed, operations lose predicates.
  var error1: Array(i32, N + 1);

  // Not allowed, wrong predicates.
  var error2: Array(bool, N);

  // Presumably the escape hatch would look something like this
  // (which could fail at monomorphization time):
  let template NPlusOne:! NonNegative & NotTooBig(bool) = N + 1;
  var b: Array(bool, NPlusOne);
}
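One hypothetical C++ rendering of named value predicates (all names illustrative, and only an approximation of the proposal, since C++ has no predicate propagation rule): each predicate is a constexpr function, and the class template checks them at instantiation time.

```cpp
#include <cassert>
#include <cstdint>

// A predicate on a value, analogous to `NonNegative` above.
constexpr bool NonNegative(std::int64_t x) { return x >= 0; }

// A predicate parameterized by a type, analogous to `NotTooBig(T)`.
template <typename T>
constexpr bool NotTooBig(std::int64_t x) {
  return x * static_cast<std::int64_t>(sizeof(T)) <
         (std::int64_t{1} << 48);
}

// The template enforces both predicates when instantiated with a
// concrete N, i.e. at C++'s "monomorphization" time.
template <typename T, std::int64_t N>
struct Array {
  static_assert(NonNegative(N), "N must be >= 0");
  static_assert(NotTooBig<T>(N), "N too large for element type");
};
```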

chandlerc commented 2 years ago

In addition to what @josh11b mentions, a couple of other points that came up...

One: the actual checking of these predicates is intrinsically "template" phase (during late-bound type checking). Before that, the only modeling is through the name itself as a symbolic predicate.

Two: the utility of (3) is merely to facilitate propagating predicates from an inner (typically template) context where they are functionally required (the implementation of Array above) through any checked generic code out to the concrete code, where the diagnostic is most trivial to present to the user. It doesn't change expressiveness or enable anything beyond compiler verification that checked generic layers reliably propagate predicates. That compiler checking comes at a pretty direct cost to the independent evolution of predicates: at whatever layer we begin to enforce that predicates are propagated, changes to them become potentially breaking changes for callers, much like changes to checked generic type constraints are potentially breaking changes for callers.

In essence, this is a tradeoff between early checking of predicates and enforced propagation once checked vs. trivially evolving templates (and intervening generics) to begin explicitly checking predicates that already happen to hold. The first provides some "shifted-left" developer experience benefits when writing code that fails to satisfy a predicate. The second provides some ease of deploying more strict explicitly checked predicates.

My initial instinct is to prioritize propagating predicates and checking them earlier, as tightening these kinds of predicates tends to be more rarely done and an undertaking that can afford to work through various workaround techniques to manage the incremental rollout. But maybe I'm wrong about that. I think both models are workable, and it also seems conceivable for us to switch in the future. Although, it seems somewhat less work to start by enforcing predicates and then relax the rules than start w/o that enforcement and introduce it later.

geoffromer commented 2 years ago

var b: Array(bool, N + 1);

Did you mean this to be

var b: Array(bool, NPlusOne);

josh11b commented 2 years ago

var b: Array(bool, N + 1);

Did you mean this to be

var b: Array(bool, NPlusOne);

Yes, I did. I've edited my comment to fix that.

josh11b commented 1 year ago

Discussed in open discussion today.

chandlerc commented 1 year ago

I think both @zygoloid and I are happy with option (3).

A key clarification here is that constraints on a template binding are checked when initializing that binding, but after it is bound as a template, it becomes dependent and anything further is late-checked.

The result is that we don't need a special escape hatch; one comes built in, though we expect it to look a bit different from what was described previously. The example of an escape hatch should be something like:

  let template NPlusOne:! auto = N + 1;
  var b: Array(bool, NPlusOne);

More details of this in the open discussion that @josh11b linked.

Marking this as resolved since I think we have enough for leads consensus.


Attempted capture of the rationale for the remaining issues discussed: