How to overload / specialize / dispatch based on the expression phase of the argument

chandlerc commented 1 year ago

This is largely my memory & summary of an open discussion from a few weeks ago. It started with all of @josh11b, myself, and @zygoloid, but most of it was between just @zygoloid and myself so we didn't have notes taken live. I'm trying to capture everything I can though, as I think this gives enough context to begin really digging into this as a leads question.

Summary

We expect a common pattern where functions may be evaluated with either compile-time or run-time parameters:

fn CTMax(X:! i32, Y:! i32) -> i32;

fn RTMax(x: i32, y: i32) -> i32;

Note: there is a related question about how to mark that the function can be fully evaluated at compile time and produce a compile time result. But this issue is more focused on selecting between implementations based on different input expression phases.

Beyond an explicit pair of functions which vary in the expression phase of their arguments, it is important to also consider whatever interface system in generics is used to generically dispatch to some callable entity (a function, lambda, etc). So far, this has been described as the Call interface.

After many discussions, it feels to me like there are three core directional options that emerge:

1) The Call interface is runtime only. If needed, explicit function overloads can overload on compile-time phase, but that uses its own mechanism and is explicitly not modeled in the generics layer.

2) Base dispatch and writing of the call interface on the typesystem by fully modeling compile-time vs run-time. This would mean first changing such that types say whether values that inhabit that type are compile-time or run-time in nature, which is also expected to obviate the need for :! in bindings. And then leveraging that as the basis for overloading.

3) The Call interface, specifically what parameters may be passed to it in a call expression, cannot be represented as a sequence of types. As a consequence, we must build a richer & more expressive way to describe this interface.

While there are a lot of important details to work out about each of these three in order to pursue it, it seems useful to understand which of these directions we're trying to pursue. At the least, choosing between these three directions seems like a reasonable request from the leads.

I'll write up some thoughts about each of these direction in subsequent comments, but leaving the summary as just the high level direction and encouraging others to edit that summary to capture things as much as possible.

chandlerc commented 1 year ago

Looking at option (1), it is very close to the status-quo in Carbon. It is also very much the approach of C++ currently. So in some ways, this is the easiest direction to pursue. And C++ does seem to be coping OK with it.

However, C++ has also seen a steady and rising amount of pressure from the community to leave this position and become more expressive in precisely these ways.

chandlerc commented 1 year ago

Looking at (2):

One observation is that there is another direction that may seem distinct but I think is usefully included in (2), let's call it (2b): Only types are compile-time, and values that need to be compile-time are exactly those values where the type carries the value fully. For example typeof 42 in Carbon's current design for integer literals.

One concern that I had with all of the options in (2) is that I think they will end up motivating a bifurcation of the type system and vocabulary types. For each container or vocabulary wrapper, we'll need a runtime and a compile-time flavor.

One possible way to make this more amenable is to have a qualifier like comptime or ! that can be applied to a type name in order to easily name both flavors of types. However, that raises the question of how to handle when the definitions of the types need to be substantially different.

chandlerc commented 1 year ago

Looking at (3):

Richard and I talked pretty extensively about this option. Specifically, we explored what it would look like to deeply embrace this direction and try to get something powerful.

What we came up with looked something like the following impl:

impl ... as Call(fn_type(a: i32, var b: i32, T:! type, x: T)) where .Result = i32;

But maybe with a call keyword (or some other keyword) to specially name the call interface, and have it be directly followed by a pattern just like a function declaration would be:

impl ... as call(a: i32, var b: i32, T:! type, x: T) -> i32;

The end result is that a function declaration:

fn F(a: i32, var b: i32, T:! type, x: T) -> i32;

Is just syntactic sugar for something lower level:

class F_type {};
impl F_type as call(a: i32, var b: i32, T:! type, x: T) -> i32;

let F:! F_type = {};

Some observations about this direction:

Signatures become first-class entities in the language
impl selection delegates to custom rules when selecting for a signature
- this is where we have whatever rules we want for overloading -- maybe you can overload on :!, but not on template or var.
- can model other signature relationships like virtual dispatch
requires a way to reify a generic signature into a function declaration & definition
maybe a path to replace like with tools build on signatures and call.
need to build a coherence model for signatures
- likely by generalizing the binary like rule -- for an N-ary call, N-1 args may implicitly convert and there must be one remaining that anchors for coherence
- overloading takes the place of "specialization"
Allows us to be even more able to model exactly what C++ does with ADL but in a principled way in Carbon on top of generics
- These signatures and the coherence rules are pretty much an exact model for how to have ADL without ODR and with clear rules for what is allowed.

We also saw some possibilities to further iterate the syntax to try to get to the old goal of having dedicated support for single-function interfaces with a single name, and the result just being part of the structure. But that's not essential.

(I think I've recovered all of this, but please let me know if I missed something. Images of the whiteboard notes which contain a few more examples can be found here: https://docs.google.com/document/d/1gnJBTfY81fZYvI_QXjwKk1uQHYBNHGqRLI2BS_cYYNQ/edit?resourcekey=0-ql1Q1WvTcDvhycf8LbA9DQ#bookmark=id.6zetfzq4wsme)

chandlerc commented 1 year ago

Lot's of fresh discussion from today's open discussion session, decent notes (huge thanks to all!!) here: https://docs.google.com/document/d/1gnJBTfY81fZYvI_QXjwKk1uQHYBNHGqRLI2BS_cYYNQ/edit?resourcekey=0-ql1Q1WvTcDvhycf8LbA9DQ#heading=h.x7xa7wz11td3

I'll try to provide my summarized take-aways here, happy for others to add theirs if I've missed or put too much of my feeling into this...

Super high level: everyone was pretty excited to pursue (3). There are some challenges there, but seems worth trying to address them. And seems better to do this now than wait. We also came to some agreement that (2) had too many problems to pursue at this point / in this form.

Read-on for more details...

Re-iterated that there are three useful categories of functions to consider, which for now we are often naming using C++ terminology but we might want our own terminology:

Runtime-only functions: typically just "functions" in C++
Runtime or Compile-time functions: constexpr in C++
Compile-itme only functions: consteval in C++

We talked about (2) quite a bit. One interesting point is that we already have some aspects of this emerging as useful things:

Types are expected to largely have no runtime operations or usage -- you'll have to do something explicit to get any runtime metadata even, much less a runtime erased type.
Some things are runtime-specific like the runtime heap.

Generally, it seemed like there was some motivation here, but when we dug into it, there were some really interesting issues.

A constexpr-style function evaluated at compile time and taking an i32 is very different from a function taking a compile-time i32. Both of them need a compile time argument, but how they internally use that compile-time i32 vary greatly -- the constexpr function wouldn't use it as part of a type for example but the other might do so reasonably.
Having compile-time-ness as a binary property of the type breaks compositional type checking
- Consider Vector(i32!) (where i32! is the spelling of a compile-time-only i32) -- you expect to be able to add compile time integers and extract them from this container. But what you get back can't be a compile time integer!
- Even something like Vector(typeof 3) where typeof 3 is a compile-time type which only has one value, 3 inhabiting it. You still can't extract a typeof 3 from this, because it might fail.
The simple ways of being generic over all types aren't actually sufficient to be generic across runtime and compile time. Would need more constraints to model this, and might be confusing for users who don't expect to have to work with that.

Adding these interesting insights to some of the prior ones (the bifurcation of vocabulary types), everyone seemed really happy saying "no" to (2).

We also talked some about (3). General excitement about the direction.

One issue is whether this is syntactic sugar or a primitive. To a certain extent, may not matter too much, but there were arguments in several directions.

One concern raised is that we may find that there is a tension where we want selecting an impl for a signature to use a really different model than what we have desired for types. General agreement this could cause a bit of a schism and that would be bad. But goal is to avoid that, and try to keep the rules as unified and integrated as possible.

One thing we need to figure out how to make these constructs actually work in all the relevant ways. For example, how do you match a constraint and an impl?

interface I(X:! Signature) {
}

class C {};
impl C as I(callable (a: i32));
// Want a call that can match C given an argument list, not a signature.
fn F(a: i32, T:! I(call(a))) { ... }

This idea resulted in us thinking about needing to articulate both an abstract "call" argument list, and an abstract "callable" pattern list that would match those calls.

So when we have F(a, b), we turn that into F as call(a, b), and look for an F with impl callable whose pattern matches a and b like callable(x: i32, y: i32).

josh11b commented 1 year ago

Just a comment that while I think that we do want to pursue option (3), I don't think that should block #2875 which currently implements option (1), other than to note that there is more work to be done to incorporate pattern information beyond type in the Call interface or overload resolution (see https://github.com/carbon-language/carbon-lang/pull/2875/files#r1310931830 ).

carbon-language / carbon-lang

How to overload / specialize / dispatch based on the expression phase of the argument #3154

Summary