Support the ability for a user to define a type constructor

bradcray commented 1 year ago

Today, when a record or class is generic, the compiler generates a type constructor for it. However, in some cases, the compiler's notion of the type constructor may be overkill or not the way the user intended for the type signature to be written. For example, given:

record R {
  type t;
  param p: int;
  type tupType = p*t;
}

the compiler's type constructor would be something like:

proc type R.init(type t, param p: int, type tupType = p*t) {
  this.t = t;
  this.p = p;
  this.tupType = tupType;
}

Yet if the type author didn't want tupType to be configurable, and wanted the type signature to take at most two arguments, they should be able to write something like:

proc init(type t, param p: int) type {
  this.t = t;
  this.p = p;
}

in order to constrain the type to forms like R(int, 3), R(p=5, t=real), R(string, ?), etc.
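For contrast, here is a rough sketch of what the compiler-generated type constructor appears to accept in today's Chapel. The aliases `A` and `B` are illustrative names, and the assumption is that a type field with a default can be overridden positionally, as the generated signature above suggests:

```chapel
record R {
  type t;
  param p: int;
  type tupType = p*t;
}

// tupType takes its default here, 3*int
type A = R(int, 3);

// ...but a caller can also override it, which is the case the
// proposed user-defined type constructor would rule out
type B = R(int, 3, string);

writeln(A.tupType: string);
writeln(B.tupType: string);
```

With a two-argument user-defined type constructor, the second form would presumably become an error.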

This issue proposes that we add the ability for users to create such type constructors, which would override the compiler-generated one. My current thought is that we should start by limiting each type to a single type constructor, both for simplicity and because I think it will cover the common cases where we've wanted/needed these so far. Over time, we may want to add the ability to create additional ones, though that carries other challenges along with it (see the comment stream starting at https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1414441893 for relevant discussion).

In https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1416089283, @mppf expressed that he believed this approach would also have benefits for fields of generic type:

Other Generic Fields

Chapel classes/records also can have generic fields declared like var x; or var y: SomeGenericType;. I think that these can be handled by a custom type constructor as well. In fact, requiring a custom type constructor for such cases would sufficiently address the problem described in #19120 (in my opinion). If we were to require a type constructor for such cases, can we also solve the default-initialization problem described in #16508 ?

Here is an example that I think demonstrates that it can solve both of those problems.
record XR {
  var x;  // note: this example applies equally well if this were 'var x: integral;'

  // custom type constructor
  // since it takes a generic type as an argument, it's easy to
  // see that this type is generic
  proc type init(type xType) {
    this.x.type = xType; // sets the type of 'x'
  }

  // Default initializer using Option 1 from above
  // (Named Arguments w/ Type Constructor Names)
  proc init(type xType) {
    this.x = 1: xType; // default initialize 'x' to '1' with the appropriate type
  }
  // -- or --
  // Default initializer using Option 2 from above
  // (Using this.type working with the generic field's names)
  proc init() {
    this.x = 1: this.type.x; // default initialize 'x' to '1' with the appropriate type
  }
}

There are a few open questions here:

[ ] Terminology: I have traditionally thought of these as "type initializers" because I expect they would be defined using init() and they are initializing a type (rather than a value, so I'd call our current initializers "value initializers" or "instance initializers" or "object initializers"). Michael argues that they should be called "type constructors" to make the distinction stronger and to avoid the implication that a "type initializer" initializes an instance of the given type. See the two comments at https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1406518863 for this argument in his own words.
[ ] Syntax: As a straw-person proposal, consider:
```
record R {
  proc type init(type t, param p: int) {
    this.t = t;
    this.p = p;
  }
}
```
This is a slight abuse of the type method syntax since we're not calling the initializer on anything; but of course, that's true of value initializers as well. And this within the body of the procedure does refer to a type, much as this in a value initializer refers to the object in question, so this seems symmetric in its slight weirdness.

Other, less attractive alternatives include:
- using type-returning method syntax, proc init(type t, param p: int) type { ... }: This is also an abuse since we're not returning anything, and also a bit weird since it looks as though this should be a value.
- using proc type this(type t, param p: int) { ... }: Since we're essentially adding support for applying arguments to (accessing) a type, we could lean on our value accessor syntax. Of course, this is already under consideration for a name change, so this is unstable soil, and it also feels unfortunate that it's not more symmetric to initializers.
[ ] Printing types: Also, in https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1416742525, Michael points out that even if we restrict the user to a single type constructor, there can still be challenges in determining how to print out the type for cases that are created from a value initializer.

vasslitvinov commented 1 year ago

Not to affect the push of this issue... Specifically for the example in the OP, making tupType not configurable is better expressed / more Chapel-tastic in today's Chapel using a parenless type method. Analogously in the world of values, if I do not want c to be configurable or stored at runtime in the example below, I am better off writing it as a parenless value method:

record R {
  const a, b: int;
  const c = a * b;
}

Of course I could also go through the trouble of writing a value initializer that accepts only a and b, making c cache the multiplication.
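For reference, a minimal sketch of the parenless-method alternatives being described, in today's Chapel. The record name `Prod` is made up for illustration, and the type-level variant is written here as an instance method (`r.tupType`) rather than a `proc type` method, which is an assumption about the simplest form that works:

```chapel
// Value version: 'c' is computed on demand, so it is neither stored
// nor settable through the compiler-generated initializer.
record Prod {
  const a, b: int;
  proc c { return a * b; }
}

// Type version of the OP's example: 'tupType' is a parenless
// type-returning method instead of a configurable type field.
record R {
  type t;
  param p: int;
  proc tupType type { return p*t; }
}

var v = new Prod(2, 3);
writeln(v.c);               // 6

var r: R(int, 3);
writeln(r.tupType: string);
```

Here R(int, 3) remains the only way to spell the type, since tupType is no longer a field.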

mppf commented 1 year ago

> Not to affect the push of this issue... Specifically for the example in the OP, making tupType not configurable is better expressed / more Chapel-tastic in today's Chapel using a parenless type method.

See also #12613. I think the tenor of the discussion there is that the current way of writing it as a parenless type-returning type method is unsatisfying. IMO having this type constructor strategy to make some type fields really just be defining a type alias is appealing.

bradcray commented 1 year ago

Note: Edited the OP to quote and refer to additional items brought up in https://github.com/chapel-lang/chapel/issues/21456, as Michael suggested in https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1488904279.

benharsh commented 1 year ago

Does this mean that we'd want to generally support passing ? as a kind of "any type/param-value" to such formals?

Would we want a way for users to indicate that ? cannot be passed to certain type or param formals? This could extend to a way of indicating that a partial instantiation or generic type could not be passed to certain type or param formals.

I think it could be unfortunate if type constructors had special rules regarding formals, at least if other methods could not take advantage of those rules. This kind of flexibility could be useful if users wanted to implement a function or method that wrapped a type constructor, though I can't think of a good example right this moment.

bradcray commented 1 year ago

In our generics meeting this week, Michael suggested that ? could be used as a way of creating a curried procedure call in general. So, for example, imagine:

proc mult(x: int, y: int) {
  return x * y;
}

var bad = mult(2);  // error: not enough arguments
var double = mult(2, ?);  // OK: mult is a curried function that looks like:  `proc mult(y: int) { return 2 * y; }`
var twenty = double(10);  // OK
var fifty = double(y=50);

In such a world, you could imagine that for a type constructor like:

proc type R.init(type t, param p: int) {
  this.t = t;
  this.p = p;
}

expressions like R(real, ?), R(p=3, ?), R(t=string, ?) would all just be curried versions of the type constructor that could then have the remaining arguments filled in at some later point.

That said, I'm not confident that implementing the former would cause the latter to fall out given that partially instantiated types are a thing in Chapel. So where the former aren't ever meaningful or useful until the rest of the arguments are provided, in the type constructor case, we'd need to accept the partially-specified version, compare it to the value initializer to make sure things matched, etc.

But even if they can't completely share an implementation, it feels like the same concept to me, just with different constraints in the type vs. value setting.
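To make the curried-call idea concrete, the effect of mult(2, ?) can be roughly emulated in today's Chapel with a record that stores the supplied argument and defines a `this` method. This is only a sketch of the concept, not the proposed feature; `MultWith` and `timesTwo` are hypothetical names:

```chapel
proc mult(x: int, y: int) {
  return x * y;
}

// Hypothetical helper: holds the argument of 'mult' supplied so far.
record MultWith {
  const x: int;

  // Defining 'this' makes instances callable like a procedure.
  proc this(y: int) {
    return mult(x, y);
  }
}

const timesTwo = new MultWith(2);  // plays the role of mult(2, ?)
writeln(timesTwo(10));             // 20
writeln(timesTwo(y=50));           // 100
```

The type-constructor case differs in that a partially specified R(real, ?) is already a meaningful generic type on its own, whereas `timesTwo` is only useful once called.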

vasslitvinov commented 1 year ago

I like this idea! One challenge with ? with generics is unification. For example, how to unify MyTypeFunction(int, ?) and MyGenericRecord(real, 3, ?) when the two are, indeed, unifiable?

bradcray commented 1 year ago

@vasslitvinov : I think Michael took a stab at that in the "Partial Instantiations: Impact on Regular Initializers" section of his comment at https://github.com/chapel-lang/chapel/issues/21456#issuecomment-1416089283 (?).

mppf commented 1 year ago

> In our generics meeting this week, Michael suggested that ? could be used as a way of creating a curried procedure call in general.

I don't remember bringing this up, but that sounds OK at a conceptual level! However, if/when we try to implement user-defined type constructors, I would expect we would simply not allow partial instantiations with them at first.

> One challenge with ? with generics is unification. For example, how to unify MyTypeFunction(int, ?) and MyGenericRecord(real, 3, ?) when the two are, indeed, unifiable?

I'm not certain I am thinking of the same issue but here is a code example along these lines:

record GR {
  type t;
}
proc typefn(type t) type {
  return GR(t);
}

assert(typefn(?) == GR(?)); // does this work?
var x: typefn(?) = new GR(int); // how about this?
var y: GR(?) = new (typefn(?))(int); // or this?

If typefn(?) represents a curried function (where it's just waiting for an argument) then we can't run it to generate GR(?). Similarly, if GR(?) represents a curried function, it would be a function rather than a generic type.

The way I think about this is, GR(?) or MyGenericRecord(real, 3, ?) are not really curried functions, but they are similar to curried functions. Here ? means "the generic any-type/unknown-type" and it can be provided to type / param arguments. But ? is a type, so functions called with it can be resolved. As a result typefn(?) would resolve to the result type GR(?).
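As a point of comparison, here is a small sketch of the parts of this that I believe already work in today's Chapel, where GR(?) acts as a generic type rather than a function. The `describe` procedure is a made-up helper, and typefn(?) itself is deliberately not used since its behavior is the open question:

```chapel
record GR {
  type t;
}
proc typefn(type t) type {
  return GR(t);
}

// GR(?) as a formal type: accepts any instantiation of GR
proc describe(arg: GR(?)) {
  return arg.type: string;
}

// GR(?) as a declared type: the initializer supplies the instantiation
var x: GR(?) = new GR(int);

assert(x.type == GR(int));
assert(typefn(int) == GR(int));  // fully specified call resolves normally
writeln(describe(x));
```

Under the interpretation above, typefn(?) would resolve the same way, with t bound to the any-type, and yield GR(?).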

chapel-lang / chapel

Support the ability for a user to define a type constructor #21992
