dart-lang / language

Design of the Dart language

Function internal parameter type override. #1311

Open lrhn opened 4 years ago

lrhn commented 4 years ago

In short, a parameter can have an added as type, which changes the type of the local variable introduced by the parameter inside the function, but doesn't change the declared type of the parameter in the function's type. Example:

iterableOfObject.forEach((x as String) => ...);
void someFunction({int optionalParameter as int? = 0}) => ...;

The as type goes after the identifier, or after the end ) for function-style parameters, and before default values for optional parameters.

The as type type should be related to the declared (or inferred) type, either a subtype or a supertype.

If the as type type is a subtype of the declared type, it adds a run-time down-cast of the argument to the as type type on function entry (in declaration order if there is more than one as for the same function), just as for covariant functions. The parameter's local variable's declared type is the as type type inside the body of the function.

This can be used as a solution to #477. Example:

var map = Map<int, String>.fromIterable(iterableOfInts, key: (x as int) => x, value: (x) => "$x");

(Because we don't have generic constructors yet, the type of the key function must be K Function(Object), even if you know the iterable's element type).
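For comparison, here is a minimal sketch of the same call in current Dart, where the cast that `(x as int)` would insert on function entry has to be written by hand. The helper name `buildMap` is hypothetical and only exists to make the example self-contained.

```dart
// Sketch in current Dart: the down-cast that `(x as int)` would add on
// function entry is written explicitly inside the callback.
Map<int, String> buildMap(Iterable<int> iterableOfInts) =>
    Map<int, String>.fromIterable(
      iterableOfInts,
      // `x` is `dynamic` here, because `fromIterable` cannot know the
      // element type of the iterable.
      key: (x) => x as int,
      value: (x) => '$x',
    );

void main() {
  print(buildMap([1, 2, 3])); // {1: 1, 2: 2, 3: 3}
}
```

With the proposed syntax the cast would move into the parameter declaration, so the body of the callback could use `x` as an `int` directly.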

If the as type type is a supertype of the declared type, then there is an implicit up-cast (meaning nothing) and the parameter variable simply has the as type type as its declared type.

In both situations, an optional parameter can have any default value which is a subtype of the as type type, not the declared type. That allows a function to have an optional parameter which is not null, but which can be omitted, even when it's not possible to provide a valid default value for the parameter type. Example:

foo({int x as int? = null}) => ...

won't leak in the externally visible type that null is used internally to represent an absent parameter, and won't allow foo(x: null).

Allowing default values outside of the declared parameter type makes it detectable whether the parameter was passed or not. It's not possible to directly pass the default value, so the presence of the default value implies that the original function invocation omitted the argument for that optional parameter. Being able to distinguish whether an argument was passed or not has historically caused problems for forwarding functions. Consider a function foo({int x, int y}), and a function bar({int x, int y}) which wants to forward its invocation to foo. Currently foo would need to have default values for each parameter, then bar could use the same default values and call foo(x: x, y: y) directly. If foo could choose a default value that cannot be passed, say null, then bar would need to be written as:

bar({int x as int? = null, int y as int? = null}) =>
  x == null 
    ? y == null ? foo() : foo(y: y)
    : y == null ? foo(x: x) : foo(x: x, y: y);

This affects our own generated "noSuchMethod-forwarders", mixin application constructor forwarders, and other forwarding stubs, and it affects user code needing to forward parameters. We might want to consider adding a way to conditionally pass an argument, say => foo(x: if (x != null) x, y: if (y != null) y), but that probably needs more design work if it also has to apply to positional arguments.
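The forwarding problem can be reproduced in current Dart. The sketch below is an approximation under one stated assumption: since `int x as int? = null` is not valid Dart today, `bar` declares its parameters as `int?`, which (unlike the proposal) leaks the nullable type to callers. The default values `1` and `2` in `foo` are made up for illustration.

```dart
// `foo` picks defaults that `bar` does not (and should not need to) know.
String foo({int x = 1, int y = 2}) => 'x=$x y=$y';

// Current-Dart stand-in for
//   bar({int x as int? = null, int y as int? = null})
// Every combination of passed/omitted arguments needs its own call.
String bar({int? x, int? y}) => x == null
    ? (y == null ? foo() : foo(y: y))
    : (y == null ? foo(x: x) : foo(x: x, y: y));

void main() {
  print(bar()); // x=1 y=2
  print(bar(y: 5)); // x=1 y=5
  print(bar(x: 7)); // x=7 y=2
  print(bar(x: 7, y: 5)); // x=7 y=5
}
```

Two optional parameters already need four call sites; each additional parameter doubles that number, which is why a way to conditionally pass an argument comes up.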

eernstg commented 4 years ago

Cool! It is really nice that this allows for better forwarding.

One thing, though: Is T arg as S where T <: S only useful when S is T?? It seems likely that we'd have a situation where the only possible (initial) values of arg are values of type T or the default value, and that leaves the rest of the difference between T and S unused. So it makes sense when that difference is exactly the null object.

lrhn commented 4 years ago

It actually does not allow for better forwarding. :pout:

You could probably also do something with int/num:

int findSomething(List<int> things, [int min as num = double.infinity]) {
  for (var v in things) if (v > min) return v;
  return somethingElse;
}

or any other wider superclass where you can meaningfully add a default from another subclass of the chosen superclass. Maybe just to allow a sentinel value:

void foo<E extends Object?>({E value as Object? = const _Sentinel()}) { 
  if (const _Sentinel() == value) { 
     ... no argument ... 
  } else {
    value as E;
    ... with argument ...
  }
}

Here we are adding a sentinel which works even if E is nullable, and even if there is no way we can create an object which implements E.
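A rough approximation of the sentinel pattern in current Dart is sketched below. The assumption to note is that the parameter itself must be declared `Object?`, so the wider type is visible to callers; the proposal would keep the external type as `E`. The function name `describe` and the returned strings are hypothetical.

```dart
class _Sentinel {
  const _Sentinel();
}

// Current Dart: the parameter has to be declared `Object?`, so callers
// could pass any object. The proposal would expose it as `E` instead.
String describe<E>({Object? value = const _Sentinel()}) {
  // Constant objects are canonicalized, so `identical` detects the default.
  if (identical(value, const _Sentinel())) return 'no argument';
  final typed = value as E; // throws if the caller passed a non-E object
  return 'argument: $typed';
}

void main() {
  print(describe<int?>()); // no argument
  print(describe<int?>(value: null)); // argument: null
  print(describe<int>(value: 3)); // argument: 3
}
```

The second call shows the point of the sentinel: an explicit `null` stays distinguishable from an omitted argument even when `E` is nullable.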

lrhn commented 4 years ago

An argument against isInitialized was also that it made initialization checkable, and therefore less optimizable. Implementations were not free to eagerly initialize variables if they could analyze the program and determine that it would always be initialized to the same (constant) value along the current path (hoisting the assignment, effectively). Arguably, the implementation could still do that if there is no use of isInitialized.

I'd probably be positive towards late{_hasX} final int x; introducing a _hasX getter to check whether the variable is initialized or not. Then optional parameters could also use that foo([late{hasFoo} int foo]) => .... Then you are opting in to the state check at the declaration, and you are in control of the name. The compiler is still free to decide how the state is represented. Would play into late {hasX} int {get x, set _x}; declaration syntax proposal, as a way to introduce x, _x= and hasX members in a single declaration.

I agree that if you're going to use a sentinel value, it would be cleaner to let the system do it for you, without having to introduce a sentinel value yourself.

lrhn commented 4 years ago

I'd probably not make the initialization check trigger initialization. Avoiding that initialization is exactly why I sometimes avoid late variables. (Say, allocating a buffer inside a loop only if it's needed, and then consuming that buffer at the end if it was created).

So having late {hasBuffer} buffer = StringBuffer(initialExpression); would allow a later if (hasBuffer) return buffer.toString(); return original;.
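The buffer-in-a-loop case is typically written today with a nullable variable, where `null` plays the role of `hasBuffer == false`. The sketch below assumes a made-up task (escaping `<` in a string) purely to have something to run; the function name `escape` is hypothetical.

```dart
// Allocate the buffer only when the first character needing escaping is
// found; otherwise return the original string untouched.
String escape(String original) {
  StringBuffer? buffer; // null stands in for "hasBuffer is false"
  for (var i = 0; i < original.length; i++) {
    final ch = original[i];
    if (ch == '<') {
      // First hit: create the buffer, seeded with everything seen so far.
      (buffer ??= StringBuffer(original.substring(0, i))).write('&lt;');
    } else {
      buffer?.write(ch); // only copy once a buffer exists
    }
  }
  return buffer?.toString() ?? original;
}

void main() {
  print(escape('a<b')); // a&lt;b
  print(escape('abc')); // abc
}
```

With `late {hasBuffer} buffer = ...` the null checks would be replaced by reads of `hasBuffer`, and the variable could keep a non-nullable type.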

I think simply omitting eager initialization optimizations in cases where the author seems to care about the initialization is going to be fine. The analyzer can warn if a late variable is not needed to be late.

lrhn commented 4 years ago

No semantic paradoxes needed. The meaning of late buffer = ... is to not initialize buffer until it's first read, same as for existing static and top-level variables, but now also available for local variables.

The hasBuffer check then tells you whether the variable has been given a value, which in this case means it has been read (well, or written, since I didn't make it final). If hasBuffer is false, the variable has no value, which also means that the initializer has not been evaluated.

So, if you declare:

late {hasBuffer} buffer = StringBuffer(something);

then that introduces two variables into the scope: buffer (of type StringBuffer) and hasBuffer (of type bool). The hasBuffer is not assignable, but buffer is not declared final, so it is assignable.

Reading hasBuffer tells you whether buffer has a value. Reading buffer while it doesn't have a value will evaluate the initializer, assign the value to the buffer variable and update hasBuffer. Reading it while it has a value just evaluates to that value. Writing to buffer while it doesn't have a value will bind buffer to that value and update hasBuffer. Writing it while it has a value just updates the value.

All of these operations are orthogonal. They work the same whether the late variable has an initializer or not. Reading an uninitialized late variable with no initializer throws, as if it had an initializer of throw UninitializedError(...), and a final late variable with an initializer won't be assignable. You can do final late {hasValue} int value;. (We can allow a negative test instead of final late {!canSetValue} int value; too).
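The read/write rules described above can be modeled as an ordinary class in current Dart. This is only an illustration of the semantics, not how a compiler would represent the state; the class name `LateCell` and its members are hypothetical, and it throws `StateError` where the text mentions an `UninitializedError`.

```dart
// Models a non-final `late {hasValue} T value = initializer;` variable.
class LateCell<T> {
  LateCell([this._initializer]);

  final T Function()? _initializer;
  T? _value;
  bool _hasValue = false;

  // Checking the state never triggers the initializer.
  bool get hasValue => _hasValue;

  // Reading without a value runs the initializer (or throws if none).
  T get value {
    if (!_hasValue) {
      final init = _initializer;
      if (init == null) throw StateError('No value and no initializer');
      _value = init();
      _hasValue = true;
    }
    return _value as T;
  }

  // Writing always binds the value, whether or not one existed before.
  set value(T newValue) {
    _value = newValue;
    _hasValue = true;
  }
}

void main() {
  var evaluations = 0;
  final buffer = LateCell<StringBuffer>(() {
    evaluations++;
    return StringBuffer('initial');
  });
  print(buffer.hasValue); // false
  print(buffer.value); // initial
  print(buffer.hasValue); // true
  print(evaluations); // 1
}
```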

So, I totally can avoid the word "initialized", I just say "has a value" instead. "Initialization" is just a word we use for the first time something gets a value.

I'm arguing against a function named initialized because it's unclear where and how it's defined (is it a top-level function, exported from dart:core, is it a reserved word, or is it something injected into any scope with a late variable? Is it a function at all? Can it be closurized? Or is it more like a syntactic macro, and then why do we give it a name, and not just do ?x?). By allowing the user to give it a name, you can choose whether it should be part of the API of a class (for late instance variables) and which name to refer to it by. It's clear where the name is introduced.

leafpetersen commented 3 years ago

I remember discussing this general idea on the white board in AAR a long time ago as part of the discussion of covariant. It's a nice generalization. It's unfortunate that we'd end up with covariant essentially subsumed by this.

lrhn commented 3 years ago

I wouldn't let this subsume covariant. This is a local-only change of a variable's type, it's not inherited by subclasses, and it doesn't change the "outer type" to match the inner type statically.

So, with covariant:

class C {
  void foo(covariant num x) {}
  void bar(num x) {}
  void baz(int x) {}
}
class D extends C {
  void foo(int x) {}
  void bar(num x as int) {}
  void baz(int x as num) {}
}

the static types of D.bar and D.baz are not affected by the as, but the static type of D.foo is void Function(int).
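The difference can be observed in current Dart by writing the `as` cast by hand. In the sketch below, `bar` is given a `String` return type only so that there is something to print, and `throwsTypeError` is a hypothetical helper; both are assumptions made for the example.

```dart
class C {
  void foo(covariant num x) {}
  String bar(num x) => 'C.bar($x)';
}

class D extends C {
  @override
  void foo(int x) {} // allowed because C.foo's parameter is covariant

  @override
  String bar(num x) {
    final i = x as int; // hand-written stand-in for `num x as int`
    return 'D.bar($i)';
  }
}

// Reports whether running `body` fails with a TypeError.
bool throwsTypeError(void Function() body) {
  try {
    body();
    return false;
  } on TypeError {
    return true;
  }
}

void main() {
  void Function(int) f = D().foo; // covariant narrows the static type
  String Function(num) g = D().bar; // the internal cast does not
  f(1);
  print(g(2)); // D.bar(2)
  final C c = D();
  print(throwsTypeError(() => c.foo(1.5))); // true
  print(throwsTypeError(() => c.bar(1.5))); // true
}
```

Both methods reject a `double` at run time, but only `foo` advertises the narrower parameter type to callers that hold a `D`.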

If we use a wider type to allow a different default value, then we don't want the caller to pass something of that type.

If we use a narrower type just because we know we're only going to be called with that type, even though a wider type is required, it's because we know something the type system doesn't, so statically being the narrower type would break what we're trying to achieve.

lrhn commented 3 years ago

@tatumizer

What problem are we trying to solve?

Several.

Basically, it tries to distinguish the local variable introduced by a function parameter from the function parameter itself.

lrhn commented 2 years ago

The VoidOr sounds like an Option type. An Option<T> either has a value (of type T) or it doesn't (nominally of type Never).

You (and the runtime system) can tell which one it is, and passing an Option<T> as an argument either passes the value, or it passes no value, which is only valid if the corresponding parameter is optional.

It's just ... we have a way to represent an optional value already, it's nullability. So what if passing null was the same as not passing an argument to an optional parameter. The only thing you wouldn't be able to do is to distinguish an actual null argument from an omitted argument. And that's by design, we only have one level of optionality. Otherwise you'll have to introduce your own nestable Option class, which people have done.

lrhn commented 2 years ago

The tricky part here is that VoidOr<T> occurs where a type can occur (as the type of a function parameter). That either makes it a type (which it probably isn't) or a property of the parameter (more like a modifier, voidy T foo, than part of the type), but I think it won't affect the function type. (Otherwise we'd need to answer whether int Function(int) is a subtype or supertype (or both or neither) of int Function(voidy int).)

So, basically, the parameter is treated like a variable which is potentially assigned, where normal parameters are always considered definitely assigned. That's actually a reasonable approach to optional parameters, you just need some way to use them, otherwise they are completely inaccessible. If we could ask a potentially assigned variable whether it's been assigned, then we could do the same here. (We currently can't, and I don't really want it, because we'd just be introducing more boolean flags that the compiled code has to keep updated, and that the compiler might not be able to optimize away).

The way you allow this potentially assigned variable to be used is in the argument position of an optional parameter. If it occurs there, as a tail value of the expression, an un-assigned parameter will become an absent argument.

So, I see your idea, and I'll raise you with:

Then you can write:

void foo({int foo, int bar}) => bar(foo: if (??foo) foo, bar: if (??bar) bar);

One change is that you can now omit non-trailing positional arguments. (We definitely do not want to move the following arguments up in the argument list, that'd be impossible to type).

Not sure I want this, but it's a coherent approach to detecting unassignedness. (It's introducing an undefined value as a second kind of null value, effectively, but with no way to actually access it or use it as a value). The ?? operator can probably also be used for late variables.

Also

The problem is that as soon as you introduce null as a first class object, it's very difficult to construct a consistent theory where null also means a kind of non-object that stands for something else. (This sounds vague, could be better explained on examples)

I totally get it. When null is a possible value (thing stored in variable), it can also mean no semantic value to the user of the variable. In most cases passing around null works out fine because a null just means "no semantic value", even if it is a normal physical value. For copyWith, the argument has two different meanings, either it's a new stored value or it means "preserve the existing stored value". Since the semantic domain for copyWith is the entire stored value domain, which can include null, we can't use null in the argument to represent the "preserve existing stored value" meaning. Or rather, we can and do, but then we can't provide null as a new stored value, representing a new "no semantic value".

And that's annoying, but very much limited to, well, copyWith or similar functions, where we are not just passing values around, but passing the union of semantic values and signals through the same parameter. Other languages manage because they have overloading, and can therefore detect the absence of a parameter by going through a completely different code path if you omit the parameter. Dart has optional parameters with default values, and cannot do the same. Unless we introduce a way to distinguish whether a parameter was passed or not, which has severe repercussions on the ability to forward parameters. We'd then need optional arguments too, which is what (I think) you are aiming for here.

And I still think a viable approach is to have copyWith({Foo? foo = null, bool clearFoo = false}). It's not perfect.
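A minimal sketch of that copyWith shape in current Dart follows. The class `Settings`, its fields, and the flag name `clearSubtitle` are made up for the example; only the pattern of a nullable parameter plus a separate boolean comes from the comment above.

```dart
class Settings {
  const Settings({required this.title, this.subtitle});

  final String title;
  final String? subtitle;

  // `null` means "keep the existing value"; the separate flag is the only
  // way to reset the nullable field back to null.
  Settings copyWith({
    String? title,
    String? subtitle,
    bool clearSubtitle = false,
  }) =>
      Settings(
        title: title ?? this.title,
        subtitle: clearSubtitle ? null : (subtitle ?? this.subtitle),
      );
}

void main() {
  const original = Settings(title: 'A', subtitle: 'b');
  print(original.copyWith(title: 'B').subtitle); // b
  print(original.copyWith(subtitle: 'c').subtitle); // c
  print(original.copyWith(clearSubtitle: true).subtitle); // null
}
```

The imperfection is visible in the signature: passing `subtitle: null` is accepted but silently means "keep", and every nullable field needs its own flag.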

This proposal, for allowing the parameter to have a wider type internally, includes a way to detect omitted arguments, by widening the type internally, and having a default value separate from any user-provided argument, and it has no way to optionally not pass an argument inside an argument list, so it fails my requirement above. Too bad, because it has other advantages :(

(About nullable-meaning-optional:

You mentioned this idea on multiple occasions but haven't provided any details

Working on it, in my sparse spare time 😁 )

lrhn commented 2 years ago

So VoidOr<T> is effectively a parameter of type T with a corresponding local variable of type T | Undefined, where Undefined holds some otherwise inexpressible value (and it's presumably a compile-time error to use a variable of that type, or of the raw type Undefined, in any way other than in type tests/casts - which can then promote to T or some subtype of T and make the variable usable.)

Since you write x is int at a point where x has type VoidOr<int>, x must have a value. The expression x is int evaluates x for its value before checking. As long as we make darn sure that a value of undefined cannot leak, and cannot be used in any way, then that should be safe.

It's still "another null", just a very narrowly scoped one. It avoids introducing any new language features, other than allowing a default value of Undefined for a variable typed int, and the restrictions on using any type touched by Undefined (which are stricter than the rules on void, but not stricter than I think void should have been.) The rest is handled by normal type checking and promotion (and can therefore be blocked by the same things, like closures capturing the variable).

My idea of using "definitely assigned" instead of a value requires new operators, but not a new value and type (which is possibly not a subtype of Object, which is why it can't be used anywhere. Just like null used to be.) Which sounds simpler depends on whether you are more scared by new syntax or by touching the type system.

lrhn commented 2 years ago

I don't think that is a showstopper as much as it's a "You have to document this behavior now, so users know not to shoot themselves in the foot".

The behavior is fine, users of that function should just know that omitting early arguments will stop computation at that point. Completely well-defined and predictable semantics, it's just one extra edge case that you couldn't hit before. That code would also exit early if you pass an explicit null.

Levi-Lesches commented 2 years ago

This discussion sounds like it's trying to solve the undefined/null conflict, for which there are several issues already opened. One simple solution was the combination of allowing non-constant defaults for parameters, and allowing the keyword default to mean "the default value for this parameter", instead of null which would mean "not passed".

Levi-Lesches commented 2 years ago

I didn't mean to propose a whole solution in a small comment, I was just pointing to the other discussions on null/undefined and that this issue doesn't really seem to be about that (specifically, the top comment mentions null/undefined and shrugs it off by saying "choose a sentinel value" and that other solutions should be fleshed out in another issue).

eernstg commented 2 years ago

@Levi-Lesches wrote:

allowing the keyword default to mean "the default value for this parameter",

We might need something slightly more powerful than a single default keyword, but the ability to refer to the default value of a given parameter in a maintainable manner (that is, other than looking up and copying the value itself) is what I always propose when we have these discussions (no matter which year ;-).

I'm not so happy about the "null means not passed" idea, because the type of a parameter could admit the value null as a properly passed value, and it could still have a different default value. So we'd need funny exceptions about the case where the parameter type is nullable, or the case where it is nullable and there is an explicit default value, or the case where it is nullable and there is an explicit default value different from null, or all of the above.