dart-lang / language

Design of the Dart language
Other
2.66k stars 205 forks source link

NNBD support for generic functions where the result nullability depends on if an optional parameter is passed or not #836

Open rrousselGit opened 4 years ago

rrousselGit commented 4 years ago

A common pattern in Dart is to have an "orElse" callback on functions that needs to have a fallback behavior.

A concrete example is the Iterable.firstWhere:

final value = [1, 2, 3].firstWhere((i) => <condition>, orElse: () => <fallback>)

Now, this is not specific to Iterable.firstWhere and is very common to Dart in general.\ NNBD doesn't cause any issue with Iterable.firstWhere specifically, as by default, it will throw if it needs a fallback but no orElse was provided.

The problem comes with a variant of that pattern, where instead of throwing, the default behavior of their orElse would be: () => null.

A typical implementation would be:

T doSomething<T>({T orElse()}) {
  // some logic

  if (orElse != null) {
    return orElse();
  } else {
    return null;
  }
}

The issue with that variant is that it is impossible to migrate to NNBD without some inconvenient change.

This pattern gets stuck between two un-ideal solutions:

So in the end, with NNBD enabled we either have pointless boilerplate or an incorrectly inferred type.

eernstg commented 4 years ago

One way to think about this is that we'd want two function types of the form X Function<X>({required X orElse()}) and X? Function<X>(), where the first one ensures that there is a way to get an X, and the second one does not provide this feature and consequently may return null.

In the first type we probably don't want to specify that the type argument is non-nullable, that is, X extends Object. If the type argument is nullable then the second function type could be used, but if we really want to pass an orElse argument and use a nullable type then it seems a bit too strict to make it an error.

But with two distinct types that we can't immediately merge to something that has all the desired properties, maybe they are just two different functions/methods?

One way to bridge the gap would be to use a case function (of course, we don't have case functions, yet):

X? doSomething<X>({X orElse()}) {
  X? case() => ...
  X case({required X orElse()}) => ...
}

This would allow call sites with sufficient type information to choose either case and get the desired return type (no nulls when orElse is provided and X is non-nullable), and it would preserve the slightly less precise typing `X? Function({X orElse()}) in cases where the call site does not provide sufficient type information to justify a static choice.

rrousselGit commented 4 years ago

The suggested syntax looks confusing. It is weird to have a return type defined on doSomething if it is ultimately dictated by the case inside.

But other than that, that's an interesting idea!

eernstg commented 4 years ago

The point is that we want to preserve the notion of having a single function (e.g., for dynamic invocations, or any usage that doesn't justify a choice among multiple entities with the same name), so there is an "overall" function, and it needs to have a return type and a run-time representation.

But then we may know more at particular usage points. When that is true, it is allowed for the compiler to resolve the invocation to the first case that matches and generate a call directly to a helper-function whose body is that case (so each case must also have a representation as an actual function at run time). This also allows us to rely on the specified return type for that case, and it is certainly possible that specific cases can have more specific return types than the overall function.

Of course, this implies that the cases must be part of the type of the function. We may or may not wish to enable this feature in function types of Dart in general. One approach which is less powerful, but maybe more comprehensible (and maybe more performant), is to say that case functions can only dispatch to specific cases when the function itself is statically resolved. One obvious step up would be to also support cases in instance methods, and allow for invocations to call a specific case, and then require all overriding declarations to preserve the set of cases and their ordering (but they could add new cases after the existing ones).

rrousselGit commented 4 years ago

How would that apply to an IDE's type preview or the is operator?

eernstg commented 4 years ago

That depends on the degree to which we include cases in function types. The most comprehensive level of support would allow all functions to be case functions, and all function types to contain a full list of cases. A "normal" function type would then simply be the ones we know today, and that would be a supertype of similar ones where we have added some cases.

rrousselGit commented 4 years ago

make it a breaking change, and instead of throwing by default, return null by default.

That issue is about the functions that already return null by default.

It's about those functions that default to returning null would have to make their return type always nullable (even when it definitely cannot be null) or make the fallback callback required.

rrousselGit commented 4 years ago

I agree.\ It's probably off-topic though. This suggestion should probably be made on https://github.com/dart-lang/sdk.

lrhn commented 4 years ago

One thought I had during the design process was to allow a default value that wasn't valid for all possible type arguments, and then you would have to provide a better value if the default value wasn't valid. So:

static Null kNull() => null;
T first<T>(Iterable<T> values, [T orElse() = kNull]) { ... }

Then a first<int>([1, 2, 3]) would be invalid because there is no valid argument for orElse, and the default is not valid, but first<int?>([1, 2, 3]) would be valid.

That idea doesn't work. The problem is that default values are not part of function types, so if I do:

T Function<T>(Iterable<T>) f = first;

then it's seemingly valid, but it doesn't preserve the safety of the orElse parameter.

Er definitely do not want the default value to be part of the function type, because that would make a lot of otherwise compatible function types be different.

(Doesn't even work if we use the default value's type in inference when not passing it, for the same reasons).

So, any proposal here, for a very real problem we have had in the platform libraries too, has to solve the problem of the default value not being visible in the function type.

Idea

Could we let the type of the default value be part of the function type? Since we only care about the type, the exact value might not be necessary.

Then T Function<T>(Iterable<T>, [T Function() = Null Function()] would be the function type of first.

Let's define this function type as having a potentially unsafe default value type because it has a default value type which is not a subtype of the parmeter type.

If neither the parameter type nor the default value type referred to a type variable, then there'd be nothing potential about it. The it's just a compile-time error. The default value would never be a valid argument.

If the parameter type or the default value type contains a type argument, either from the function or from a surrounding function or class, and the default value type is not a subtype of the parameter type, then we include the default value's type in the function parameter's signature.

So type1 Function([type2 = type3]) can be a function type if the potentially unsafe default value type type3 is not definitely a subtype of type2 (then it would definitely be safe), and when it's not definitely not (then it would definitely be unsafe). Such function types can only be invoked without the argument for that parameter if the default value type is a subtype of the parameter type at the invocation (when all the available type arguments have been supplied).

You can specify a potentially unsafe default value type that is completely unrelated to the parameter type, as long as at least one of them contains a type variable. We will not try to solve for whether it's possible to bind type variables such that the default value becomes valid.

We'd have to define subtype relations on such types. As usual, we'll want "soundly substitutable" as the underlying principle, so a subtype of the above type would be one which allows the same arguments.

Subtypes of T Function<T>(Iterable<T>, [T Function() = Null Function()] would include:

Potentially unsafe default value types in function types are covariant, and they are supertypes of corresponding safe function types.

Since all current functions are safe (you cannot declare a potentially unsafe default value in the current type system), introducing these extra function types should be non-breaking.

That is, until we start using them in the platform libraries. Even then, the constraints we'd introduce with NNBD won't break any legacy code because all legacy types are nullable.

I have not considered whether this introduces something bad into the type system (like, say, undecidability).

Summary:

You can declare function types with potentially unsafe default value types.

Type1 Function([Type2 x = Type3])

If Type3 is a subtype of Type2, this is just Type1 Function([Type 2 x]), and you'll get a warning or error if you write it anyway.

A function can be declared with a default value which is potentially not a subtype of the parameter type. Maybe that needs extra syntax

Null _kNull();
T first<T>(Iterable<T> values, [T orElse() <= _kNull]) {
  var it = values.iterator;
  if (it.moveNext) return it.current;
  return orElse();
}

The type of first is T Function<T>(Iterable<T>, [T Function() = Null Function()]). Since Null Function() is not a subtype of T Function() in NNBD-world, the former allowed as a potentially unsafe default value type for the latter.

Calling first, or anything with the same type, without a second argument is only allowed if the default value type is a subtype of the actual parameter type of the invocation.

Omitting the second argument means that the default value type can be used in inference:

eernstg commented 4 years ago

Cool idea! ;-)

One thing to think about: When we rely on 'the actual parameter type of the invocation' and that could be determined by the choice of actual type arguments passed to the callee when that is a generic function, inference would generally be able to choose a super-type for some of the type arguments, and thus make the difference between a valid and an unusable default value.

Null _kNull();

T first<T>(Iterable<T> values, [T orElse() <= _kNull]) => ...

void main() {
  var x = first([1, 2]); // Succeeds, passing `int?` to `first` and to `[1, 2]`.
}

The developer who writes this might be happy because "it works", but the ? on the type argument passed to the list literal may come back and bite us later (that won't happen in the concrete example, but it could happen if that list were stored somewhere and used later).

This could serve as a warning about tractability and comprehensibility issues with inference. So we might prefer to ignore the potential default value during inference, and then simply make it an error if the default value is an error with that typing, and no actual argument is provided.

The underlying mechanism is (1) in some context there is an option to specify a default value, (2) a default value is specified, but for some configurations (e.g., for some values of some type variables) that default value is an error, so (3) we just consider the default value to be provided when it's not an error, and omitted when it is an error.

We could use this kind of mechanism in several different contexts, preferably with some syntactic marker (such that we don't just silently accept wrong default values all over the place). For example:

abstract class A<X extends num> {
  X foo() <=
  int foo() => 42; // Default implementation.
}

class B extends A<int> {}

The default implementation of foo would be ignored when it is not a correct override of the abstract one declared by X foo(), and it would be inherited by all classes (such as B) where it is correct. So the class B is OK, even though it's concrete and doesn't say anything about foo, because it will inherit the implementation returning int, because that works for B.

The next step could be to have a list of candidates and taking the first one that works. And so on. And along the way we need to consider when the complexity cost outweighs the benefits, of course. ;-)

lrhn commented 4 years ago

The "actual parameter type" would still have to be the static type of the actual parameter of the function type being called, just after any known type variables have been instantiated.

For the nullable type coming back to bite us, I think null safety will actually make that issue go away. When the type of x is int?, the user will very quickly see that in the following code. It's not an accident, it's really the desired behavior that it becomes nullable when you do not provide a default.

It's a clever idea to use the same approach with interface methods or (please) interface default methods. If you are compatible, you get the default implementation, and if not, you don't.

This s also something we have always wanted for List.sort where the default value only works for comparable type parameters. If we had non-constant default values (still compatible with this idea because we only use the type), we could sort as void sort([int compare(E a, E b) = (Comparable<E> a, E b) => a.compareTo(b)) { ... }. Then you could only omit the compare parameter when the element type is comparable to itself. Nice!

lrhn commented 4 years ago

@tatumizer It will actually work because it will infer that R is int?. Then covariant generics allows you to pass a List<int> where a List<int?> is expected.

The inference for R first<R>(Iterable<R>, {R orElse()}) will have on context type, so it infers each parameter independently. That's a List<int> and a Null Function(), and the solution for R is then int?, which becomes the return type too.

Indeed, first([1, 2], orElse: ()=>"not an int") would succeed and return an Object.

rrousselGit commented 4 years ago

About the "default implementation that uses a different type", this would only partially support my use-case.

On freezed, I generate a when/map method for pattern matching.

Such that for:

@freezed
abstract class Example with _$Example {
  const factory Example.person(String name, int age) = Person;
  const factory Example.city(String name, int population) = City;
  const factory Example.country(String name, int population) = Country;

}

We have:

Example example;

String name =  example.map<String>(
  person: (Person person) => person.name,
  city: (City city) => city.name,
  country: (Country country) => country.name,
  orElse: () => '',
);

Now, in the ideal world, with NNBD map should allow three cases:

Right now, if I want to achieve the same thing with NNBD I need to define three different functions when they are effectively three times the same thing.

This leads to confusing naming (map vs mapOrElse vs mapOrNull)

eernstg commented 4 years ago

@tatumizer wrote:

how does the compiler know that orElse parameter possesses this magic property that its return type gets factored in in the return type of the overall function?

During inference of the actual type arguments for an invocation of a generic function, the context type may provide some constraints on the type variables based on the return type, and the static types of the actual arguments may provide some constraints based on the declared types of the formal parameters. So there is no special magic about taking the types of value arguments into account when inferring type arguments, which may in turn influence the return type of the invocation. However, the context type is given a high priority during inference, which may give the impression that the types of the value arguments are ignored during inference:

main() async {
  Map<X, Y> f<X, Y>(X x, Y y) => {};
  var map = f(1, true);
  print(map.runtimeType); // 'JsLinkedHashMap<int, bool>'.
  Map<num, double> map2 = f(2, 3);
  print(map2.runtimeType); // 'JsLinkedHashMap<num, double>'.
}

This shows that the types of the actual arguments fully determine the return type in the first call of f, but the context type actually makes 3 evaluate to a double in the second call. So information may go up or down, and the context type may "shadow" other things, but it's not a new thing per se to take the types of actual arguments into account during inference of type arguments to a generic function invocation.

You could say that it is an anomaly to infer a union type (like int?), and we actually don't infer "the other union type" (FutureOr), but we are able to get T? from inference today, with null safety:

void main() {
  f(bool b) {
    if (b) return null;
    return 42;
  }
  String s = f; // Error message reveals static type of `f` is `int? Function(bool)`.
}

So I don't think there's a need for new magic in order to handle the inference in the example.

eernstg commented 4 years ago

There is a certain amount of guesstimation involved in this discussion because the declaration of first that you show is a compile-time error today, assuming null safety.

If we allow it and give it the meaning that the default value is considered to exist unless the inferred type makes it an error then you are right that we would need to have some extra support during inference in order to take kNull into account during inference. If we don't do that then we will indeed just choose the type argument int in some cases and determine that there is no default value, so it is an error to omit the orElse argument. In other cases we would have int? from the context type, and it would "just work".

So the question is not so much whether

the return type of orElse gets mixed into the return type of the method

because that's a standard property of type inference: When we have chosen a value for T/R/X then we may also have chosen a value for part of the return type because that type variable occurs in the return type. The question is really whether we'd require type inference to take this situation into account: (1) regular inference caused an error because there is no default value, (2) use the "potential" default value to obtain further constraints on type variables, run inference again (and in that case, in the example, we could potentially choose int? and allow the call).

It's certainly possible that we should rather avoid putting this kind of extra smarts into type inference, because that might eliminate exactly those cases that were described as questionable in this discussion.

tomgilder commented 3 years ago

For anyone else finding this issue, nullable methods have been added, but you have to import package:collection:

import 'package:collection/collection.dart';

You can then use firstWhereOrNull:

collection.firstWhereOrNull((e) => e.isThing)

This was the first issue I hit with NNBD. Would it be worth adding something to the documentation for e.g. firstWhere about this?

Sunbreak commented 2 years ago

This was the first issue I hit with NNBD. Would it be worth adding something to the documentation for e.g. firstWhere about this?

Any update? Or if permitted I'd like document on it

munificent commented 2 years ago

No update and no plans on this as far as I know.

danielleiszen commented 2 years ago

I understand this issue is broader than just singleWhere or firstWhere, which are only edge cases for a conceptual problem. However I would like to mention that other languages like C# solve this by introducing redundant functionality for the two use cases, particularly Single and SingleOrDefault. Where the former requires a single match and the latter returns null in case of no match.

I do think this addition would be a solution for most of us here as well. Thank you.