Run-time type checks of protected extension types.

dart-lang / language

Design of the Dart language

Other

2.65k stars 202 forks source link

Run-time type checks of protected extension types. #1466

Open lrhn opened 3 years ago

lrhn commented 3 years ago

Taking this from the main discussion.

The current proposal disallows is E and as E where E is a protected extension type. However, there are places in the language semantics where implicit is/as checks are performed:

covariant by generics arguments
covariant by declaration arguments
downcast from dynamic.

I believe downcast from dynamic is already disallowed, E e = (o as dynamic); does not introduce an implicit downcast, it's just an error.

The other two are harder to avoid. Example:

protected extension type Ext<T> on T {
  Ext(T? value) => value!;
  void foo() {
    assert(this != null);
  }
}
main() {
  List<Ext<int?>> l1 = [];
  List<Ext<int?>?> l2 = l1;  // up-cast, valid.

  // The call is probably valid.
  // The run-time type is either completely impossible (which means `add` on any `List<ExtType>` doesn't work)
  // or it can at most check against the `on` type of `Ext<int?>`, which *is nullable*, even though the constructor
  // doesn't allow `null` to be accepted.
  l2.add(null);

  Ext<int?> bad = l1[0];  // Now has a `null` as `Ext` even though the constructor disallows `null`.
}

So, what should the run-time behavior of List<Ext>.add(o) be? Should it do an is test like other

Or should we just make extension types completely invariant when used as type arguments? Then a List<Ext> is-not a List<Object?> or even List<Ext?>, even though Ext is-an Object? and an Ext?. (Does that make sense at all, to make the variance depend on the type argument, not the type constructor?) If we do so, then a List<Ext> is always an exact type (at the argument type at least), and we can skip the covariant type checks in add, etc.

eernstg commented 3 years ago

Yes, it was not specified in the proposal, but I agree that an implicit downcast from dynamic to a protected extension type should be an error.

For the implicit type checks, I think we should add a rule: If a type variable X has the value E which is a protected extension type, then every type test will yield false, and every type check will fail (be it an explicit type check, as in o as E, or an implicit check, e.g., for an actual argument where the parameter type is covariant-by-class or covariant-by-declaration).

Note that we'd need some minimal core of use-site variance in order to be able to create such things as a List<E>: We should be able to allow literals, List<E> xs = [E(42), E.named(true, 'abc'), myE];, where myE has type E, but it gets difficult to do xs.add(E(42)), because we'd need to prove that the type argument of xs isn't Never in order to allow that invocation to omit the type check on the argument.

eernstg commented 3 years ago

See also https://github.com/dart-lang/language/blob/6a1f57b1684aed351d6ed8f6a637def29b2bb40e/working/1426-extension-types/feature-specification.md#casting-to-a-protected-extension-type, proposing that we use the getter bool get verifyThis to verify that any given instance satisfies all the constraints associated with the given extension type:

class IntBox { int i; IntBox(this.i); }

protected extension type EvenIntBox on IntBox {
  factory EvenIntBox(IntBox intBox) => intBox.i.verifyThis ? intBox : throw 'Error';
  bool get verifyThis => this.i.isEven;
  void step() => this.i += 2;
}

void main() {
  (IntBox(4) as EvenIntBox).step(); // Implicitly invokes `verifyThis`.
}

eernstg commented 3 years ago

why do you need a "constructor" EventIntBox?

True, with a verifyThis (or whatever the name would be) it is indeed possible to use verifyThis in an extension constructor. However, an extension constructor could also do other things, e.g., it could build a bigger tree from two smaller trees, and if those two smaller trees are known to satisfy some constraints than the bigger tree might be verified by a couple of local checks. If that same constructor would rely on verifyThis on the newly built tree then it might require a much larger amount of work.

So we probably don't want to make constructors use verifyThis in an implicit and unconditional manner.

We might still want to make it really convenient to do it, anyway:

class IntBox { int i; IntBox(this.i); }

protected extension type EvenIntBox on IntBox {
  bool get verifyThis => this.i.isEven;
  factory EvenIntBox(IntBox intBox) {
    intBox as EvenIntBox; // Implicitly invokes `verifyThis`.
    return intBox;
  }
  void next() => i += 2;
}

We could allow the return statement to have an expression of type EvenIntBox (as specified currently, an extension constructor must return an expression whose type is the on-type, but we could allow the extension type as well):

protected extension type EvenIntBox {
  factory EvenIntBox(IntBox intBox) => intBox as EvenIntBox;
  ...
}

leafpetersen commented 3 years ago

For the implicit type checks, I think we should add a rule: If a type variable X has the value E which is a protected extension type, then every type test will yield false, and every type check will fail (be it an explicit type check, as in o as E, or an implicit check, e.g., for an actual argument where the parameter type is covariant-by-class or covariant-by-declaration).

This doesn't work. This implies that List<E> l = []; l.add(E(o)); must fail.

I'm highly skeptical that you can make this work out. My baseline assumption is that either we box, or we accept that subsumption into object breaks abstraction. Abstraction in languages with abstract types relies on parametric polymorphism, and Dart's polymorphism is not parametric.

leafpetersen commented 3 years ago

Or should we just make extension types completely invariant when used as type arguments? Then a List<Ext> is-not a List<Object?> or even List<Ext?>, even though Ext is-an Object? and an Ext?. (Does that make sense at all, to make the variance depend on the type argument, not the type constructor?)

@lrhn Yes, I've thought about trying to move in this direction, but I don't know how to make it work out. And it still doesn't solve the issue, unless you feed this into the generic system. Example:

T castFrom<T, S>(S x) => x as T;
extension type E on int {
}
void main() {
  int x = castFrom<int, E>(E(4));
}

I don't see any feasible way to enforce abstraction for these in the Dart type system short of making them not be a subtype of Object (and hence entirely outside the existing generic system).

eernstg commented 3 years ago

@tatumizer wrote (about constructors used to create large structures with O(1) work):

Tree? What tree?

For instance, TinyJson models a kind of trees where each node can have an arbitrary number of children. If you have a constructor accepting two arguments of type TinyJson then you can create a new TinyJson in time O(1), no matter how big the trees are:

protected extension type TinyJson {
  factory TinyJson.mergeTwo(TinyJson left, TinyJson right) => <dynamic>[left, right];
  ...
}

If you insist on always calling a general verifyThis on every value which is about to have the type TinyJson then you'd have to traverse every node in <dynamic>[left, right], which would be O(size(left) + size(right)).

So we don't want to make it a property of the language that we always call verifyThis, we want to allow constructors like TinyJson.mergeTwo. Of course, the developer who writes such a constructor needs to get it right. For instance, we could break the invariant by returning <dynamic>[left, 'right']: This would create a tree containing a String, and that's a violation of the TinyJson invariant (that it is a nested list of numbers). But it's a given that the invariants maintained by this kind of construct are beyond the type system of Dart, and it will always be the case that preservation of such invariants is based on careful manual reasoning.

we normally think of constructor as a way to initialize a new instance

True, extension constructors are about introducing a new typing for an object which may or may not be new. However, they are (intentionally) marked as factory constructors, and they already behave in this manner: A factory constructor has always been able to return an object which isn't new.

I think this is not just tolerable but actually useful: There is no actual wrapper object (unless we box it), but we're doing a number of things using purely compile-time mechanisms to make it behave as if there were a wrapper object. In particular, we're replacing the set of members of the on-type by the set of members of the extension type, and that's exactly what we'd expect if we had used an actual wrapper object (and exactly what we'll get if we do box it).

With that understanding as the starting point, we still need to remember that there is no wrapper object and the mechanism is purely static, such that we understand that the illusion breaks down if we drop that static type (say, by upcasting to Object?, or by invoking members dynamically).

eernstg commented 3 years ago

@leafpetersen wrote (about casting to a protected extension type):

This doesn't work. This implies that List l = []; l.add(E(o)); must fail.

True. I addressed that in a couple of ways:

I mentioned here that we could use use-site variance to handle this:

protected extension type E ... {...}

void main() {
  var o = ...;
  List<exactly E> l = [];
  l.add(E(o)); // Will call a version of `add` that does not check the argument type dynamically.
}

Also, I mentioned here that I added a discussion section to the proposed feature specification about how we could support dynamic casts in a way where developers are still able to maintain a given computational discipline (that is, checking invariants that the type system can't understand).

protected extension type EvenIntBox {
  bool get verifyThis => this.i.isEven;
  factory EvenIntBox(IntBox intBox) => intBox as EvenIntBox; // Implicitly invokes `verifyThis`.
}

void main() {
  var o = ...;
  List<Object?> l = <EvenIntBox>[]; // Could be `List<EvenIntBox>`, but this works as well.
  l.add(o); // Implicitly invokes `verifyThis`.
}

My baseline assumption is that either we box, or we accept that subsumption into object breaks abstraction

We can of course choose to accept that. This is quite similar to what we'll get in the proposal if we use a non-protected extension type with constructors: Explicit constructor invocations can be used to enforce the invariant, but dynamic casts will only verify the on-type, no implicit invariant checks are performed. We could then define protected to mean one simple thing: the on-type is not assignable to the extension type.

The benefit offered by a mechanism like verifyThis over the approach where we box every usage of a given extension type E is performance: If we are working on many objects with static type E (especially object structures like trees or graphs where boxing must be done at each step of each traversal), and we are rarely casting an expression to the extension type, then we save the space and time used to allocate the wrapper objects, and we don't spend much time verifying the invariant. (And that invariant might well be guarded by bool.fromEnvironment("DEBUG"), such that we havebool get verifyThis => true;` in production code.)

eernstg commented 3 years ago

@lrhn then @leafpetersen wrote:

make extension types completely invariant

I don't see any feasible way to enforce abstraction for these in the Dart type system

Use-site invariance would do it.

If we don't add use-site invariance then we can still consider the verifyThis approach as a way to integrate these checks into the type system:

T cast<S, T>(S x) => x as T;

protected extension type E on int {
  bool get verifyThis => isEven;
  E(int i) => i as E;
}

void main() {
  E x = cast<int, E>(4); // `x as T` will implicitly invoke `verifyThis` and succeed.
}

You could claim that it is not acceptable to include general computation as a step in a type cast, but the nature of the type cast remains (it occurs at run time, and it may succeed or fail), and I don't think it's unreasonable to say that this is one possible way to integrate the invariant enforcement capability of E into the type system.

lrhn commented 3 years ago

With the verifyThis, I feel like we are mixing two things.

The verifyThis feels like a feature of a subset type. If we could define typedef Flag on String = {"yes", "no", 'maybe"}; as a type Flag which is a subset of String only containing the three values (other languages can do this), then a verifyThis-like functionality would fit right in. (A subset type is a sub-type of the underlying type, with assignment from Flag to String, but not in the other direction.)

If we have subset types, then we don't need to lock the subset behavior into the protected extension type feature, we could just defined a normal extension type on the subset type and have two orthogonal features.

I think it's a red herring to think of protected extension types as the way to do subset types, or in terms of it being a subset type. It's just a protected extension type on some other type which prevents arbitrary assignments in either direction, and because of that, as (and is which promotes) won't really work on the type, which might mean that it's a bad feature to begin with.

lrhn commented 3 years ago

Maybe it was my idea, and then I'm no longer convinced it was a good idea (or rather, that it was a good idea for that thing). I still think constructors are a good idea for creating values of the extension type from other types, but I'm not sure extension types are the right choice for making subset types.

Maybe it's better to make subset types a feature of their own, instead of linking them to extension types and forcing you to write constructors and verifyThis functions to filter the subset. The subset types might still use an indicator function as their definition, or that indicator function can be defined implicitly from a finite subset.

Trying to make two different use-cases both be matched by the same feature can be worse than having two different features, and then only matching them up when it makes sense, not every time. If the protected extension type doesn't require a subset, assigning from the on type is fine, and rejecting that for all protected extension types is unnecessary.

Not all subsets need new methods, sometimes a string is just a string. Requiring you to introduce a protected extension type for that, with show String and constructors and inability to cast back to String is not necessarily what you want.

lrhn commented 3 years ago

(I actually wanted delegation separate from structs, and and then to combine it with structs to get something close to protected extension types).

I was thinking of something like this for subset types, a backing type and a boolean function on that. It still makes it impossible to do constant is checks or as casts for the type, but allows dynamic uses. Maybe with an option to provide a finite set of values instead.

typedef nat = int if (this > 0);
typedef flags = String in {"yes", "no", "maybe"};

We can canonicalize the flag-like constant sets of values, so they actually use subset for subtyping (so typedef answer = String in {"no", "yes"}; defines a subtype of flags). The computed ones are always distinct types.

lrhn commented 3 years ago

I think the run-time cost of doing Object o = ...; if (o is String) ... could be at least as great as int i = ...; if (i is nat) .... The former does a type check, which isn't cheap in the general case (and worse for generic types). The latter should be inlined to if (i > 0) .... (Obviously, Object o = ...; if (o is nat) ... would combine the costs of is int and > 0).

It's never going to be zero-cost to do a type check, but for subset types, you can decide how expensive it's going to be by writing good test.

(I don't trust debug-only checks. They're great for debugging things that should never happen, but they provide no actual safety.)

leafpetersen commented 3 years ago

I'm skeptical that use site invariance is a reasonable solution to this. I believe that it means that there is no way to call a generic add function on a list containing extension type objects:

extension type E on int {}
void add<T>(List<T> l, T x) => l.add(x);
void main() {
  List<exactly E> l = [E(3)];
   add(l, E(4)); // Fails at runtime

You have to rewrite all of your generic code that you want to work with extensions to use exactness. Am I missing something?

I'm also skeptical that gating this feature on use site invariance is a good idea, but maybe.

leafpetersen commented 3 years ago

If we don't add use-site invariance then we can still consider the verifyThis approach as a way to integrate these checks into the type system:

I am intensely skeptical about the idea of adding a virtual method call to every generic type test. We need to be making Dart faster, not slower.

lrhn commented 3 years ago

@leafpetersen I can see the problem with the virtual call.

I think it was intended as a non-virtual call here. It only applies to extension types, so it's effectively an extension method invocation, which should be possible to inline where applicable. However, that means that it won't work with generics. If you can do List<nat> and expect the is E checks to call the extension's is operator, then it does become, essentially, virtual. We would probably only get extra overhead on checks against type variables, for all checks, we know ahead of time whether the type is a protected extension type/subtype or not. I'm assuming is E is already potentially more expensive than is int because we can't inline the knowledge about the type, but adding more will be an overhead, and it will be on commonly used features like List.add.

mraleph commented 3 years ago

I am intensely skeptical about the idea of adding a virtual method call to every generic type test. We need to be making Dart faster, not slower.

FWIW at least on the VM an expression x as T is in fact a virtual method call T.:type_testing_stub(x) and I think the same applies to dart2js since it migrated to new type representation.

So if you want you can add custom code into type checking logic for some types (e.g. add an invocation of a Dart method) without penalising all types.

leafpetersen commented 3 years ago

FWIW at least on the VM an expression x as T is in fact a virtual method call T.:type_testing_stub(x) and I think the same applies to dart2js since it migrated to new type representation.

So if you want you can add custom code into type checking logic for some types (e.g. add an invocation of a Dart method) without penalising all types.

I'm skeptical about this. I think you're suggesting to fold the custom code into type_testing_stub to avoid having to call two virtual methods on every generic type? But that in turn means that you can no longer rely on type_testing_stub being well-behaved. If you're allowed to fold in user defined code it could throw an exception, or modify global state, or return different values every time you call it. So you have to disable all code motion optimizations, CSE optimizations, dead value eliminations, etc for checks on generic types.

mraleph commented 3 years ago

So you have to disable all code motion optimizations, CSE optimizations, dead value eliminations, etc for checks on generic types.

Yes, that certainly is a bit of a concern. However I do feel like you could probably still keep a lot of these type checks unaffected via a combination of various optimizations (global flow for type arguments, plus some local static information). That's is not to say that I would be especially thrilled to implement this feature - my position I think aligns with yours is that language features should be trivially always optimisable rather than non-trivially maybe-optimisable.

mraleph commented 3 years ago

@tatumizer It is as you say a question that is hard to really answer in such abstract terms. When we talk about virtual call we should take into account not just the call itself but the effects it has on the surrounding code (disabled optimisations, spilling of live register values, etc). Similarly when we talk about wrapper object allocation, it not the allocation itself that is going to cost a lot (it's 11 instructions on ARM, of which 2 are memory loads and 3 are memory stores), but other factors (are you gonna allocate a lot? how is this wrapper going to flow the program? etc).

I think performance is much easier to discuss in more concrete terms.

eernstg commented 3 years ago

@leafpetersen wrote:

I'm skeptical that use site invariance is a reasonable solution to this. I believe that it means that there is no way to call a generic add function on a list containing extension type objects

If the extension type isn't protected then the run-time representation of the extension type is the on-type, so there won't be any new issues with List.add. So let's consider a protected extension type:

protected extension type E on int {...}
void add<T>(List<T> l, T x) => l.add(x);
void main() {
  List<exactly E> l = [E(3)];
   add(l, E(4)); // Could call `verifyThis`.
}

I am intensely skeptical about the idea of adding a virtual method call to every generic type test. We need to be making Dart faster, not slower.

I don't think anyone has proposed a mechanism doing that, the invocation of verifyThis would only occur when the type test or type cast has a target type which is a protected extension type. Note that the type test/cast does not run verifyThis in the case where a protected extension type occurs as a component of a composite type (for instance, it does not occur when testing o is List<E> or o is void Function(E)).

For any target type which is not a top type, it is statically known that the type test/cast does not have a target which is a protected extension type.

Of course, when the target type is a protected extension type then it is statically known that verifyThis will be executed, and the type test/cast can be made a compile-time error if verifyThis is not defined (thus getting the default bool get verifyThis => false; behavior early).

Only one case remains, but it is a very important case: Type tests/casts where the target type is a type variable whose bound is a top type (explicitly or by default).

@mraleph noted here that this case, at least on the VM, can be confined in the sense that the extra cost is incurred when the target type is actually a protected extension type, not for any other types.

@leafpetersen wrote:

you can no longer rely on type_testing_stub being well-behaved. ... you have to disable all code motion optimizations, CSE optimizations, dead value eliminations

Granted, code cannot be moved up in front of a type test that could invoke a verifyThis getter (we couldn't move code across a type cast anyway, if the code that moves can have side-effects, because the cast could throw). Also, a type test that might invoke verifyThis cannot be eliminated even in the case where the result is unused (this would apply to o is X; which is an expression statement). So it would be really helpful to know more about how expensive these changes are in practice.

I think the effect on cached outcomes of type tests is important, too. @mkustermann told me that there are several different kinds of caching in the VM for this kind of result.

A type test/cast where the target type is statically known to be a protected extension type (like o is E) is never cached, and we can generate code for o is E that doesn't use the cache (and it might inline verifyThis).

For a type test/cast to a type variable whose bound is a top type the VM would execute T.:type_testing_stub(x), and I suspect that the implementation of that method could just omit any attempt to use the cache when T is a protected extension type.

In summary, it doesn't look like it would be costly for Dart in general to support invocation of verifyThis as part of a type test/cast to a protected extension type. For any type test/cast whose target is not a type variable whose bound is a top type, it would be zero. But the crucial part is of course that the cost must be very nearly zero for a cast to a type variable whose bound is a top type. There is no limit to the amount of work that verifyThis can do, so if we perform that kind of type tests/casts on a billion objects or even a single one it could take an arbitrary amount of time. But this is under developer control. Also, I find it likely that any non-trivial verifyThis would use

protected extension ... {
  bool get verifyThis => bool.fromEnvironment("DEBUG")? realVerification : true;
  ...
}

leafpetersen commented 3 years ago

I don't think anyone has proposed a mechanism doing that, the invocation of verifyThis would only occur when the type test or type cast has a target type which is a protected extension type.

This is not statically determinable in general - therefore every type test against a generic type parameter incurs a virtual method call, exactly as I said.

For any target type which is not a top type, it is statically known that the type test/cast does not have a target which is a protected extension type.

No, not for generics, that's my entire point. T cast<T>(Object x) => x as T is the counter-example.

In summary, it doesn't look like it would be costly for Dart in general to support invocation of verifyThis as part of a type test/cast to a protected extension type.

I can only repeat that I am intensely skeptical of this, and nothing in the discussion you summarize has changed my mind about that. This feature would disable a huge category of optimizations for all generic type tests, which are currently exceptionally pervasive in Dart.

eernstg commented 3 years ago

@leafpetersen wrote:

This is not statically determinable in general - therefore every type test against a generic type parameter incurs a virtual method call

That is not quite true: The only situation where a <type> can denote a protected extension type is when it statically denotes a protected extension type, or when it is a type variable whose bound is a protected extension type or a top type. I think a type variable whose bound is a protected extension type is so useless that we wouldn't see many of them (X extends E can have the value E and Never, and nothing else). In all other situations it is guaranteed at compile time that the target type will not be a protected extension type.

However, a type variable whose bound is a top type is so common that it is crucial to ensure that type tests/casts with such a target type is not penalized.

In the VM, @mraleph mentioned that the type test in this case amounts to an invocation of T.:type_testing_stub(x) where T is the run-time representation of the target type and x is the object under scrutiny. In the case where T is a protected extension type E, type_testing_stub would need to perform a regular subtype test against the on-type of E and then it would run a specific statically known implementation of verifyThis (namely the one declared in the body of E). In the case where T is some other type, type_testing_stub would perform a subtype test against T. So we're comparing one virtual method call against one virtual method call, and the extra work done for verifyThis occurs only when the target type is a protected extension type (verifyThis could be inlined or it could be executed by a direct jump, i.e., no additional virtual method invocation). I don't see any penalty at all for types that aren't protected extension types during the actual type test/cast.

For any target type which is not a top type, it is statically known that the type test/cast does not have a target which is a protected extension type.

No, not for generics, that's my entire point. T cast<T>(Object x) => x as T is the counter-example.

Sorry, I wasn't quite precise there, but a few lines further down I said the following, which gives a hint that I meant 'not a top type, and not a type variable whose bound is a top type':

Only one case remains, but it is a very important case: Type tests/casts where the target type is a type variable whose bound is a top type (explicitly or by default).

However, on the VM, the expression x as T is handled by type_testing_stub as described above, so there is no penalty at the type test/cast itself, for any type that isn't a protected extension type.

This feature would disable a huge category of optimizations for all generic type tests, which are currently exceptionally pervasive in Dart.

The most serious issue I've noted is caching: The VM caches the outcome of type tests in a number of ways, and there will probably be a need to take one extra step during a type test/cast where the target type is a protected extension type in order to ensure that the result is not cached (because we can't assume that verifyThis returns the same result the next time).

It would be good to know more about the actual impact of missed optimizations. The support for verifyThis prevents moving code from after a type test to before the type test (we already can't do that with a type cast, so it makes no difference for implicitly generated covariance related type casts).

Also, it prevents removal of expressions of the form o is T when the result of that expression is not used.

Do you really think this amounts to a huge cost?

leafpetersen commented 3 years ago

The most serious issue I've noted is caching:

I don't know what you mean by caching here. My point is that unless global analysis is able to prove that for an expression E of the form e is T T cannot be a protected extension type, the following optimizations must be disabled for the expression E:

Loop invariant code motion - E cannot be moved out of a loop
Dead code elimination/CSE - E cannot be eliminated if it is dominated by another test(s) against the same type
General code motion - E introduces a potentially heap side effecting edge, whereas previously this was only a control flow side effecting edge. This prevents numerous other optimizations, because you can no longer re-order E with respect to any other potentially heap side effecting operation (for example, writes to non-local locations can't be store-load forwarded across a type test).

Do you really think this amounts to a huge cost?

I don't know. I suspect that it is not a huge cost for global compilers. I am confident that there is some cost, even if only in the form of opportunity cost for optimizations that we do not yet do. The key point is that the cost is not pay as you go. I'm fine with adding features which are themselves slow, if they provide corresponding value. I have an extremely high bar for adding features which slow down unrelated parts of the program. These kind of features are death by a thousand cuts. They add up, and you can't avoid them by not using the feature. As soon as anyone anywhere in your import graph uses a protected extension type, everything gets a little slower unless compiler magic is successful, and I've been working on compilers for almost 30 years now and have a healthy respect for the limitations of compiler magic (amazing as it can be!).

And all of the above is specific to the caveat "global compiler". For a modular compiler, there is no global analysis to help here - everything just gets slower (or bigger, if you speculatively duplicate the code in advance). We don't currently rely on modular compilation for production, but it's not at all clear to me that that is always going to be the case.

eernstg commented 3 years ago

@leafpetersen wrote:

I don't know what you mean by caching here

@mkustermann told me that the outcome of a test like o is T is cached rather aggressively (in various different ways, I don't know the details), and then the cache will be used whenever o is T is evaluated again. The obvious cache key would be the run-time type of o and the type T, but @mkustermann mentioned more than one kind of key. When T is a protected extension type with a non-trivial verifyThis, we cannot use this cache.

the following optimizations must be disabled

Indeed. However, I don't see why we would refer to a global analysis in order to conclude that

T cannot be a protected extension type,

The target type T in e is T can only be a protected extension type when T directly denotes a protected extension type and when T is a type variable whose bound is a top type or a protected extension type. So the frequent and hence crucial case is still a type test against a type variable with no bound.

It would be possible in some cases for a global analysis to prove that a given type variable without a bound will never have a value which is a protected extension type, but I would expect that kind of property to be so fragile that it should basically be ignored. Especially for classes like List where any little update could introduce a List<E> where E is a protected extension type, or a List<X> where X is a type variable whose value could be a protected extension type.

(Implicitly induced type casts caused by dynamically checked covariance or covariant parameters would not be relevant, because they are performed at the very beginning of a function body, and it doesn't improve performance to move any code up in front of such casts. Also, we can't move any code with effects across an action that might throw.)

In order to get an impression of the number of locations I looked through the code of sdk/lib and found 20 occurrences (including two from ddc_runtime that may or may not be relevant):

collection/splay_tree.dart:324:        _validKey = isValidKey ?? ((dynamic a) => a is K);
collection/splay_tree.dart:808:        _validKey = isValidKey ?? ((dynamic v) => v is E);
async/future.dart:771:        test: (Object error) => error is E && (test == null || test(error)));
async/future_impl.dart:185:  bool shouldChain(Future<dynamic> value) => value is Future<T> || value is! T;
_internal/js_runtime/lib/internal_patch.dart:17:  return isLegacySubtyping || null is T;
_internal/js_runtime/lib/collection_patch.dart:412:      : _validKey = (validKey != null) ? validKey : ((v) => v is K);
_internal/js_runtime/lib/collection_patch.dart:764:      : _validKey = (validKey != null) ? validKey : ((v) => v is K);
_internal/js_runtime/lib/collection_patch.dart:1148:      : _validKey = (validKey != null) ? validKey : ((x) => x is E);
_internal/js_runtime/lib/collection_patch.dart:1624:      : _validKey = (validKey != null) ? validKey : ((x) => x is E);
_internal/vm/lib/internal_patch.dart:24:bool typeAcceptsNull<T>() => (const <Null>[]) is List<int> || null is T;
_internal/vm/lib/collection_patch.dart:17:  bool test(v) => v is T;
_internal/js_dev_runtime/private/ddc_runtime/types.dart:412:  bool is_T(obj) => obj == null || JS<bool>('!', '#.is(#)', type, obj);
_internal/js_dev_runtime/private/ddc_runtime/types.dart:415:  as_T(obj) => obj == null || JS<bool>('!', '#.is(#)', type, obj)
_internal/js_dev_runtime/private/custom_hash_map.dart:71:    if (key is K) {
_internal/js_dev_runtime/private/custom_hash_map.dart:98:    if (key is K) {
_internal/js_dev_runtime/private/custom_hash_map.dart:160:    if (key is K) {
_internal/js_dev_runtime/patch/internal_patch.dart:15:    !dart.compileTimeFlag('soundNullSafety') || null is T;
_internal/js_dev_runtime/patch/collection_patch.dart:353:    return element is E && JS<bool>('!', '#.has(#)', _map, element)
_internal/js_dev_runtime/patch/collection_patch.dart:454:    if (key is E) {
_internal/js_dev_runtime/patch/collection_patch.dart:468:    if (key is E) {
_internal/js_dev_runtime/patch/collection_patch.dart:506:    if (key is E) {
internal/iterable.dart:868:      if (_source.current is T) return true;

leafpetersen commented 3 years ago

So the frequent and hence crucial case is still a type test against a type variable with no bound.

Yes, that is the case that I'm referring to.

Implicitly induced type casts caused by dynamically checked covariance or covariant parameters would not be relevant, because they are performed at the very beginning of a function body, and it doesn't improve performance to move any code up in front of such casts.

This is not true. Inlining moves the checks into enclosing scopes, in which all of the optimizations that I describe apply.

eernstg commented 3 years ago

@leafpetersen wrote:

Inlining moves the checks into enclosing scopes

Ah, of course! But it is still not correct to move code with effects across a type cast, and I can't see how that situation would be worse if the type cast can invoke verifyThis, except for one rather specific situation. For example:

class C<X> {
  void m(X x) {}
}

var j = 0;

void main() {
  int Function() f(C<Object> c) {
    var y = 1;
    try {
      do {
        c.m(Object()); // `dart2js` does inline this as a mere cast, `t1._as(new P.Object());`.
        y = 2; // Even an assignment to a local variable can be observable after throwing.
      } while (++j <= 10);
    } catch (_) {}
    return () => y;
  }

  g = f(C<Object>()); // The cast doesn't throw.
  print(g()); // '2'
  var g = f(C<int>()); // The cast throws.
  print(g()); // '1', would be '2' if we move `y = 2` across the cast.
}

So we cannot move/hoist expressions that may have effects across a type cast, and this limitation applies for both regular type casts and type casts that may invoke verifyThis.

We could evaluate 2 outside the loop, but that's not hugely helpful. ;-)

So the loss of optimization opportunities associated with type casts basically amounts to hoisting an expression e that is statically known to have no effects (including: can't throw, e.g., because of division-by-zero, or index-out-of-range), but whose evaluation may depend on mutable state (such that invocation of verifyThis could change the result of evaluating e). It's still not obvious to me that this would have a very large impact.

leafpetersen commented 3 years ago

Ah, of course! But it is still not correct to move code with effects across a type cast, and I can't see how that situation would be worse if the type cast can invoke verifyThis,

@eernstg I have to admit I'm finding this discussion fairly frustrating. I described very concretely the optimizations that do apply here, and it is trivial to construct programs right now where the VM does in fact do the optimizations that I am describing, and for which it would be invalid to do so were this proposal accepted. To reiterate:

For an instance check (and this is not an exhaustive list):

An instance check may be freely moved forwards or backwards across any code so long as dataflow deps are respected (including
An instance check may be coalesced via another instance check with the same type (or a subtype) which dominates it
Redundant instance checks along paths may be coalesced onto a dominating path
Redundant instance checks along paths may be coalesced onto a dominated path

For a cast (and this is not an exhaustive list):

A cast may be freely moved forwards or backwards across any side effect free code
A cast may be coalesced with another cast with the same type (or a subtype) which dominates it
- Including the very important cast where the dominated cast is in a loop
Redundant casts along paths may be coalesced onto a dominating path
Redundant casts along paths may be coalesced onto a dominated path

None of these need be specialized optimizations targeting casts. They are simply the natural optimizations that fall out when you apply LICM, CSE, tail merging, PRE, etc to operations with statically known effect sets. This proposal moves instance checks from the most optimizable effect category (dataflow) to the worst (all effects), and casts from the second most optimizable effect category (idempotent effects) to the worst (all effects).

Per your example, looking at dart2js optimization is never going to be useful here - dart2js does few optimizations around casts, because they are elided in prod (cc @rakudrama ). I doubt the VM will optimize your example either (@alexmarkov can correct me if I'm wrong), but in principle this is a fairly easy LICM opportunity if the IR has the right level of granularity. Specifically, your inner loop:

      do {
        c.m(Object());
        y = 2;
      } while (++j <= 10);

can be lowered + inlined to:

      do {
        if (!isSubtype(Object, T) throw CastError;  
        y = 2;
      } while (++j <= 10);

Note that both Object and T are loop invariant, so this is equivalent to the following:

      if (!isSubtype(Object, T) throw CastError
      do {
        y = 2;
      } while (++j <= 10);

I don't know if our native compilers represent this at the level of granularity necessary to separate out the type test from the allocation, but they could, and might in the future. This optimization would be invalid under this proposal.

Here is another simpler example:

class Pair<T> {
  String toString() => "<$first, $second>";
  T first;
  T second;
  Pair(this.first, this.second);
}

void setTo(Pair<Object?> pair, Object? fill) {
  pair.first = fill;
  pair.second = fill;
  return;
}

Suitably wrapped to make sure that this code is in fact runtime polymorphic and is not inlined, the VM nonetheless successfully eliminates the second dominated implicit cast. This would not be possible under this proposal (again, except by relying on global analysis to ensure that this feature was not used).