Open mkustermann opened 9 months ago
That's a very neat optimization that I believe is quite possible, so my answer to both questions would be "yes". I also think it shouldn't take much time, and I'd be happy to implement it.
A related question is how useful such a bit on AsExpression
and IsExpression
would be to other backends.
@johnniwinther WDYT?
That seems like a valuable flag to have on AsExpression
and IsExpression
.
@mkustermann Would it be useful for your use-case to also mark checks against record types as "only structural" (for instance is (Object?, Object?)
?
I also think it shouldn't take much time, and I'd be happy to implement it.
See simple version in cl/353900 (as part of a bugfix in dart2wasm compiler)
@mkustermann Would it be useful for your use-case to also mark checks against record types as "only structural" (for instance is (Object?, Object?)?
That may be helpful yes, as we can have fast ways to check whether an object has a certain shape.
A related question is how useful such a bit on AsExpression and IsExpression would be to other backends.
Definitely useful for wasm, vm. Very likely also for dart2js /cc @rakudrama
@mkustermann I'm looking into this optimization now, and I have multiple design alternatives for the flags on IsExpression
and AsExpression
. Basically, we have two boolean parameters that we can variate the number and the meaning of the flags with: (1) whether we want type-kind specific flags or not, and (2) whether we want the distinction between operand-dependent and operand-independent shape checks or not.
The type-kind specific flags are more or less self explanatory: they indicate what kind of shape that should be checked. We have the following possibilities for non-simple and not-erased types in Kernel: interface types, record types, function types, and FutureOr types.
The flags that make distinction between the operand-dependent and operand-independent checks signal whether the optimizing flag computation took the static type of the operand expression into account. For example, in the check x is List<num>
we may conclude that checking for the runtime type of the value stored in x
being any List
is sufficient because we know that the static type of the expression x
is Iterable<int>
. In that case we say that the shape-only check is sufficient, and it is operand-dependent. In contrast with that, in the check x is List<dynamic>
we conclude that checking for the runtime type of the value stored in x
for being any List
is sufficient because List<dynamic>
is a supertype of any List
. In that case we say that the shape-only check is sufficient, and it is operand-independent.
Please let me know which of the alternatives would work the best for you.
Variant 0: We do want the type-specific flags, and we do want to distinguish between the operand-dependent and operand-independent shape checks
We add the following flags to IsExpression
and AsExpression
AST nodes in Kernel.
isOperandIndependentInterfaceTypeShapeCheckSufficient
isOperandIndependentRecordTypeShapeCheckSufficient
isOperandIndependentFunctionTypeShapeCheckSufficient
isOperandIndependentFutureOrTypeShapeCheckSufficient
isOperandDependentInterfaceTypeShapeCheckSufficient
isOperandDependentRecordTypeShapeCheckSufficient
isOperandDependentFunctionTypeShapeCheckSufficient
isOperandDependentFutureOrTypeShapeCheckSufficient
The following is an example of those flags being set.
class A<X> {}
class B<Y extends num> extends A<Y> {}
test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
a as B<int>; // isOperandDependentInterfaceTypeShapeCheckSufficient = true
a as B<num>; // isOperandIndependentInterfaceTypeShapeCheckSufficient = true
r as (num,); // isOperandDependentRecordTypeShapeCheckSufficient = true
r as (Object?,); // isOperandIndependentRecordTypeShapeCheckSufficient = true
f as Function(int); // isOperandDependentFunctionTypeShapeCheckSufficient = true
f as Function(Never); // isOperandIndependentFunctionTypeShapeCheckSufficient = true
fo as FutureOr<num>; // isOperandDependentFutureOrTypeShapeCheckSufficient = true
fo as FutureOr<dynamic>; // isOperandIndependentFutureOrTypeShapeCheckSufficient = true
}
Variant 1: We do want the type-specific flags, but we don’t want to distinguish between the operand-dependent and operand-independent shape checks
We add the following flags to IsExpression
and AsExpression
AST nodes in Kernel.
isInterfaceTypeShapeCheckSufficient
isRecordTypeShapeCheckSufficient
isFunctionTypeShapeCheckSufficient
isFutureOrTypeShapeCheckSufficient
The following is an example of those flags being set.
class A<X> {}
class B<Y extends num> extends A<Y> {}
test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
a as B<int>; // isInterfaceTypeShapeCheckSufficient = true
a as B<num>; // isInterfaceTypeShapeCheckSufficient = true
r as (num,); // isRecordTypeShapeCheckSufficient = true
r as (Object?,); // isRecordTypeShapeCheckSufficient = true
f as Function(int); // isFunctionTypeShapeCheckSufficient = true
f as Function(Never); // isFunctionTypeShapeCheckSufficient = true
fo as FutureOr<num>; // isFutureOrTypeShapeCheckSufficient = true
fo as FutureOr<dynamic>; // isFutureOrTypeShapeCheckSufficient = true
}
Variant 2: We don’t want the type-specific flags, but we do want to distinguish between the operand-dependent and operand-independent shape checks
We add the following flags to IsExpression
and AsExpression
AST nodes in Kernel.
isOperandIndependentShapeCheckSufficient
isOperandDependentShapeCheckSufficient
The following is an example of those flags being set.
class A<X> {}
class B<Y extends num> extends A<Y> {}
test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
a as B<int>; // isOperandDependentShapeCheckSufficient = true
a as B<num>; // isOperandIndependentShapeCheckSufficient = true
r as (num,); // isOperandDependentShapeCheckSufficient = true
r as (Object?,); // isOperandIndependentShapeCheckSufficient = true
f as Function(int); // isOperandDependentShapeCheckSufficient = true
f as Function(Never); // isOperandIndependentShapeCheckSufficient = true
fo as FutureOr<num>; // isOperandDependentShapeCheckSufficient = true
fo as FutureOr<dynamic>; // isOperandIndependentShapeCheckSufficient = true
}
Variant 3: We don’t want the type-specific flags, and we don’t want to distinguish between the operand-dependent and operand-independent shape checks
We add the following flags to IsExpression
and AsExpression
AST nodes in Kernel.
isShapeCheckSufficient
class A<X> {}
class B<Y extends num> extends A<Y> {}
test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
a as B<int>; // isShapeCheckSufficient = true
a as B<num>; // isShapeCheckSufficient = true
r as (num,); // isShapeCheckSufficient = true
r as (Object?,); // isShapeCheckSufficient = true
f as Function(int); // isShapeCheckSufficient = true
f as Function(Never); // isShapeCheckSufficient = true
fo as FutureOr<num>; // isShapeCheckSufficient = true
fo as FutureOr<dynamic>; // isShapeCheckSufficient = true
}
See simple version in cl/353900 (as part of a bugfix in dart2wasm compiler)
Thanks! I see it may not handle some of the type combinations, for example, the one in the example below won't be optimized even though it could be, but it gives a good idea.
class A<X> {}
class B<Y> extends A<Y> {}
test(A<Function(num)> a) => a is B<Function(num)>;
For backends it's somewhat irrelevant why a check can be simplified. It only matters that the check can be simplified. (So I think it's not important to distinguish whether the optimization can be made because it took the static type into consideration or not)
The kind of simplification depends on the DartType
:
<obj> is/as InterfaceType<...>
: It means we can avoid checking the type arguments<obj> is (Object?, Object?)
: It means we can check the record shape only, not the individual typesI think it makes sense to separate the two, as they are different concepts.
Guess this would then be Variant 1?
One additional thing that may(?) make sense to add a flag for:
T? value;
value is T;
value as T;
=> e.g. a IsExpression.isNullCheckSufficient
.
Guess this would then be Variant 1?
Thanks! I'll go with this one then. A related clarification: would it be better to have the set of four flags or an enum instead, where one of the values will signal that no optimizations can be made?
=> e.g. a
IsExpression.isNullCheckSufficient
.
Good point about nullability check! I'll make sure to include that.
I am skeptical that this will be much use to dart2js. I propose an alternative.
dart2js already detects the nullable-removal pattern x as T
and compiles this as x == null ? x as T : x
. The logic here is single test. It does not seem worth encoding that test in a serialized format.
Are there cases that the CFE could detect that are missed here?
Another way to express the 'shape' optimization would be to replace the type expression with one instantiated to bounds, or the structural equivalent. Often the only reason we write x is List<T>
is because promotion does not work for x is List
, but the front end could generate an IsExpression
that uses a maximal type instead of the written type (replacing List<T>
with List<Object?>
or List<dynamic>
). The main problem with using a maximal type for an as
check is that it will change the error message, which could be confusing.
In many cases these maximal types are already optimized (obj instanceof Class
).
Annotating IsExpression
and AsExpression
with flags solves only half of the problem.
is
and as
tests, not other tests, like inlined parametric covariant checksis Map<int, dynamic>
is faster than is Map<int, List<int>>
)dart2js and the VM both use some kind of compiled type testing stubs. This allows x as T
to be fast when the type is simple, including parametric covariance checks. For dart2js, the stubs are somewhat generated at run time. It is not clear how to connect the flag on AsExpression
to the lazy runtime generation of the stub.
A preferable scheme is the following
AsExpression
/IsExpression
nodes, or
- Compute a maximal equivalent type
I really like this! The maximal equivalent types cover a wider range of checks, and generating such helper types can be done with an extension of the algorithm I already had in mind for computing the bits. I think it’s a good idea to store the maximal equivalent types in IsExpression
s and AsExpression
s and preserve the level of precision of the runtime error messages.
I can see the appeal of the shape-check bits too: even if the maximal type is present, sometimes it can be helpful to know that only the shape of such type needs checking. Consider the following example.
class A<X> {}
class B<X, Y> extends A<X> {}
test(A<int> a) {
// Maximal equivalent type: B<dynamic, String>.
// isInterfaceTypeShapeCheckSufficient = false.
a as B<num, String>;
// Maximal equivalent type: B<dynamic, dynamic>.
// isInterfaceTypeShapeCheckSufficient = true.
a as B<num, dynamic>;
}
- CFE stores a maximal equivalent type on the
AsExpression
/IsExpression
nodes, or- CFE provides a utility for computing a maximal type from the static type of the value operand and the tested type, or
I see value in providing both, together with the shape-check bits or a shape-check enum value. @mkustermann, @rakudrama WDYT?
My main request is to have the functionality shared inpackage:kernel
instead of all backends doing their own (slightly different) implementation. If the functionality is there, then it's easy to compute the bits - so having them on the IsExpression
/ AsExpression
isn't so important.
The bits on the AST nodes would be a convenience - and wouldn't incur additional memory overhead as those nodes already an int flags
. Adding more information to the nodes makes them consume more memory. Not sure what impact that would have (and as already mentioned, it wouldn't cover all cases as e.g. inlining would need to re-compute this for the operand type on call site and the implicit as
check for parameter type)
In terms of precision: Only performing shape checks vs full type check covers many important cases, but it would of course be nice if it's more general.
Rewriting the existing type of the is/as check may loose information that backends may rely on in their type propagator. Maybe some backends have an advanced intersection type that can infer from Iterable<T> a; a as List<dynamic>;
that a
must be List<T>
but this isn't true for all backends - it may also make getStaticType()
on the nodes more expensive or less precise (?).
It is not clear how to connect the flag on AsExpression to the lazy runtime generation of the stub.
Can only speak for the VM: The flag actually tells what to do:
So maybe we can start out with having the algorithm in package:kernel
?
Rewriting the existing type of the is/as check may loose information that backends may rely on in their type propagator.
Yep, that would be an inconvenience. But I suggest to store the maximal equivalent type alongside the target check type, not to replace one with the other.
So maybe we can start out with having the algorithm in
package:kernel
?
Yep, we can certainly do that.
Sorry for the silent on the issue! Some time ago I implemented the boolean predicate @mkustermann asked, and it can be updated into an algorithm that produces the maximal equivalent types for the type checks too. @rakudrama is there still interest in such API?
@chloestefantsova Yes, a function to compute a maximal equivalent type would still be used.
For backends it's usually much simpler & faster to check whether an object is an instance of a class than to check whether it's an instance of an interface type with arguments.
Concretely, we have
<obj> is/as <type>
expressions where<type>
is anInterfaceType
but with non-defaults-to-bounds type arguments. Doing those checks is expensive.Now there's a subset of cases where we actually have a static guarantee that the type arguments will match and therefore don't have to be checked, reducing the check to a simple
obj instanceof Class
check.Here's a few examples where type argument checks are not needed:
The algorithm to determine whether type arguments don't have to be checked for a
<obj> is X<A0, ... An>
with<obj>
having static typeY<B0, ..., Bm>
could be something like this:X
, i.e.X<X::T, X::T2, X::T3>
X
toY
Y<Expr-1(X::T1..Tn), ..., Expr-m(X::T1..Tn)>
against the static type of operand (Y<B0, ..., Bm>
)X::T1
...X::Tn
Operand(X::Ti)
for allX::Ti
X<Operand(X::Ti), ..., Operand(X::Tn)>
is a subtype ofX<A0, ..., An>
obj
is anX
.So the concrete request I have is
package:kernel/*.dart
AsExpression
andIsExpression
with a bit indicating whether the type arguments don't need to be checked/cc @johnniwinther @chloestefantsova?