Closed a-siva closed 3 years ago
Since this issue was filed, I think runtimeType has been intrinsified. The numbers are better now: 0.9 ms and 154 ms for ReleaseIA32 on my laptop (no access to my workstation right now). But the gap is even wider.
We have to build each generic type, finalize it, and canonicalize it. On the other hand, there is only one possible non-generic type, so it can be cached.
We only bypass finalization and canonicalization of the receiver type of a generic class at compile time. For List, it would be List
The problem here is not compile time, but run time building of a type. We build the generic type from the class of the instance and the canonical type arguments of the instance in order to pass a canonical type to the == method that will take it apart again. One way to optimize this would be to pass the class and the type arguments as a non-canonical pair (not a canonical type) to the == method.
Let me look at this in more details and at least verify the theory with profiling.
My earlier theory seems correct, except that the generic run time type does not need to be finalized, only canonicalized. Type finalization of a generic type consists mainly in building the type argument vector. We can here reuse the canonical type arguments of the instance and bypass finalization.
If I skip canonicalization, I observe a factor 3 in speedup (46 ms down from 154 ms). However, two equivalent but non-canonical types must now test equal. This requires overriding operator == in class _AbstractType and call the native AbstractType::Equals VM implementation, which is able to handle non-canonical types.
I am a bit weary of doing this, because it prohibits using types as key values of maps (keys should not implement ==), unless we treat this as a special case in the VM and suppress the error. But are there unpredictable effects of allowing non-canonical types floating around?
Let me know what you think.
Here is a git diff for illustration: diff --git a/runtime/lib/object.cc b/runtime/lib/object.cc index d8693a1..a777dbd 100644 --- a/runtime/lib/object.cc +++ b/runtime/lib/object.cc @@ -343,6 +343,14 @@ DEFINE_NATIVE_ENTRY(Object_as, 4) { }
+DEFINE_NATIVE_ENTRY(AbstractType_equals, 2) {
return Bool::Get(type.Equals(other)).raw(); +} + + DEFINE_NATIVE_ENTRY(AbstractType_toString, 1) { const AbstractType& type = AbstractType::CheckedHandle(arguments->NativeArgAt(0)); diff --git a/runtime/lib/type_patch.dart b/runtime/lib/type_patch.dart index 67d7db5..667703f 100644 --- a/runtime/lib/type_patch.dart +++ b/runtime/lib/type_patch.dart @@ -6,6 +6,8 @@
// Equivalent of RawAbstractType. abstract class _AbstractType implements Type {
diff --git a/runtime/vm/bootstrap_natives.h b/runtime/vm/bootstrap_natives.h index 38a92de..da141c0 100644 --- a/runtime/vm/bootstrap_natives.h +++ b/runtime/vm/bootstrap_natives.h @@ -31,6 +31,7 @@ namespace dart { V(FunctionImpl_equals, 2) \ V(FunctionImpl_hashCode, 1) \ V(FunctionImpl_clone, 1) \
Implementing hashCode for _AbstractType does not make any difference. The error about a constant map key implementing == is still being reported. However, it can safely be suppressed for types, as it is for strings and integers. Then, all tests pass with the exception of a VM test specifically checking that runtime types are canonical (test TypeGetParameterizedTypes expects Dart_IdentityEquals to return true).
//cc @crelier do we still have something to do in this issue or can it be closed as obsolete ?
@a-siva We do not have any pending work in this area. Generics will never be free. Yes, I would close this issue.
Here is a slightly updated benchmark:
class Base {
int a1 = 0;
}
class A1 {
int a1 = 0;
}
class A2 {
int a1 = 0;
}
class A3 {
int a1 = 0;
}
class A4 {
int a1 = 0;
}
class A5 {
int a1 = 0;
}
class A6 {
int a1 = 0;
}
class A7 {
int a1 = 0;
}
class A8 {
int a1 = 0;
}
class A9 {
int a1 = 0;
}
@pragma('vm:never-inline')
verify(v1, v2) {
if (v1.runtimeType == v2.runtimeType) {
print("Incorrectly stating that runtime types match");
}
}
final a0 = new Base();
final a1 = new A1();
final a2 = new A2();
final a3 = new A3();
final a4 = new A4();
final a5 = new A5();
final a6 = new A6();
final a7 = new A7();
final a8 = new A8();
final a9 = new A9();
verifyTypes() {
verify(a1, a0);
verify(a2, a0);
verify(a3, a0);
verify(a4, a0);
verify(a5, a0);
verify(a6, a0);
verify(a7, a0);
verify(a8, a0);
verify(a9, a0);
}
final l0 = new List<Base>.empty();
final l1 = new List<A1>.empty();
final l2 = new List<A2>.empty();
final l3 = new List<A3>.empty();
final l4 = new List<A4>.empty();
final l5 = new List<A5>.empty();
final l6 = new List<A6>.empty();
final l7 = new List<A7>.empty();
final l8 = new List<A8>.empty();
final l9 = new List<A9>.empty();
verifyGenericTypes() {
verify(l1, l0);
verify(l2, l0);
verify(l3, l0);
verify(l4, l0);
verify(l5, l0);
verify(l6, l0);
verify(l7, l0);
verify(l8, l0);
verify(l9, l0);
}
runtest() {
var watch = new Stopwatch();
watch.start();
for (int i = 0; i < 100000; i++) {
verifyTypes();
}
watch.stop();
var duration = watch.elapsedMilliseconds;
print("Time taken = $duration");
watch.reset();
watch.start();
for (int i = 0; i < 100000; i++) {
verifyGenericTypes();
}
watch.stop();
duration = watch.elapsedMilliseconds;
print("Time taken with parameterized types = $duration");
}
main() {
while (true) {
runtest();
}
}
Note that there is still quite some difference between parameterized and non-parameterized classes, though much less if we are running in AOT mode:
$ out/ReleaseX64/dart_precompiled_runtime /tmp/bench.aot
Time taken = 12
Time taken with parameterized types = 106
Time taken = 4
Time taken with parameterized types = 96
...
$ out/ReleaseX64/dart /tmp/bench.dart
Time taken = 23
Time taken with parameterized types = 919
Time taken = 10
Time taken with parameterized types = 916
AOT is faster because we have added a fast path for a.runtimeType == b.runtimeType
in AOT mode (for Flutter) which actually avoids calling runtimeType
altogether, however it still falls into runtime to handle type argument vector comparison. We can intrinsify it further (e.g. add fast path for comparison when we compare full type argument vectors - which should be just a pointer check as we expect them to be canonicalised).
Good point! It should not be too complicated to improve the intrinsics to handle type arguments. Also, now that we have proper functions types, comparing the runtime types of closures could be intrinsified to a pointer comparison. Of course, this would increase code size. Is it worth it?
I might have gotten the code pattern from Ian when they were optimizing the check for whether a widget tree needs to be rebuilt or not.
Of course, this would increase code size. Is it worth it?
I would not expect too much of an impact on code size, given that we will just improve the intrinsic itself.
Maybe we could take some Flutter apps (e.g. Gallery or some internal app) and see if we ever hit slow path for _haveSameRuntimeType
? If we do hit slow path often enough then we can see if a few more checks on the fast path can make it work without falling to the slow path.
If we don't then maybe we can close this.
I would be interested in the results of the analysis suggested above. It would indicate whether we should look into something similar for dart2js.
Do you know how often one of the arguments to _haveSameRuntimeType
is this
?
I tried to make the RuntimeType.Widget.canUpdate
benchmarks representative of what I saw at the time in Flutter's Widget.canUpdate
path, so I expect any improvements will show up on Golem.
Self-assigning this issue so it does not disappear again from the radar for another 6 years. I'll give it a try when I get cycles.
I worked on this and sent a CL for review that handles generic types in intrinsics for type equality and runtimeType comparison.
Generic types with equal class ids and equal type arguments are now considered equal by the intrinsics and a runtime call is avoided.
With the test program above in precompiled mode on my workstation, instead of getting:
Time taken = 4
Time taken with parameterized types = 331
I now get this:
Time taken = 4
Time taken with parameterized types = 5
I have reverted the landed CL due to a bug https://github.com/flutter/flutter/issues/84813
While investigating the problem I have also noticed that we seem to have another bug in the optimization:
class A<T, U> {
B<T> get b => B<T>();
}
class B<T> {}
@pragma('vm:never-inline')
bool test(Object a, Object b) {
print(a.runtimeType);
print(b.runtimeType);
return a.runtimeType == b.runtimeType;
}
void main() {
print(test(B<int>(), A<int, String>().b));
}
Prints false
even with CL reverted. I think this is the problem that https://dart-review.googlesource.com/c/sdk/+/45340/ was supposed to fix, but that CL did not actually include a code path that tested a.runtimeType == b.runtimeType
pattern, so I think we missed that it did not actually fix it for some reason.
The following standalone example exhibits the problem:
verifyTypes completes in 14 milliseconds while verifyGenericTypes takes 190-200 milliseconds.
@crelier