Open aam opened 1 year ago
Reduced to
mport "dart:collection";
import "dart:typed_data";
RuneIterator var76 = RuneIterator.at("va", 28);
@pragma("vm:never-inline")
Uint16List hide() => Uint16List(1);
void foo3_Extension1(List<bool> par2) {
try {
for (int loc0 in hide()) {
par2[-9223372032559808514]!;
}
} catch (e, st) {}
}
void foo4_Extension1() {
if (var76.movePrevious()) {
for (int loc0 in Uint16List(15)) {
}
}
}
main() {
foo3_Extension1(<bool>[false, false, false, false, false]);
try {
foo4_Extension1();
} catch (e, st) {}
}
Which for some reason incorrectly eliminates the bounds check for par2.
;; v33 <- LoadField:42(v16 T{_GrowableList} . GrowableObjectArray.data) T{_List}
0x7feb17819539 4c8b4f17 movq r9,[rdi+0x17]
;; ParallelMove r10 <- C
0x7feb1781953d 4d8b971f4c0000 movq r10,[pp+0x4c1f] -9223372032559808514
;; v34 <- LoadIndexed:42(v33, v144 T{_Mint}) T{bool}
0x7feb17819544 4f8b649117 movq r12,[r9+r10*4+0x17] ***SEGV
@rmacnak-google created a CL to fix this (https://dart-review.googlesource.com/c/sdk/+/327282), but I'd like to discuss the problem and possible solutions in more details.
This bug shows the weakness of representing control dependencies as data dependencies.
In the reduced repro, CSE replaces BoxInt64
with another BoxInt64
, bypassing intermediate GenericCheckBound
. As a result, this breaks dependency between LoadIndexed
and GenericCheckBound
and enables subsequent LICM of LoadIndexed
instruction above GenericCheckBound
:
v129 <- BoxInt64(v60)
CheckNull:30(v129 T{int}, ArgumentError) T{int}
v118 <- GenericCheckBound:30(v130 T{_Smi}, v60) T{int}
v131 <- BoxInt64(v118)
v119 <- LoadIndexed(v17, v131 T{int}) T{_Smi}
CSE =>
v125 <- BoxInt64(v122) T{int}
CheckNull:38(v125 T{_Mint}, ArgumentError) T{_Mint}
v32 <- GenericCheckBound:38(v132 T{_Smi}, v122) T{_Mint}
v34 <- LoadIndexed(v33, v125 T{_Mint}) T{bool}
Currently CSE checks inputs in the following way when comparing two candidate instructions for CSE:
OriginalDefinition()
of inputs must be the same.The problem is that CSE could occur for any redefinition-like instruction such as box/unbox which participates in the dependency chain between null/bound/type check and a dependent load (LoadField/LoadIndexed).
One way to fix this would be to drop OriginalDefinition()
entirely and only do CSE if inputs are the same for all instructions. This would let us keep control-as-data dependencies, but could be too conservative and regress performance. Can we measure this option on Golem to see how much it affects performance?
We can also have a list of instructions which may participate in the control-as-data dependency chains, and drop OriginalDefinition()
from CSE only for those instructions. They would probably include at least LoadField
, LoadIndexed
, all boxing and unboxing instructions, and maybe redefinitions. This would more more efficient but more fragile and I'm not sure it would cover all possible current and future cases where CSE would break control-as-data dependencies.
Another option is to introduce separate explicit control dependencies between instructions and take them into account in LICM. This way we can freely do CSE while still keeping dependencies between checks (such as GenericCheckBound
) and dependent loads (such as LoadIndexed
). In addition, we might be able to replace certain Redefinition
instructions with more explicit control dependencies. The downside is that this solution is much more involved and would make our IL more complex and probably harder to work with.
Any other ideas?
@mraleph @mkustermann WDYT?
CSE comparing inputs without OriginalDefinition:
https://dart-review.googlesource.com/c/sdk/+/327780
<golem>/Revision?repository=dart&revision=106249&patch=18546
log