Open BruceForstall opened 3 years ago
The inner k loop is not cloned due to a heuristic to avoid creating too many "checking" blocks in Compiler::optComputeDerefConditions.
After bumping that heuristic to allow the inner loop to be cloned, we still don't remove one of the bounds checks.
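For readers less familiar with loop cloning, here is a rough model of what cloning the inner k loop amounts to. It is a sketch in C++, not the JIT's implementation: std::vector stands in for C# jagged arrays, .at() stands in for the runtime bounds check, null checks are omitted, and the names Row, Plane, Cube, copy_inner_cloned, and n are hypothetical. The cloning condition has to validate one array level before it can look at the next, which is presumably why each extra dimension costs additional "checking" blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;
using Plane = std::vector<Row>;
using Cube = std::vector<Plane>;

// Hypothetical model of a cloned inner loop: d[i][j][k] = s[i][j][k] for k in [0, n).
// Returns true if the check-free fast path ran to completion.
bool copy_inner_cloned(Cube& d, const Cube& s, std::size_t i, std::size_t j, std::size_t n) {
    // Cloning condition, evaluated once before the loop.
    // Each step is only safe to evaluate after the previous one succeeded.
    bool fast = i < d.size() && i < s.size();                  // level 1
    fast = fast && j < d[i].size() && j < s[i].size();         // level 2
    fast = fast && n <= d[i][j].size() && n <= s[i][j].size(); // level 3

    if (fast) {
        // Fast path: every access is known to be in bounds, so no per-iteration checks.
        Row& dst = d[i][j];
        const Row& src = s[i][j];
        for (std::size_t k = 0; k < n; ++k) {
            assert(k < dst.size() && k < src.size()); // documents the hoisted invariant
            dst[k] = src[k];
        }
        return true;
    }

    // Slow path: the original loop, with every access checked (throws std::out_of_range).
    for (std::size_t k = 0; k < n; ++k) {
        d.at(i).at(j).at(k) = s.at(i).at(j).at(k);
    }
    return false;
}
```

In this model the fast path is only correct if all three levels are part of the condition; leaving one out corresponds to a bounds check that has to stay inside the loop.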
Why we can't eliminate the final bounds check in the inner loop: when the importer imports the d[i][j][k] = s[i][j][k] line, it needs to preserve proper exception ordering. That means doing the d[i][j] array bounds check first, then s[i][j][k], then finally the d[i][j][k] bounds check before the assignment. To do this, it introduces a temp to store d[i][j], and reuses that temp at the assignment. This temp is marked both "Strict ordering of exceptions for Array store" and a "single def temp". When loop cloning goes to reconstruct the array index expressions, it sees the temp, which doesn't appear to be loop-invariant because it is assigned in the loop. Also, its assignment is in a different tree. Thus, loop cloning won't touch the "store" array bounds check, even though that check could be added to the cloning condition if we were somehow smarter: for example, by knowing the temp is single-def, finding that def, and looking for the array index pattern there.
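The evaluation order described above can be modeled as follows. This is an illustrative C++ sketch, not importer code: the helper checked, the global check_log, and the function store_with_temp are hypothetical names, std::vector stands in for C# jagged arrays, and std::out_of_range stands in for the runtime's index exception. The point is only that the final store goes through a temp rather than through d[i][j] directly, which hides the d[i][j][k] pattern from anything that matches on the original expression.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::vector<int>;
using Plane = std::vector<Row>;
using Cube = std::vector<Plane>;

// Records the order in which the modeled bounds checks run.
std::vector<std::string> check_log;

// Bounds-checked element access that logs which check it performed.
template <typename Vec>
auto checked(Vec& v, std::size_t idx, const char* what) -> decltype(v[idx]) {
    check_log.push_back(what);
    if (idx >= v.size()) throw std::out_of_range(what);
    return v[idx];
}

// Models d[i][j][k] = s[i][j][k] with the ordering the issue describes.
void store_with_temp(Cube& d, const Cube& s, std::size_t i, std::size_t j, std::size_t k) {
    // 1. Evaluate d[i][j] first and keep it in a temp that is defined exactly once.
    Row& tmp = checked(checked(d, i, "d[i]"), j, "d[i][j]");
    // 2. Evaluate the whole right-hand side, including its own checks.
    int value = checked(checked(checked(s, i, "s[i]"), j, "s[i][j]"), k, "s[i][j][k]");
    // 3. Only now do the last bounds check, through the temp, and store.
    checked(tmp, k, "d[i][j][k]") = value;
    // The temp is just another name for d[i][j]; its single definition is in step 1.
    assert(&tmp == &d[i][j]);
}
```

If both s[i][j] and d[i][j] are too short for k, this model reports the s[i][j][k] failure, because the store check in step 3 runs last.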
All the issues have been investigated. While more improvements could be implemented, there are none planned, so I'm closing this.
The extra bounds check for nested array access that loop cloning can't get rid of should be investigated. It also happens in other test cases with multi-dimensional (jagged) arrays and nested loops over the array indices.
The core of the src\tests\JIT\Performance\CodeQuality\Benchstones\BenchI\Array2\Array2.cs benchmark is a nested loop over i, j, and k whose body is the d[i][j][k] = s[i][j][k] assignment.
Some observations:
The i and j loops are cloned. The inner k loop is not cloned. Why?
category:cq theme:loop-opt skill-level:expert cost:medium impact:medium