A simple example where RangeCheck tries and fails to remove an unnecessary range check:
```csharp
static int Test(int i, int[] a) {
    if (i < 0 || i >= a.Length)
        return 0;
    return a[i];
}
```
It seems that RangeCheck should be capable of removing the unnecessary range check in the above example because it's capable of merging assertions generated by assertion propagation:
```
[RangeCheck::GetRange] BB04N001 ( 1, 1) [000020] ------------ * LCL_VAR int V00 arg0 u:2 $80
{
Merging assertions from pred edges of BB04 for op(000001AE2C3EB200) $080
Constant Assertion: (256, 64) ($100,$40) Loop_Bnd {LT($80, $40)} is {IntCns 0} index=#02, mask=0000000000000002
Constant Assertion: (256, 64) ($100,$40) Loop_Bnd {LT($80, $40)} is {IntCns 0} index=#02, mask=0000000000000002
The range after edge merging:<0, Unknown>
Constant Assertion: (257, 64) ($101,$40) Loop_Bnd { {InitVal($40)} LT {ARR_LENGTH($c0)}} is not {IntCns 0} index=#04, mask=0000000000000008
Constant Assertion: (257, 64) ($101,$40) Loop_Bnd { {InitVal($40)} LT {ARR_LENGTH($c0)}} is not {IntCns 0} index=#04, mask=0000000000000008
The range after edge merging:<0, VN0082 + -1>
Computed Range (2C3EB200) => <0, VN0082 + -1>
}
```
It correctly figures out that the range is [0, a.Length) yet it fails to remove the range check because:
```
Does overflow 000001AE2C3EB200?
Method determined to overflow.
```
What overflow?! It's obvious that i isn't the result of an operation that can overflow, but RangeCheck tries to determine that anyway and fails. At least when i is an argument (VNF_InitVal) it should conclude that it doesn't overflow, just like it already does for constants.
I need to look into this more, but it seems that for certain types of ranges, checking for overflow (and the subsequent monotonicity check) is simply not necessary, no matter how the value is produced.
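For contrast, here is a minimal sketch of my own (not from the dump above) of a case where the overflow question is real: the index is formed by an addition, so before trusting a computed range the phase does have to know that the arithmetic cannot wrap.

```csharp
static int TestAdd(int i, int[] a) {
    if (i < 0 || i >= a.Length - 1)
        return 0;
    // The index is i + 1. Reasoning about its range requires knowing that the addition
    // cannot wrap around int.MaxValue, which is what the overflow check is there for.
    // In Test above the index is just the incoming argument, so nothing can overflow.
    return a[i + 1];
}
```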
cc @dotnet/jit-contrib @russellhadley
Disabling RangeCheck results in the following regression:
```
Total bytes of diff: 23715 (0.18% of base)
diff is a regression.
Total byte diff includes 0 bytes from reconciling methods
Base had 0 unique methods, 0 unique bytes
Diff had 0 unique methods, 0 unique bytes
Top file regressions by size (bytes):
6951 : System.Private.CoreLib.dasm (0.21% of base)
2224 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.10% of base)
2193 : Microsoft.CodeAnalysis.CSharp.dasm (0.10% of base)
2187 : Microsoft.CodeAnalysis.dasm (0.29% of base)
2074 : Microsoft.CSharp.dasm (0.56% of base)
40 total files with size differences (0 improved, 40 regressed), 39 unchanged.
Top method regessions by size (bytes):
1102 : Microsoft.CSharp.dasm - PredefinedMembers:.cctor()
938 : System.Text.RegularExpressions.dasm - RegexCharClass:.cctor()
710 : Microsoft.CodeAnalysis.dasm - AttributeDescription:.cctor()
675 : Microsoft.CodeAnalysis.CSharp.dasm - BuiltInOperators:GetSimpleBuiltInOperators(int,ref):this (2 methods)
671 : System.Private.CoreLib.dasm - Dictionary`2:OnDeserialization(ref):this (29 methods)
789 total methods with size differences (0 improved, 789 regressed), 64772 unchanged.
```
Hard to say if a 23715 byte regression is large or not compared to the amount of work RangeCheck does. It would be nice to get the percentage of eliminated range checks, but for now code size should do.
Now I need to check how many of these missed range checks can be eliminated earlier.
One set of range checks that RangeCheck eliminates comes from impInitializeArrayIntrinsic. For whatever reason it uses GT_INDEX to obtain the address of the first array element - https://github.com/dotnet/coreclr/blob/master/src/jit/importer.cpp#L3270. This seems like overkill: GT_INDEX ends up being morphed (which generates a bunch of IR) and then the range check needs to be eliminated. There's no need for a range check to begin with, since the transform is done only when the array length is known and large enough to hold the initialization data.
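For reference, this is the kind of C# that ends up on that path (a made-up example of mine, not taken from the issue): Roslyn typically lowers such an array initializer to a RuntimeHelpers.InitializeArray call over a metadata data blob, and impInitializeArrayIntrinsic is what expands that call in the JIT.

```csharp
static class Tables
{
    // The initializer below is normally compiled to a call to
    // System.Runtime.CompilerServices.RuntimeHelpers.InitializeArray, which the
    // importer recognizes and expands into a copy from the pre-initialized data.
    internal static readonly byte[] CrcSeed = { 1, 2, 3, 4, 5, 6, 7, 8 };
}
```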
If the code is changed to obtain the element address by simply adding the array data offset then the above diff changes to:
```
Total bytes of diff: 18947 (0.15% of base)
diff is a regression.
Total byte diff includes 0 bytes from reconciling methods
Base had 0 unique methods, 0 unique bytes
Diff had 0 unique methods, 0 unique bytes
Top file regressions by size (bytes):
6223 : System.Private.CoreLib.dasm (0.19% of base)
1851 : Microsoft.CodeAnalysis.CSharp.dasm (0.09% of base)
1752 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.08% of base)
1389 : Microsoft.CodeAnalysis.dasm (0.18% of base)
981 : System.Text.RegularExpressions.dasm (1.03% of base)
38 total files with size differences (0 improved, 38 regressed), 41 unchanged.
Top method regessions by size (bytes):
938 : System.Text.RegularExpressions.dasm - RegexCharClass:.cctor()
671 : System.Private.CoreLib.dasm - Dictionary`2:OnDeserialization(ref):this (29 methods)
540 : System.Reflection.Metadata.dasm - MetadataReader:InitializeProjectedTypes()
510 : System.Reflection.DispatchProxy.dasm - ProxyBuilder:.cctor()
435 : Microsoft.CodeAnalysis.CSharp.dasm - BuiltInOperators:GetSimpleBuiltInOperators(int,ref):this (2 methods)
Top method improvements by size (bytes):
-47 : Microsoft.CodeAnalysis.dasm - DesktopAssemblyIdentityComparer:.cctor()
-43 : System.Security.Principal.Windows.dasm - WindowsIdentity:.ctor(ref):this (2 methods)
703 total methods with size differences (2 improved, 701 regressed), 64858 unchanged.
```
So ~20% of the range checks eliminated by RangeCheck should not exist in the first place.
Another set of range checks comes from earlyprop: it propagates constant array lengths, and when the array index is also constant we end up with range checks like GT_ARR_BOUNDS_CHECK(2, 3). Nothing before RangeCheck removes these useless checks; assertionprop even generates pointless assertions out of them.
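Spelled out as C# (a made-up example of mine, not taken from the diffs), this is the shape of code that produces such a check:

```csharp
static int ConstantIndex()
{
    // The length (3) is a constant that earlyprop propagates into the bounds check,
    // and the index (2) is a constant too, so the check degenerates to "2 < 3" -
    // trivially true, yet it survives until RangeCheck (or, in this experiment, assertionprop).
    int[] a = new int[3];
    a[2] = 42;
    return a[2];
}
```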
Maybe these should be eliminated during earlyprop, but for the sake of experimentation I changed assertionprop to get rid of them; it's easier.
Now the diff is:
```
Total bytes of diff: 15735 (0.12% of base)
diff is a regression.
Total byte diff includes 0 bytes from reconciling methods
Base had 0 unique methods, 0 unique bytes
Diff had 0 unique methods, 0 unique bytes
Top file regressions by size (bytes):
3125 : System.Private.CoreLib.dasm (0.10% of base)
1795 : Microsoft.CodeAnalysis.CSharp.dasm (0.08% of base)
1752 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.08% of base)
1357 : Microsoft.CodeAnalysis.dasm (0.18% of base)
981 : System.Text.RegularExpressions.dasm (1.03% of base)
38 total files with size differences (0 improved, 38 regressed), 41 unchanged.
Top method regessions by size (bytes):
938 : System.Text.RegularExpressions.dasm - RegexCharClass:.cctor()
671 : System.Private.CoreLib.dasm - Dictionary`2:OnDeserialization(ref):this (29 methods)
540 : System.Reflection.Metadata.dasm - MetadataReader:InitializeProjectedTypes()
510 : System.Reflection.DispatchProxy.dasm - ProxyBuilder:.cctor()
435 : Microsoft.CodeAnalysis.CSharp.dasm - BuiltInOperators:GetSimpleBuiltInOperators(int,ref):this (2 methods)
Top method improvements by size (bytes):
-52 : System.Private.CoreLib.dasm - UnicodeEncoding:GetChars(long,int,long,int,ref):int:this
-51 : System.Private.CoreLib.dasm - UnicodeEncoding:GetCharCount(long,int,ref):int:this
-47 : Microsoft.CodeAnalysis.dasm - DesktopAssemblyIdentityComparer:.cctor()
-43 : System.Security.Principal.Windows.dasm - WindowsIdentity:.ctor(ref):this (2 methods)
-28 : System.Private.CoreLib.dasm - UTF32Encoding:GetChars(long,int,long,int,ref):int:this
665 total methods with size differences (10 improved, 655 regressed), 64896 unchanged.
```
Now the share of unnecessary range checks reaches ~33%.
> Hard to say if a 23715 bytes regression is large or not compared to the amount of work RangeCheck does

Are there perf tests for that?

> Are there perf tests for that?

We'll see. I still have ~66% of the range checks to look at to tell what exactly RangeCheck is really good at :)
Another potentially interesting failure:
```csharp
int i = ...;
while ((uint)i < (uint)a.Length)
    i = a[i];
```
The loop is transformed into a do/while loop and 2 OAK_NO_THROW assertions are generated. Assertion prop doesn't merge them at the PHI and RangeCheck ignores them completely, so the unnecessary range check is not eliminated.
While not exactly common, this pattern can be useful when you have a linked list implemented using array indices rather than object references (e.g. in Dictionary<K, V>). In such cases the end of the list is usually signaled by a -1 index, and instead of checking for -1 one can use an unsigned check like the one above.
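A minimal sketch of that pattern (my own illustration, not code from the issue): the next array holds the index of the following entry, -1 terminates the chain, and the unsigned comparison covers both bounds at once.

```csharp
static int ChainLength(int[] next, int head)
{
    int count = 0;
    // (uint)i < (uint)next.Length holds only when 0 <= i < next.Length, so the
    // bounds check on next[i] inside the loop is provably redundant.
    for (int i = head; (uint)i < (uint)next.Length; i = next[i])
        count++;
    return count;
}
```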
Another type of range check that RangeCheck eliminates looks like the following:
```csharp
obj.field = new byte[2];
obj.field[0] = 4;
obj.field[1] = 2;
```
Early prop runs very early (before VN and CSE) and has no way to propagate the array length, so it's up to RangeCheck to figure it out. Maybe assertion prop could extract an OAK_NO_THROW assertion from the array creation? Even if it can, it's hard to tell if moving work from range check to assertion prop is useful.
Fun case:
```csharp
var a = new int[x];
for (int i = 0; i < a.Length; i++)
    s += a[i];
```
Range check eliminated by loop cloning - after deriving loop cloning conditions it figures out that they're always true and eliminates the range check without actually cloning the loop. Good.
Same code but with byte[] instead of int[] - loop cloning fails and RangeCheck gets to eliminate the range check. Loop cloning fails because optExtractArrIndex looks for a LSH(index, constant) but there's none for byte arrays.
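Spelled out as a self-contained method (my restatement of the case above, not code from the issue):

```csharp
static int SumBytes(int x)
{
    var a = new byte[x];
    int s = 0;
    // With a one-byte element the address computation has no LSH(index, constant)
    // scaling node - the shape optExtractArrIndex looks for - so loop cloning does
    // not recognize the access and RangeCheck ends up removing the range check instead.
    for (int i = 0; i < a.Length; i++)
        s += a[i];
    return s;
}
```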
So, aside from special cases like byte[], loop cloning takes care of the typical case:
```csharp
for (int i = 0; i < a.Length; i++)
    s += a[i];
```
when a is a local variable. RangeCheck helps again when the array is a field:
```csharp
for (int i = 0; i < obj.field.Length; i++)
    s += obj.field[i];
```
where loop cloning can't do anything because it runs too early. Which is kind of strange: why increase the IR size before running other optimizations?
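As an aside, the usual source-level workaround (a sketch of mine, not something the issue proposes) is to copy the field into a local first, which puts the loop back into the shape loop cloning already handles:

```csharp
// Reading obj.field once into a local makes the array a local variable again,
// so the loop matches the typical pattern loop cloning recognizes.
var arr = obj.field;
for (int i = 0; i < arr.Length; i++)
    s += arr[i];
```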
Some potentially interesting RangeCheck stats (from corelib crossgen):
Name | Value |
---|---|
Range checks | 5687 |
Removed range checks | 295 |
Range - unknown | 3572 |
Range - overflow | 1036 |
Range widen - unknown | 4 |
Range widen - out of bounds | 780 |
Max search depth | 54 (out of 100) |
Max used budget | 1481 (out of 8192) |
Total time | 36ms (out of 6860ms) |
Map def time | 15ms |
Total memory | 9.2 Mbytes |
Max memory per method | 268 Kbytes |
Numbers do not include range checks that involve constants and do not require range computation.
The percentage of removed range checks is pretty low (295 out of 5687, about 5%), but that's not surprising considering that range checks are also removed in previous phases (earlyprop, assertionprop, loop cloning).
Both the search depth and the budget are well below the hard-coded limits, but the maximum search depth is surprisingly large.
It seems to me that RangeCheck tries too hard and sometimes fails too easily. And we also seem to have cases where range checks reach this phase (and they are eliminated) but they could have been eliminated earlier at a lower cost.

I'm taking a look and I'll use this issue to collect my observations.
category:cq theme:range-checks skill-level:expert cost:large