Two intertwined optimizations in this change:
1) Use GEPs for "in bounds" address calculations, represented as GT_LEA in IR (we're bending the node's general semantics a bit, but it's ok). The definition of "in bounds" is (necessarily) the same as LLVM's. The primary benefit is that address computations now get folded into WASM's addressing modes. Contributes to #2256.
2) Use the LEA support to enable a more efficient form of null-checking when we know the underlying object is either allocated or literally "null". For now, this is only true for TYP_REF objects, as both byrefs and pointers can be "almost zero": the former due to RyuJIT's own, somewhat confusing, "based on" pointers model, the latter because they have no special semantics.
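To illustrate the "almost zero" point in 2), here is a minimal sketch. It is a toy model, not the JIT's code: `Address`, `kFieldOffset`, `IsNullRef` and `FieldAddress` are hypothetical names made up for this example. It only shows why a plain compare against zero is enough for a value that is either an allocated object or exactly null, and why the same compare misses an interior pointer derived from a null base.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: WASM32 linear-memory addresses are 32-bit.
using Address = std::uint32_t;

// Illustrative field offset; any non-zero value shows the same effect.
constexpr Address kFieldOffset = 8;

// An object reference is either a real allocation or exactly 0,
// so a single equality compare classifies it correctly.
constexpr bool IsNullRef(Address objectRef) { return objectRef == 0; }

// Models how an "almost zero" byref/pointer arises: an interior
// address computed from a base that happened to be null.
constexpr Address FieldAddress(Address base) {
    return static_cast<Address>(base + kFieldOffset);
}

void DemoNullChecks() {
    const Address allocated = 0x1000;  // assumed valid object address
    const Address null = 0;

    assert(!IsNullRef(allocated));
    assert(IsNullRef(null));

    // The interior pointer derived from null is non-zero, so the
    // equality test does not flag it even though it is invalid.
    assert(!IsNullRef(FieldAddress(null)));
}
```

This is why the cheap check is restricted to TYP_REF in this change: for byrefs and pointers the "equals zero" test would let values like `FieldAddress(0)` through.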
The end result is these nice diffs:
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 3381138
Total bytes of diff: 3164944
Total bytes of delta: -216194 (-6.39% of base)
Average relative delta: -12.00%
diff is an improvement
average relative diff is an improvement
Top method regressions (percentages):
222 (472.34% of base) : 3271.dasm - S_P_CoreLib_System_Collections_Generic_Dictionary_2<System___Canon__IntPtr>__System_Collections_IEnumerable_GetEnumerator
114 (80.85% of base) : 4066.dasm - S_P_CoreLib_System_Globalization_CultureInfo__GetFormat
285 (35.19% of base) : 3454.dasm - String__Equals_1
211 (34.14% of base) : 1457.dasm - S_P_CoreLib_System_DateTimeFormat__ExpandStandardFormatToCustomPattern
217 (25.29% of base) : 2051.dasm - S_P_CoreLib_System_Globalization_HebrewCalendar__GetDatePart
13 (10.57% of base) : 2385.dasm - S_P_CoreLib_System_Reflection_Runtime_Dispensers_DefaultDispenserPolicy__GetAlgorithm
40 ( 6.38% of base) : 3510.dasm - String__Concat_12
8 ( 3.88% of base) : 1145.dasm - S_P_CoreLib_System_Reflection_CustomAttributeNamedArgument__get_ArgumentType
3 ( 2.11% of base) : 2225.dasm - S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeLoaderEnvironment__InitializeInstance
2 ( 2.11% of base) : 5323.dasm - S_P_CoreLib_System_Type__get_DefaultBinder
4 ( 1.68% of base) : 4902.dasm - S_P_CoreLib_Internal_Runtime_MethodTable__get_GenericVariance
4 ( 1.68% of base) : 2001.dasm - S_P_TypeLoader_Internal_Runtime_MethodTable__get_GenericVariance
6 ( 1.46% of base) : 5193.dasm - S_P_CoreLib_System_Runtime_TypeCast__CheckCast
2 ( 1.42% of base) : 4925.dasm - S_P_CoreLib_System_Runtime_InteropServices_SafeHandle__DangerousAddRef
2 ( 1.10% of base) : 3780.dasm - S_P_CoreLib_System_Threading_ManagedThreadId__RecycleId
2 ( 0.76% of base) : 5602.dasm - S_P_CoreLib_System_Decimal__ToByte
2 ( 0.76% of base) : 5603.dasm - S_P_CoreLib_System_Decimal__ToUInt16
Top method improvements (percentages):
-32 (-41.03% of base) : 1057.dasm - S_P_CoreLib_System_Threading_Thread___cctor
-155 (-38.94% of base) : 3788.dasm - S_P_CoreLib_System_Globalization_TextInfo__NeedsTurkishCasing
-184 (-38.33% of base) : 3791.dasm - S_P_CoreLib_System_Globalization_TextInfo__PopulateIsAsciiCasingSameAsInvariant
-32 (-38.10% of base) : 1069.dasm - S_P_CoreLib_System_Environment___cctor
-25 (-37.31% of base) : 4147.dasm - S_P_CoreLib_System_Reflection_Runtime_Assemblies_NativeFormat_NativeFormatRuntimeAssembly__CreateCaseInsensitiveTypeDictionary$F1_Fault
-25 (-37.31% of base) : 3368.dasm - S_P_CoreLib_Internal_Reflection_Extensions_NonPortable_CustomAttributeInstantiator__Instantiate$F1_Fault
-33 (-37.08% of base) : 4273.dasm - S_P_CoreLib_System_Text_StringBuilder__Append_1
-24 (-36.36% of base) : 2286.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<Char>__Trim$F1_Fault
-24 (-36.36% of base) : 2273.dasm - S_P_CoreLib_System_TimeZoneInfo__CompareTimeZoneFile$F1_Finally
-24 (-36.36% of base) : 1196.dasm - S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__get_ImplementedInterfaces$F1_Fault
-24 (-36.36% of base) : 2239.dasm - S_P_CoreLib_System_Reflection_Runtime_Assemblies_NativeFormat_NativeFormatRuntimeAssembly__UncachedGetTypeCoreCaseSensitive$F1_Finally
-24 (-36.36% of base) : 2857.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<Int32>__Trim$F1_Fault
-24 (-36.36% of base) : 5713.dasm - S_P_TypeLoader_Internal_TypeSystem_CastingHelper__IsConstrainedAsGCPointer$F1_Finally
-24 (-36.36% of base) : 2284.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<UInt8>__Trim$F2_Fault
-24 (-36.36% of base) : 2287.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<Char>__Trim$F2_Fault
-24 (-36.36% of base) : 2858.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<Int32>__Trim$F2_Fault
-24 (-36.36% of base) : 2283.dasm - S_P_CoreLib_System_Buffers_SharedArrayPool_1<UInt8>__Trim$F1_Fault
-24 (-36.36% of base) : 2713.dasm - S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeLoaderEnvironment__RegisterDynamicGenericTypesAndMethods$F2_Fault
-24 (-36.36% of base) : 2143.dasm - S_P_CoreLib_System_Reflection_Runtime_General_Helpers__GetRawConstant$F1_Finally
-24 (-36.36% of base) : 4604.dasm - S_P_CoreLib_System_IO_File__ReadAllBytes$F1_Fault
4748 total methods with Code Size differences (4731 improved, 17 regressed)
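For context on why the "in bounds" requirement matters for the address-mode folding in 1): a WASM load computes its effective address as the 32-bit base plus the offset immediate without wrapping, while an explicit `i32.add` wraps modulo 2^32. The sketch below models both (the function names are hypothetical, chosen for this example) and shows that moving a constant from the add into the offset immediate only preserves the address when the sum does not wrap.

```cpp
#include <cassert>
#include <cstdint>

// Models i32.add: the result wraps modulo 2^32.
constexpr std::uint32_t WrappingAdd(std::uint32_t base, std::uint32_t offset) {
    return static_cast<std::uint32_t>(base + offset);
}

// Models the effective address of a WASM32 load with an offset immediate:
// base and offset are added without wrapping (the sum needs 33 bits).
constexpr std::uint64_t EffectiveAddress(std::uint32_t base, std::uint32_t offset) {
    return static_cast<std::uint64_t>(base) + offset;
}

// Folding "load(base + c)" into "load offset=c (base)" keeps the same
// address only if the explicit add would not have wrapped.
constexpr bool FoldingPreservesAddress(std::uint32_t base, std::uint32_t offset) {
    return EffectiveAddress(base, offset) == WrappingAdd(base, offset);
}

void DemoFolding() {
    // In-bounds style computation: no wrap, both forms agree.
    assert(FoldingPreservesAddress(0x1000u, 16u));

    // Wrapping computation: the add yields 8, the load form yields 2^32 + 8.
    assert(WrappingAdd(0xFFFFFFF8u, 16u) == 8u);
    assert(EffectiveAddress(0xFFFFFFF8u, 16u) == 0x100000008ull);
    assert(!FoldingPreservesAddress(0xFFFFFFF8u, 16u));
}
```

An "in bounds" address calculation rules out the wrapping case, which is what makes the folded form safe to use.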