dotnet / runtimelab

This repo is for experimentation and exploring new ideas that may or may not make it into the main dotnet/runtime repo.

[NativeAOT-LLVM] Fold shadow stack save into the PI transition #2460

Closed SingleAccretion closed 8 months ago

SingleAccretion commented 8 months ago

When PI (PInvoke) support was added, we weren't able to use LLVM-specific helpers in IR. That limitation has since been lifted, so this change fixes two TODOs: one in codegen, directly related to the workaround, and another, CQ-related, noting that we should use the already-present RhpPInvoke call to save the shadow stack.
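As a rough illustration of the fold, here is a minimal C++ model of the two call shapes. The two-argument form of RhpPInvoke is inferred from the Wasm diff in this description (the shadow stack address is pushed before the frame address); the `Model*` types, the `g_thread` global, the field names, and the `Split`/`Folded` function names are all hypothetical stand-ins and not the actual runtime definitions.

```cpp
#include <cassert>

// Hypothetical stand-ins for runtime state; the real runtime types differ.
struct ModelTransitionFrame {
    void* thread = nullptr;
};

struct ModelThread {
    void* shadowStackTop = nullptr;
    ModelTransitionFrame* currentFrame = nullptr;
};

// Single-threaded model: one global "current thread".
static ModelThread g_thread;

// "Before" shape: the transition and the shadow stack save are separate calls.
void RhpPInvoke_Split(ModelTransitionFrame* frame) {
    frame->thread = &g_thread;
    g_thread.currentFrame = frame;
}

void RhpSetShadowStackTop_Split(void* shadowStack) {
    g_thread.shadowStackTop = shadowStack;
}

// "After" shape: the shadow stack save is folded into the transition call.
void RhpPInvoke_Folded(void* shadowStack, ModelTransitionFrame* frame) {
    frame->thread = &g_thread;
    g_thread.shadowStackTop = shadowStack;
    g_thread.currentFrame = frame;
}

// Runs both shapes on the same inputs and checks that they leave the
// modeled thread in the same state.
bool FoldedMatchesSplit(void* shadowStack) {
    ModelTransitionFrame splitFrame;
    g_thread = ModelThread{};
    RhpPInvoke_Split(&splitFrame);
    RhpSetShadowStackTop_Split(shadowStack);
    void* splitTop = g_thread.shadowStackTop;
    bool splitLinked = (g_thread.currentFrame == &splitFrame);

    ModelTransitionFrame foldedFrame;
    g_thread = ModelThread{};
    RhpPInvoke_Folded(shadowStack, &foldedFrame);
    bool foldedLinked = (g_thread.currentFrame == &foldedFrame);

    return splitTop == g_thread.shadowStackTop && splitLinked && foldedLinked;
}
```

In this model each PInvoke call site makes one helper call instead of two, at the cost of a slightly larger helper. That would be consistent with the small RhpPInvoke size increase and the many small Interop improvements listed in the diffs in this description.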

The diffs are positive modulo LLVM deciding to inline the world into the GC test:

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 3291173
Total bytes of diff: 3293565
Total bytes of delta: 2392 (0.07% of base)
Average relative delta: 0.40%
    diff is a regression
    average relative diff is a regression

Top method regressions (percentages):
        3233 (351.41% of base) : 1005.dasm - HelloWasm_Program__TestGC
         381 (68.16% of base) : 1006.dasm - HelloWasm_Program__TestConstrainedClassCalls
         196 (53.26% of base) : 1084.dasm - S_P_CoreLib_System_CrashInfo__WriteHexValue_0
         103 (46.40% of base) : 1106.dasm - S_P_CoreLib_System_CrashInfo__WriteValue_0
           8 (33.33% of base) : 1000.dasm - RhpPInvoke
           6 ( 0.48% of base) : 1085.dasm - S_P_CoreLib_System_CrashInfo__WriteStringValue

Top method improvements (percentages):
        -161 (-55.90% of base) : 1054.dasm - S_P_CoreLib_System_Reflection_Runtime_BindingFlagSupport_NameFilterCaseInsensitive__Matches_0
         -56 (-30.27% of base) : 1101.dasm - HelloWasm_Program__RootFuncDup
         -53 (-14.60% of base) : 1072.dasm - S_P_CoreLib_Interop_Sys__Stat
         -53 (-14.60% of base) : 1061.dasm - S_P_CoreLib_Interop_Sys__GetTimeZoneData
         -37 (-12.17% of base) : 1086.dasm - S_P_CoreLib_Interop_Globalization__GetSortHandle
         -32 (-9.09% of base) : 1049.dasm - S_P_CoreLib_Interop_Sys__OpenDir
         -26 (-8.64% of base) : 1028.dasm - S_P_CoreLib_Interop_Sys__Unlink
         -34 (-7.46% of base) : 1055.dasm - S_P_CoreLib_Interop_Sys__FStat
          -6 (-7.14% of base) : 1087.dasm - S_P_CoreLib_Interop_Globalization__CloseSortHandle
          -6 (-7.14% of base) : 1040.dasm - S_P_CoreLib_Interop_Sys__Free
          -6 (-7.14% of base) : 1018.dasm - S_P_CoreLib_Interop_Sys__LowLevelMonitor_Release
          -6 (-7.14% of base) : 1109.dasm - S_P_CoreLib_System_Threading_Thread__LongSpinWait
         -18 (-7.00% of base) : 1045.dasm - S_P_CoreLib_System_Threading_LowLevelLock__SignalWaiter
          -6 (-6.98% of base) : 1096.dasm - S_P_CoreLib_System_Runtime_InternalCalls__RhEndNoGCRegion
         -26 (-6.91% of base) : 1073.dasm - S_P_CoreLib_Interop_Sys__FLock
         -26 (-6.88% of base) : 1043.dasm - S_P_CoreLib_Interop_Sys__Read
         -26 (-6.88% of base) : 1041.dasm - S_P_CoreLib_Interop_Sys__LSeek
         -26 (-6.88% of base) : 1071.dasm - S_P_CoreLib_Interop_Sys__FAllocate
         -26 (-6.84% of base) : 1042.dasm - S_P_CoreLib_Interop_Sys__PRead
          -6 (-6.82% of base) : 1098.dasm - S_P_CoreLib_System_Runtime_InternalCalls__RhGetGcTotalMemory

110 total methods with Code Size differences (104 improved, 6 regressed)

- 20 04                      |     local.get 4
- 41 0c                      |     i32.const 12
- 6a                         |     i32.add
- 10 c9 81 80 80 00          |     call 201 <RhpPInvoke>
  20 00                      |     local.get 0
  41 10                      |     i32.const 16
  6a                         |     i32.add
  22 05                      |     local.tee 5
- 10 ac 86 80 80 00          |     call 812 <RhpSetShadowStackTop>
+ 20 04                      |     local.get 4
+ 41 0c                      |     i32.const 12
+ 6a                         |     i32.add
+ 10 af 86 80 80 00          |     call 815 <RhpPInvoke>
  20 01                      |     local.get 1
  20 02                      |     local.get 2
  41 d0 00                   |     i32.const 80
@@ -43,7 +42,7 @@ func[875] <S_P_CoreLib_System_GC___AllocateUninitializedArray_g__AllocateNewUnin

SingleAccretion commented 8 months ago

@dotnet/nativeaot-llvm