MihuBot / runtime-utils


[X64] [xtqqczze] Reduce IL size for `BitConverter.GetBytes` #175

Open MihuBot opened 1 year ago

MihuBot commented 1 year ago

Build completed in 1 hour 17 minutes. https://github.com/dotnet/runtime/pull/91639

CoreLib diffs

Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 6605438
Total bytes of diff: 6605377
Total bytes of delta: -61 (-0.00 % of base)
Total relative delta: -1.50
    diff is an improvement.
    relative diff is an improvement.

Top file improvements (bytes):
         -61 : System.Private.CoreLib.dasm (-0.00 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -26 (-56.52 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
         -23 (-52.27 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
         -12 (-41.38 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)

Top method improvements (percentages):
         -26 (-56.52 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
         -23 (-52.27 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
         -12 (-41.38 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)

3 total methods with Code Size differences (3 improved, 0 regressed), 52946 unchanged.
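
The figures in this summary can be cross-checked by hand. The sketch below is a minimal Python check, not part of the diff tooling. It takes the before/after code sizes reported in the assembly listings later in this issue (46 → 20, 44 → 21 and 29 → 17 bytes) and recomputes the per-method deltas. Treating "Total relative delta" as the sum of the per-method relative deltas is an assumption, although it does reproduce the reported -1.50.

```python
# Before/after code sizes in bytes, copied from the listings in this issue.
SIZES = {
    "GetBytes(float)": (46, 20),
    "GetBytes(double)": (44, 21),
    "GetBytes(System.Half)": (29, 17),
}


def method_delta(base, diff):
    """Return (byte delta, delta as a percentage of the base size)."""
    delta = diff - base
    return delta, 100.0 * delta / base


deltas = {name: method_delta(*sizes) for name, sizes in SIZES.items()}
total_bytes = sum(delta for delta, _ in deltas.values())
# Assumption: the summary's "relative delta" sums the per-method fractions.
total_relative = sum(pct / 100.0 for _, pct in deltas.values())

for name, (delta, pct) in deltas.items():
    print(f"{delta:>4} ({pct:.2f} % of base) : {name}")
print(f"total bytes: {total_bytes}, total relative: {total_relative:.2f}")
```

The byte total also agrees with the file-level numbers: 6605377 - 6605438 gives the same delta, which is far too small relative to the base to show up in the rounded percentage.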

--------------------------------------------------------------------------------

Frameworks diffs

Diffs

```
Found 261 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 37857688
Total bytes of diff: 37857695
Total bytes of delta: 7 (0.00 % of base)
Total relative delta: -0.62
    diff is a regression.
    relative diff is an improvement.

Top file regressions (bytes):
         124 : Microsoft.VisualBasic.Core.dasm (0.02 % of base)
          28 : ILCompiler.Reflection.ReadyToRun.dasm (0.01 % of base)

Top file improvements (bytes):
         -84 : System.Data.Common.dasm (-0.01 % of base)
         -61 : System.Private.CoreLib.dasm (-0.00 % of base)

4 total files with Code Size differences (2 improved, 2 regressed), 252 unchanged.

Top method regressions (bytes):
          71 (38.59 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Rnd(float):float (FullOpts)
          54 (45.76 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Randomize() (FullOpts)
          18 (1.58 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.x86.GcInfo:GetTransitionsNoEbp(ubyte[],byref):this (FullOpts)
          10 (8.77 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.x86.CallPattern:DecodeCallPattern(uint,byref,byref,byref,byref) (FullOpts)

Top method improvements (bytes):
         -84 (-5.99 % of base) : System.Data.Common.dasm - System.Data.Common.ObjectStorage:Set(int,System.Object):this (FullOpts)
         -26 (-56.52 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
         -23 (-52.27 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
         -12 (-41.38 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)
          -1 (-0.56 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Randomize(double) (FullOpts)

Top method regressions (percentages):
          54 (45.76 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Randomize() (FullOpts)
          71 (38.59 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Rnd(float):float (FullOpts)
          10 (8.77 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.x86.CallPattern:DecodeCallPattern(uint,byref,byref,byref,byref) (FullOpts)
          18 (1.58 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.x86.GcInfo:GetTransitionsNoEbp(ubyte[],byref):this (FullOpts)

Top method improvements (percentages):
         -26 (-56.52 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
         -23 (-52.27 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
         -12 (-41.38 % of base) : System.Private.CoreLib.dasm - System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)
         -84 (-5.99 % of base) : System.Data.Common.dasm - System.Data.Common.ObjectStorage:Set(int,System.Object):this (FullOpts)
          -1 (-0.56 % of base) : Microsoft.VisualBasic.Core.dasm - Microsoft.VisualBasic.VBMath:Randomize(double) (FullOpts)

9 total methods with Code Size differences (5 improved, 4 regressed), 232293 unchanged.

--------------------------------------------------------------------------------
```

Artifacts:

MihuBot commented 1 year ago

Top method improvements

-26 (-56.52 % of base) - System.BitConverter:GetBytes(float):ubyte[]

```diff
 ; Assembly listing for method System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX - Unix
 ; FullOpts code
 ; optimized code
 ; rsp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T01] (  3,  3   )   float  ->  [rsp+0x04]  single-def
-;  V01 loc0         [V01,T00] (  3,  3   )     ref  ->  rax         class-hnd exact single-def
-;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;  V00 arg0         [V00,T00] (  3,  3   )   float  ->  mm0         single-def
+;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;
-; Lcl frame size = 8
+; Lcl frame size = 0

 G_M32905_IG01:
-       push     rax
        vzeroupper
-       vmovss   dword ptr [rsp+0x04], xmm0
-       ;; size=10 bbWeight=1 PerfScore 3.00
+       ;; size=3 bbWeight=1 PerfScore 1.00
 G_M32905_IG02:
-       mov      rdi, 0xD1FFAB1E      ; ubyte[]
-       mov      esi, 4
-       call     CORINFO_HELP_NEWARR_1_VC
-       vmovss   xmm0, dword ptr [rsp+0x04]
-       vmovss   dword ptr [rax+0x10], xmm0
-       ;; size=31 bbWeight=1 PerfScore 6.50
+       vmovd    edi, xmm0
+       mov      rax, 0xD1FFAB1E      ; code for System.BitConverter:GetBytes(int):ubyte[]
+       ;; size=14 bbWeight=1 PerfScore 2.25
 G_M32905_IG03:
-       add      rsp, 8
-       ret
-       ;; size=5 bbWeight=1 PerfScore 1.25
+       tail.jmp [rax]System.BitConverter:GetBytes(int):ubyte[]
+       ;; size=3 bbWeight=1 PerfScore 2.00

-; Total bytes of code 46, prolog size 4, PerfScore 15.35, instruction count 10, allocated bytes for code 46 (MethodHash=41997f76) for method System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
+; Total bytes of code 20, prolog size 3, PerfScore 7.25, instruction count 4, allocated bytes for code 20 (MethodHash=41997f76) for method System.BitConverter:GetBytes(float):ubyte[] (FullOpts)
```

-23 (-52.27 % of base) - System.BitConverter:GetBytes(double):ubyte[]

```diff
 ; Assembly listing for method System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX - Unix
 ; FullOpts code
 ; optimized code
 ; rsp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T01] (  3,  3   )  double  ->  [rsp+0x00]  single-def
-;  V01 loc0         [V01,T00] (  3,  3   )     ref  ->  rax         class-hnd exact single-def
-;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;  V00 arg0         [V00,T00] (  3,  3   )  double  ->  mm0         single-def
+;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;
-; Lcl frame size = 8
+; Lcl frame size = 0

 G_M20108_IG01:
-       push     rax
        vzeroupper
-       vmovsd   qword ptr [rsp], xmm0
-       ;; size=9 bbWeight=1 PerfScore 3.00
+       ;; size=3 bbWeight=1 PerfScore 1.00
 G_M20108_IG02:
-       mov      rdi, 0xD1FFAB1E      ; ubyte[]
-       mov      esi, 8
-       call     CORINFO_HELP_NEWARR_1_VC
-       vmovsd   xmm0, qword ptr [rsp]
-       vmovsd   qword ptr [rax+0x10], xmm0
-       ;; size=30 bbWeight=1 PerfScore 6.50
+       vmovd    rdi, xmm0
+       mov      rax, 0xD1FFAB1E      ; code for System.BitConverter:GetBytes(long):ubyte[]
+       ;; size=15 bbWeight=1 PerfScore 2.25
 G_M20108_IG03:
-       add      rsp, 8
-       ret
-       ;; size=5 bbWeight=1 PerfScore 1.25
+       tail.jmp [rax]System.BitConverter:GetBytes(long):ubyte[]
+       ;; size=3 bbWeight=1 PerfScore 2.00

-; Total bytes of code 44, prolog size 4, PerfScore 15.15, instruction count 10, allocated bytes for code 44 (MethodHash=02c0b173) for method System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
+; Total bytes of code 21, prolog size 3, PerfScore 7.35, instruction count 4, allocated bytes for code 21 (MethodHash=02c0b173) for method System.BitConverter:GetBytes(double):ubyte[] (FullOpts)
```

-12 (-41.38 % of base) - System.BitConverter:GetBytes(System.Half):ubyte[]

```diff
 ; Assembly listing for method System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX - Unix
 ; FullOpts code
 ; optimized code
 ; rsp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
+; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;* V00 arg0         [V00    ] (  0,  0   )  struct ( 8) zero-ref    single-def
-;  V01 loc0         [V01,T01] (  3,  3   )     ref  ->  rax         class-hnd exact single-def
-;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V03 tmp1         [V03,T00] (  2,  2   )  ushort  ->  rbx         single-def "field V00._value (fldOffset=0x0)" P-INDEP
+;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;  V02 tmp1         [V02,T00] (  2,  2   )  ushort  ->  rdi         single-def "field V00._value (fldOffset=0x0)" P-INDEP
 ;
 ; Lcl frame size = 0

 G_M29041_IG01:
-       push     rbx
-       mov      ebx, edi
-       ;; size=3 bbWeight=1 PerfScore 1.25
+       ;; size=0 bbWeight=1 PerfScore 0.00
 G_M29041_IG02:
-       mov      rdi, 0xD1FFAB1E      ; ubyte[]
-       mov      esi, 2
-       call     CORINFO_HELP_NEWARR_1_VC
-       mov      word  ptr [rax+0x10], bx
-       ;; size=24 bbWeight=1 PerfScore 2.50
+       movsx    rdi, di
+       mov      rax, 0xD1FFAB1E      ; code for System.BitConverter:GetBytes(short):ubyte[]
+       ;; size=14 bbWeight=1 PerfScore 0.50
 G_M29041_IG03:
-       pop      rbx
-       ret
-       ;; size=2 bbWeight=1 PerfScore 1.50
+       tail.jmp [rax]System.BitConverter:GetBytes(short):ubyte[]
+       ;; size=3 bbWeight=1 PerfScore 2.00

-; Total bytes of code 29, prolog size 1, PerfScore 8.15, instruction count 8, allocated bytes for code 29 (MethodHash=95fd8e8e) for method System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)
+; Total bytes of code 17, prolog size 0, PerfScore 4.20, instruction count 3, allocated bytes for code 17 (MethodHash=95fd8e8e) for method System.BitConverter:GetBytes(System.Half):ubyte[] (FullOpts)
```
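
In all three listings the new code moves the argument's raw bits into an integer register (`vmovd` for `float` and `double`, `movsx` for `System.Half`) and tail-calls the same-width integer overload instead of allocating the array itself. The sketch below illustrates in Python why that should yield the same bytes. It is not the runtime's implementation: the helper names are hypothetical, `struct` stands in for the .NET bit-cast helpers, and little-endian byte order is assumed, as on X64.

```python
import struct


# Hypothetical stand-ins for the integer overloads the new code tail-calls.
def get_bytes_int16(value: int) -> bytes:
    return struct.pack("<h", value)


def get_bytes_int32(value: int) -> bytes:
    return struct.pack("<i", value)


def get_bytes_int64(value: int) -> bytes:
    return struct.pack("<q", value)


def bits_as_int(value: float, float_fmt: str, int_fmt: str) -> int:
    """Reinterpret a float's bit pattern as a signed integer of equal width."""
    return struct.unpack(int_fmt, struct.pack(float_fmt, value))[0]


# Floating-point versions: bit-cast, then delegate to the integer version.
def get_bytes_half(value: float) -> bytes:
    return get_bytes_int16(bits_as_int(value, "<e", "<h"))


def get_bytes_single(value: float) -> bytes:
    return get_bytes_int32(bits_as_int(value, "<f", "<i"))


def get_bytes_double(value: float) -> bytes:
    return get_bytes_int64(bits_as_int(value, "<d", "<q"))


if __name__ == "__main__":
    for value in (1.0, -2.5, 0.0):
        # Delegating through the integer path matches packing directly.
        assert get_bytes_single(value) == struct.pack("<f", value)
        assert get_bytes_double(value) == struct.pack("<d", value)
        assert get_bytes_half(value) == struct.pack("<e", value)
        print(value, get_bytes_single(value).hex())
```

Because the bit pattern is unchanged by the cast, the integer path writes exactly the bytes the floating-point store would have written, which is consistent with the allocation and store disappearing from the three methods above.
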
MihuBot commented 1 year ago

@MihaZupan