dotnet / runtime

Regressions in System.Buffers.Binary.Tests.BinaryReadAndWriteTests #85988

Open performanceautofiler[bot] opened 1 year ago

performanceautofiler[bot] commented 1 year ago

Run Information

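Name | Value
-- | --
Architecture | x64
OS | Windows 10.0.18362
Queue | TigerWindows
Baseline | 442141d696ee7d4ae54420974559e99738d753e6
Compare | c8f43d52f699c1da289a5cb36d371aec15901ab8
Diff | [Diff](https://github.com/dotnet/runtime/compare/442141d696ee7d4ae54420974559e99738d753e6...c8f43d52f699c1da289a5cb36d371aec15901ab8)
Configs | CompilationMode:tiered, RunKind:micro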

Regressions in System.Buffers.Binary.Tests.BinaryReadAndWriteTests

Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
[MeasureReverseUsingNtoH - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH.html>) | 605.61 ns | 946.10 ns | 1.56 | 0.01 | True | 6022.827101520892 | 8089.7074446351235 | 1.3431744442061604 | Trace | Trace
[MeasureReverseEndianness - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness.html>) | 614.26 ns | 941.83 ns | 1.53 | 0.00 | True | 6056.527359489527 | 8075.261900385958 | 1.333315515818388 | Trace | Trace
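
The Test/Base and IR Ratio columns are quotients of the neighbouring columns. A minimal Python sketch that recomputes them from the table values follows; the dictionary layout and the `ratios` helper are illustrative names, not part of the bot's tooling.

```python
# Values copied from the regression table; the structure and names are illustrative only.
rows = {
    "MeasureReverseUsingNtoH": {
        "baseline_ns": 605.61,
        "test_ns": 946.10,
        "baseline_ir": 6022.827101520892,
        "compare_ir": 8089.7074446351235,
    },
    "MeasureReverseEndianness": {
        "baseline_ns": 614.26,
        "test_ns": 941.83,
        "baseline_ir": 6056.527359489527,
        "compare_ir": 8075.261900385958,
    },
}


def ratios(row):
    """Return (Test/Base, IR Ratio) for one table row."""
    time_ratio = row["test_ns"] / row["baseline_ns"]
    ir_ratio = row["compare_ir"] / row["baseline_ir"]
    return time_ratio, ir_ratio


for name, row in rows.items():
    time_ratio, ir_ratio = ratios(row)
    print(f"{name}: Test/Base = {time_ratio:.2f}, IR Ratio = {ir_ratio:.4f}")
```

For both benchmarks the duration ratio (about 1.5) is larger than the IR ratio (about 1.33).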

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Buffers.Binary.Tests.BinaryReadAndWriteTests*'
### Payloads

[Baseline]() [Compare]()

### Histogram

#### System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 946.1020211179717 > 643.259435520526.
IsChangePoint: Marked as a change because one of 5/3/2023 7:05:50 AM, 5/9/2023 7:24:34 AM falls between 4/30/2023 6:17:41 PM and 5/9/2023 7:24:34 AM.
IsRegressionStdDev: Marked as regression because -145.33095665139552 (T) = (0 -943.5256872067196) / Math.Sqrt((100.76093042903165 / (21)) + (1.9921542971535473 / (13))) is less than -2.03693334345674 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (21) + (13) - 2, .025) and -0.5214730941818898 = (620.1395810512588 - 943.5256872067196) / 620.1395810512588 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

### JIT Disasms

[Baseline](https://pvscmdupload.z22.web.core.windows.net/autofilereport/jitdasms/05_09_2023/System_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseUsingNtoH_baseline_e46a8842-52af-4c10-be68-9adb55b85885.log) [Compare](https://pvscmdupload.z22.web.core.windows.net/autofilereport/jitdasms/05_09_2023/System_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseUsingNtoH_compare_e46a8842-52af-4c10-be68-9adb55b85885.log) [Diff](https://perfsupport.azurewebsites.net/diff?old=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseUsingNtoH_baseline_e46a8842-52af-4c10-be68-9adb55b85885.log&new=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseUsingNtoH_compare_e46a8842-52af-4c10-be68-9adb55b85885.log)

#### System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 941.8282251056002 > 645.1990668248882.
IsChangePoint: Marked as a change because one of 5/3/2023 7:05:50 AM, 5/9/2023 7:24:34 AM falls between 4/30/2023 6:17:41 PM and 5/9/2023 7:24:34 AM.
IsRegressionStdDev: Marked as regression because -172.69616839909932 (T) = (0 -942.899495537063) / Math.Sqrt((71.28264791595957 / (21)) + (0.9109916109751146 / (13))) is less than -2.03693334345674 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (21) + (13) - 2, .025) and -0.5172382205961809 = (621.4577794952738 - 942.899495537063) / 621.4577794952738 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

### JIT Disasms

[Baseline](https://pvscmdupload.z22.web.core.windows.net/autofilereport/jitdasms/05_09_2023/System_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseEndianness_baseline_714ce87e-fe76-45ef-bc0a-369799fa8218.log) [Compare](https://pvscmdupload.z22.web.core.windows.net/autofilereport/jitdasms/05_09_2023/System_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseEndianness_compare_714ce87e-fe76-45ef-bc0a-369799fa8218.log) [Diff](https://perfsupport.azurewebsites.net/diff?old=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseEndianness_baseline_714ce87e-fe76-45ef-bc0a-369799fa8218.log&new=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_Buffers_Binary_Tests_BinaryReadAndWriteTests_MeasureReverseEndianness_compare_714ce87e-fe76-45ef-bc0a-369799fa8218.log)

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

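
The `IsRegressionStdDev` check can be reproduced by hand. The Python sketch below recomputes the T statistic and the relative change from the numbers in the log. It assumes that the two values under the square root are sample variances and that 21 and 13 are the sample counts. The log prints `0` as the first term of the numerator, but the quoted T values are only reproduced when the baseline mean from the second expression is used there, so the sketch takes that reading. The function names are illustrative, and the critical value is copied from the log rather than recomputed.

```python
import math


def t_statistic(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """Two-sample t statistic with unpooled variances, following the formula in the log."""
    return (mean_base - mean_cmp) / math.sqrt(var_base / n_base + var_cmp / n_cmp)


def relative_change(mean_base, mean_cmp):
    """(baseline - compare) / baseline; negative when the compare is slower."""
    return (mean_base - mean_cmp) / mean_base


# Critical value quoted in the log for StudentT.InvCDF(0, 1, (21) + (13) - 2, .025).
T_CRITICAL = -2.03693334345674

# (baseline mean, baseline variance, n, compare mean, compare variance, n)
samples = {
    "MeasureReverseUsingNtoH": (
        620.1395810512588, 100.76093042903165, 21,
        943.5256872067196, 1.9921542971535473, 13,
    ),
    "MeasureReverseEndianness": (
        621.4577794952738, 71.28264791595957, 21,
        942.899495537063, 0.9109916109751146, 13,
    ),
}

for name, args in samples.items():
    t = t_statistic(*args)
    rel = relative_change(args[0], args[3])
    flagged = t < T_CRITICAL and rel < -0.05
    print(f"{name}: T = {t:.4f}, relative change = {rel:.4f}, regression = {flagged}")
```

Both conditions hold by a wide margin for both benchmarks, which matches the bot's verdict.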
ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-buffers See info in area-owners.md if you want to be subscribed.

Issue Details
### Run Information

Name | Value
-- | --
Architecture | x64
OS | Windows 10.0.18362
Queue | TigerWindows
Baseline | [442141d696ee7d4ae54420974559e99738d753e6](https://github.com/dotnet/runtime/commit/442141d696ee7d4ae54420974559e99738d753e6)
Compare | [c8f43d52f699c1da289a5cb36d371aec15901ab8](https://github.com/dotnet/runtime/commit/c8f43d52f699c1da289a5cb36d371aec15901ab8)
Diff | [Diff](https://github.com/dotnet/runtime/compare/442141d696ee7d4ae54420974559e99738d753e6...c8f43d52f699c1da289a5cb36d371aec15901ab8)
Configs | CompilationMode:tiered, RunKind:micro

### Regressions in System.Buffers.Binary.Tests.BinaryReadAndWriteTests

Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
[MeasureReverseUsingNtoH - Duration of single invocation]() | 605.61 ns | 946.10 ns | 1.56 | 0.01 | True | 6022.827101520892 | 8089.7074446351235 | 1.3431744442061604 | [Trace](https://helixri107v0xdeko0k025g8.blob.core.windows.net/results-7d1f68952faf461983/Collect%20System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH/1/artifacts/BenchmarkDotNet.Artifacts/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH-20230509-040058.etl?sv=2021-08-06&se=2023-08-07T10%3A58%3A29Z&sr=c&sp=rl&sig=iFe1IPzXlvJk8NMPNxIoYsJl0zxqZICtkNwI%2BCEpxz0%3D) | [Trace](https://helixri107v0xdeko0k025g8.blob.core.windows.net/results-8b25d50e327642139d/Collect%20System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH/1/artifacts/BenchmarkDotNet.Artifacts/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH-20230509-040042.etl?sv=2021-08-06&se=2023-08-07T10%3A58%3A28Z&sr=c&sp=rl&sig=preS4K0zW7U4DybX5xLkt5QEWzlHWCBSGq%2BOz6V8K4g%3D)
[MeasureReverseEndianness - Duration of single invocation]() | 614.26 ns | 941.83 ns | 1.53 | 0.00 | True | 6056.527359489527 | 8075.261900385958 | 1.333315515818388 | [Trace](https://helixri107v0xdeko0k025g8.blob.core.windows.net/results-a371e5f9b66f466894/Collect%20System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness/1/artifacts/BenchmarkDotNet.Artifacts/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness-20230509-040104.etl?sv=2021-08-06&se=2023-08-07T10%3A58%3A31Z&sr=c&sp=rl&sig=pLCeCgHIBK0GMspxgln%2B7hA4uVg2uOUdZeVLbqo8VqM%3D) | [Trace](https://helixri107v0xdeko0k025g8.blob.core.windows.net/results-b1cc38a45f9e447cab/Collect%20System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness/1/artifacts/BenchmarkDotNet.Artifacts/System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness-20230509-040051.etl?sv=2021-08-06&se=2023-08-07T10%3A58%3A31Z&sr=c&sp=rl&sig=Iyl6baSPkJyP71TK7HoNVjOrr5z7oQNlRkxy%2FhVYtv8%3D)

[Test Report]()

### Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

### Payloads

[Baseline]() [Compare]()

```cmd
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Buffers.Binary.Tests.BinaryReadAndWriteTests*'
```
Author: performanceautofiler[bot]
Assignees: AndyAyersMS
Labels: `area-System.Buffers`, `os-windows`, `arch-x64`, `runtime-coreclr`
Milestone: -
cincuranet commented 1 year ago

Commit range is https://github.com/dotnet/runtime/compare/7afd85d1fd0b9edf0e2b58108caf74509930c6e5...edd6d6326f853be985172c3b7514acf33675ec82. Probably https://github.com/dotnet/runtime/pull/85654, @jakobbotsch? Link to the benchmark.

Similar stuff in https://github.com/dotnet/runtime/issues/85994.

jakobbotsch commented 1 year ago

That seems unlikely; #85654 had no diffs related to those benchmarks. But I don't see how the other commits would affect those benchmarks either.

AndyAyersMS commented 1 year ago

It could also be https://github.com/dotnet/runtime/pull/85559. Perhaps some empty arrays got moved and so perturbed the alignment of the benchmark array?

tannergooding commented 1 year ago

This one is worth investigating more. It remains almost 1.5x slower with no clear culprit.

jakobbotsch commented 1 year ago

I'll take a closer look.

jakobbotsch commented 1 year ago

Yes, this is from #85559, cc @EgorBo.

We end up putting a static array field used in the microbenchmark on the non-GC heap, but then we constant-propagate its address into a loop, where it is materialized again on every iteration, and that regresses the performance of the loop.

I wonder if we can generally avoid propagating from low weight blocks to high weight blocks, although it's probably not that simple. It also doesn't seem feasible to fix this in 8.0, so I will move this to 9.0.
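
As a rough illustration of the "avoid propagating from low weight blocks to high weight blocks" idea, here is a toy Python model. It is only a sketch of the heuristic as worded in this comment, not the JIT's implementation: `Block`, `should_propagate_constant`, and the `cheap_to_materialize` flag are hypothetical names, and the weights 1 and 4 are the `bbWeight` values of IG02 and IG03 in the disassembly in this thread.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    weight: float  # execution-frequency estimate, like bbWeight in the JIT listing


def should_propagate_constant(def_block: Block, use_block: Block,
                              cheap_to_materialize: bool) -> bool:
    """Hypothetical rule: do not copy an expensive constant into a hotter block.

    A constant that is cheap to recreate can be propagated anywhere. An
    expensive one (for example a 64-bit address that needs its own move)
    is only propagated when the use is not hotter than the definition.
    """
    if cheap_to_materialize:
        return True
    return use_block.weight <= def_block.weight


preheader = Block("IG02", weight=1.0)  # block that loaded the array before the change
loop_body = Block("IG03", weight=4.0)  # the hot loop

# The array address is an expensive constant used inside the hotter loop block,
# so this toy rule would keep it in a register instead of propagating it.
print(should_propagate_constant(preheader, loop_body, cheap_to_materialize=False))
```

A real heuristic would also have to account for register pressure, which is part of why the comment says it is probably not that simple.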

-Importing BB01 (PC=000) of 'System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseUsingNtoH():int[]:this'
-    [ 0]   0 (0x000) ldsfld 04000B90
-Checking if we can import 'static readonly' as a jit-time constant... 
-    [ 1]   5 (0x005) stloc.0Querying runtime about current class of field <unknown class>:<unknown field> (declared as int[])
-Runtime reports field is init-only and initialized and has class int[]
-
-lvaUpdateClass: Updating class for V01 from (00007FFAE8373060) int[] to (00007FFAE8373060) int[] [exact]
-
-
-STMT00000 ( 0x000[E-] ... ??? )
-               [000003] -A--G------                         ▌  ASG       ref   
-               [000002] D------N---                         ├──▌  LCL_VAR   ref    V01 loc0         
-               [000001] #---G------                         └──▌  IND       ref   
-               [000000] H----------                            └──▌  CNS_INT(h) long   0x1f0ccc06c08 const ptr Fseq[<unknown field>]

+Importing BB01 (PC=000) of 'System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseUsingNtoH():int[]:this'
+    [ 0]   0 (0x000) ldsfld 04000B90
+Checking if we can import 'static readonly' as a jit-time constant... ... success! The value is:
+               [000000] H----------                         ▌  CNS_INT(h) ref   
+
+    [ 1]   5 (0x005) stloc.0
+
+STMT00000 ( 0x000[E-] ... ??? )
+               [000002] -A---------                         ▌  ASG       ref   
+               [000001] D------N---                         ├──▌  LCL_VAR   ref    V01 loc0         
+               [000000] H----------                         └──▌  CNS_INT(h) ref   

System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseUsingNtoH

Hot functions:

Diffs

### ``[MicroBenchmarks]BinaryReadAndWriteTests.MeasureReverseUsingNtoH()``

```diff
 ; Final local variable assignments
 ;
 ;* V00 this [V00 ] ( 0, 0 ) ref -> zero-ref this class-hnd single-def
-; V01 loc0 [V01,T03] ( 6, 10 ) ref -> rdx class-hnd exact
+; V01 loc0 [V01,T03] ( 6, 10 ) ref -> rdx class-hnd
 ; V02 loc1 [V02,T00] ( 12, 18.08) int -> rax
 ; V03 OutArgs [V03 ] ( 1, 1 ) struct (32) [rsp+00H] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ; V04 tmp1 [V04,T01] ( 4, 16 ) int -> r9 "Strict ordering of exceptions for Array store"
@@ -175,36 +175,39 @@ G_M5635_IG08:
 ; Final local variable assignments
 ;
 ;* V00 this [V00 ] ( 0, 0 ) ref -> zero-ref this class-hnd single-def
-; V01 loc0 [V01,T03] ( 4, 10 ) ref -> rax class-hnd exact single-def
-; V02 loc1 [V02,T00] ( 5, 17 ) int -> rdx
+;* V01 loc0 [V01,T03] ( 0, 0 ) ref -> zero-ref class-hnd single-def
+; V02 loc1 [V02,T00] ( 5, 17 ) int -> rax
 ;# V03 OutArgs [V03 ] ( 1, 1 ) struct ( 0) [rsp+00H] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-; V04 tmp1 [V04,T01] ( 2, 16 ) int -> r8 "Strict ordering of exceptions for Array store"
+; V04 tmp1 [V04,T01] ( 2, 16 ) int -> rcx "Strict ordering of exceptions for Array store"
 ;* V05 tmp2 [V05 ] ( 0, 0 ) int -> zero-ref "Inlining Arg"
 ;* V06 tmp3 [V06 ] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V07 cse0 [V07,T02] ( 3, 12 ) long -> rcx "CSE - aggressive"
+; V07 cse0 [V07,T02] ( 3, 12 ) long -> rdx "CSE - aggressive"
 ;
 ; Lcl frame size = 0

 G_M5635_IG01:
        ;; size=0 bbWeight=1 PerfScore 0.00
 G_M5635_IG02:
-       mov      rax, 0xD1FFAB1E      ; const ptr
-       mov      rax, gword ptr [rax]
-       xor      edx, edx
-       align    [1 bytes for IG03]
-       ;; size=16 bbWeight=1 PerfScore 2.75
+       xor      eax, eax
+       align    [0 bytes for IG03]
+       ;; size=2 bbWeight=1 PerfScore 0.25
 G_M5635_IG03:
-       mov      ecx, edx
-       movbe    r8d, dword ptr [rax+4*rcx+10H]
-       mov      dword ptr [rax+4*rcx+10H], r8d
-       inc      edx
-       cmp      edx, 1
+       mov      edx, eax
+       mov      rcx, 0xD1FFAB1E
+       movbe    ecx, dword ptr [rcx+4*rdx]
+       mov      r8, 0xD1FFAB1E
+       mov      dword ptr [r8+4*rdx], ecx
+       inc      eax
+       cmp      eax, 1
        jl       SHORT G_M5635_IG03
-       ;; size=21 bbWeight=4 PerfScore 23.00
+       ;; size=38 bbWeight=4 PerfScore 25.00
 G_M5635_IG04:
+       mov      rax, 0xD1FFAB1E
+       ;; size=10 bbWeight=1 PerfScore 0.25
+G_M5635_IG05:
        ret
        ;; size=1 bbWeight=1 PerfScore 1.00

-; Total bytes of code 38, prolog size 0, PerfScore 30.55, instruction count 11, allocated bytes for code 38 (MethodHash=87c1e9fc) for method System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseUsingNtoH():int[]:this
+; Total bytes of code 51, prolog size 0, PerfScore 31.60, instruction count 12, allocated bytes for code 51 (MethodHash=87c1e9fc) for method System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseUsingNtoH():int[]:this
 ; ============================================================
```
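
The PerfScore totals in this listing are consistent with the sum of the per-block scores plus 0.1 per byte of code. That relationship is inferred here from the numbers alone and is an assumption, not something taken from the JIT sources. The Python sketch below checks it; `CODE_SIZE_WEIGHT` and `total_perf_score` are illustrative names.

```python
# Assumption: the 0.1-per-byte term is inferred from the listing, not from JIT sources.
CODE_SIZE_WEIGHT = 0.1


def total_perf_score(block_scores, code_bytes):
    """Sum of the per-block PerfScore values plus a code-size term."""
    return sum(block_scores) + CODE_SIZE_WEIGHT * code_bytes


# Per-block PerfScore values and total code size, copied from the diff.
baseline = total_perf_score([0.00, 2.75, 23.00, 1.00], code_bytes=38)
compare = total_perf_score([0.00, 0.25, 25.00, 0.25, 1.00], code_bytes=51)

print(f"baseline = {baseline:.2f}, compare = {compare:.2f}, "
      f"ratio = {compare / baseline:.3f}")
```

Under this reading the static estimate grows by only a few percent, while the measured Test/Base ratio is about 1.5, so PerfScore alone would not have flagged this change as a large regression.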

System.Buffers.Binary.Tests.BinaryReadAndWriteTests.MeasureReverseEndianness

Hot functions:

Diffs

### ``[MicroBenchmarks]BinaryReadAndWriteTests.MeasureReverseEndianness()``

```diff
 ; Final local variable assignments
 ;
 ;* V00 this [V00 ] ( 0, 0 ) ref -> zero-ref this class-hnd single-def
-; V01 loc0 [V01,T03] ( 6, 10 ) ref -> rdx class-hnd exact
+; V01 loc0 [V01,T03] ( 6, 10 ) ref -> rdx class-hnd
 ; V02 loc1 [V02,T00] ( 12, 18.08) int -> rax
 ; V03 OutArgs [V03 ] ( 1, 1 ) struct (32) [rsp+00H] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ; V04 tmp1 [V04,T01] ( 4, 16 ) int -> r9 "Strict ordering of exceptions for Array store"
@@ -171,34 +171,37 @@ G_M42778_IG08:
 ; Final local variable assignments
 ;
 ;* V00 this [V00 ] ( 0, 0 ) ref -> zero-ref this class-hnd single-def
-; V01 loc0 [V01,T03] ( 4, 10 ) ref -> rax class-hnd exact single-def
-; V02 loc1 [V02,T00] ( 5, 17 ) int -> rdx
+;* V01 loc0 [V01,T03] ( 0, 0 ) ref -> zero-ref class-hnd single-def
+; V02 loc1 [V02,T00] ( 5, 17 ) int -> rax
 ;# V03 OutArgs [V03 ] ( 1, 1 ) struct ( 0) [rsp+00H] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-; V04 tmp1 [V04,T01] ( 2, 16 ) int -> r8 "Strict ordering of exceptions for Array store"
-; V05 cse0 [V05,T02] ( 3, 12 ) long -> rcx "CSE - aggressive"
+; V04 tmp1 [V04,T01] ( 2, 16 ) int -> rcx "Strict ordering of exceptions for Array store"
+; V05 cse0 [V05,T02] ( 3, 12 ) long -> rdx "CSE - aggressive"
 ;
 ; Lcl frame size = 0

 G_M42778_IG01:
        ;; size=0 bbWeight=1 PerfScore 0.00
 G_M42778_IG02:
-       mov      rax, 0xD1FFAB1E      ; const ptr
-       mov      rax, gword ptr [rax]
-       xor      edx, edx
-       align    [1 bytes for IG03]
-       ;; size=16 bbWeight=1 PerfScore 2.75
+       xor      eax, eax
+       align    [0 bytes for IG03]
+       ;; size=2 bbWeight=1 PerfScore 0.25
 G_M42778_IG03:
-       mov      ecx, edx
-       movbe    r8d, dword ptr [rax+4*rcx+10H]
-       mov      dword ptr [rax+4*rcx+10H], r8d
-       inc      edx
-       cmp      edx, 1
+       mov      edx, eax
+       mov      rcx, 0xD1FFAB1E
+       movbe    ecx, dword ptr [rcx+4*rdx]
+       mov      r8, 0xD1FFAB1E
+       mov      dword ptr [r8+4*rdx], ecx
+       inc      eax
+       cmp      eax, 1
        jl       SHORT G_M42778_IG03
-       ;; size=21 bbWeight=4 PerfScore 23.00
+       ;; size=38 bbWeight=4 PerfScore 25.00
 G_M42778_IG04:
+       mov      rax, 0xD1FFAB1E
+       ;; size=10 bbWeight=1 PerfScore 0.25
+G_M42778_IG05:
        ret
        ;; size=1 bbWeight=1 PerfScore 1.00

-; Total bytes of code 38, prolog size 0, PerfScore 30.55, instruction count 11, allocated bytes for code 38 (MethodHash=bce858e5) for method System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseEndianness():int[]:this
+; Total bytes of code 51, prolog size 0, PerfScore 31.60, instruction count 12, allocated bytes for code 51 (MethodHash=bce858e5) for method System.Buffers.Binary.Tests.BinaryReadAndWriteTests:MeasureReverseEndianness():int[]:this
 ; ============================================================
```

jakobbotsch commented 1 year ago

I'll actually leave this in 8.0 for a bit, maybe someone else has an idea about something we can do.

AndyAyersMS commented 1 year ago
image

Only seems to affect Intel x64.

jakobbotsch commented 1 year ago

I think we should do something here, but I do not see a surgical fix. Ideally this is fixed by 1) avoiding the propagation of constants into loops or other higher-weight blocks in the first place, and 2) adding some rematerialization support for constants so that register allocation does not regress when we do 1. I will move this to 9.0.

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

jakobbotsch commented 3 months ago

Moving to 10.0 for the same reason as above. We sadly did not get around to revisiting some of the prerequisites for improving cases like this one.