dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Regressions in System.Collections.Sort #90933

Closed performanceautofiler[bot] closed 5 months ago

performanceautofiler[bot] commented 1 year ago

Run Information

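Name | Value
-- | --
Architecture | x64
OS | Windows 10.0.18362
Queue | TigerWindows
Baseline | 71a3d36e5cdaf1e2d41957fb1d2cebff8af0e063
Compare | 9d53816dd6d2a69bc0b8592014f3b7a5640685c4
Diff | https://github.com/dotnet/runtime/compare/71a3d36e5cdaf1e2d41957fb1d2cebff8af0e063...9d53816dd6d2a69bc0b8592014f3b7a5640685c4
Configs | CompilationMode:tiered, RunKind:micro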

Regressions in System.Collections.Sort<IntStruct>

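Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio
-- | -- | -- | -- | -- | -- | -- | -- | --
Array | 5.11 μs | 8.79 μs | 1.72 | 0.51 | False | | |
List | 5.11 μs | 8.58 μs | 1.68 | 0.53 | False | | |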


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.Sort<IntStruct>*'

### Payloads

[Baseline]() [Compare]()

### System.Collections.Sort<IntStruct>.Array(Size: 512)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because we could not find enough baseline builds for window checking.
IsChangePoint: Marked as a change because one of 5/1/2023 3:42:23 PM, 8/14/2023 9:42:18 AM, 8/21/2023 10:38:45 PM falls between 8/13/2023 2:50:54 AM and 8/21/2023 10:38:45 PM.
IsRegressionStdDev: Marked as regression because -22.302485870313667 (T) = (0 -8993.261476472246) / Math.Sqrt((58368.38135392599 / (4)) + (235091.61196554397 / (13))) is less than -2.131449545559758 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (4) + (13) - 2, .025) and -0.81251891080426 = (4961.747666666667 - 8993.261476472246) / 4961.747666666667 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### JIT Disasms

### System.Collections.Sort<IntStruct>.List(Size: 512)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because we could not find enough baseline builds for window checking.
IsChangePoint: Marked as a change because one of 5/1/2023 3:42:23 PM, 8/14/2023 9:42:18 AM, 8/21/2023 10:38:45 PM falls between 8/13/2023 2:50:54 AM and 8/21/2023 10:38:45 PM.
IsRegressionStdDev: Marked as regression because -23.69456434037528 (T) = (0 -8678.07050220088) / Math.Sqrt((93129.27371496327 / (4)) + (15244.993484834415 / (14))) is less than -2.1199052992212764 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (4) + (14) - 2, .025) and -0.7429194893826071 = (4979.042666666666 - 8678.07050220088) / 4979.042666666666 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### JIT Disasms

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

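
The `IsRegressionStdDev` entries above describe a two-sample t-statistic with unpooled variances, combined with a relative-change check against -0.05. The sketch below recomputes both from the numbers printed in the log. One assumption to flag: the log prints `0` as the first term of the numerator, but the reported statistic only comes out when the baseline mean (the value used in the relative-change expression) is substituted there, so that is what the code does. The critical values are copied from the log rather than recomputed, and all names are illustrative.

```python
import math


def t_statistic(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """(mean_base - mean_cmp) / sqrt(var_base / n_base + var_cmp / n_cmp)."""
    return (mean_base - mean_cmp) / math.sqrt(var_base / n_base + var_cmp / n_cmp)


def relative_change(mean_base, mean_cmp):
    """Negative when the compare mean is larger (slower) than the baseline mean."""
    return (mean_base - mean_cmp) / mean_base


def is_regression(t, critical, rel, rel_threshold=-0.05):
    """Both conditions from the log: t below the critical value and rel below -0.05."""
    return t < critical and rel < rel_threshold


# Means, variances, and sample counts copied from the log lines above.
ARRAY = dict(mean_base=4961.747666666667, var_base=58368.38135392599, n_base=4,
             mean_cmp=8993.261476472246, var_cmp=235091.61196554397, n_cmp=13)
LIST = dict(mean_base=4979.042666666666, var_base=93129.27371496327, n_base=4,
            mean_cmp=8678.07050220088, var_cmp=15244.993484834415, n_cmp=14)

# Critical values as reported by StudentT.InvCDF in the log (not recomputed here).
ARRAY_CRITICAL = -2.131449545559758
LIST_CRITICAL = -2.1199052992212764

t_array = t_statistic(**ARRAY)
t_list = t_statistic(**LIST)
rel_array = relative_change(ARRAY["mean_base"], ARRAY["mean_cmp"])
rel_list = relative_change(LIST["mean_base"], LIST["mean_cmp"])

print(t_array, rel_array, is_regression(t_array, ARRAY_CRITICAL, rel_array))
print(t_list, rel_list, is_regression(t_list, LIST_CRITICAL, rel_list))
```
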
EgorBo commented 1 year ago

Only the win-x64-Intel machine regressed. The suspects are https://github.com/dotnet/runtime/pull/90325 (cc @tannergooding) and https://github.com/dotnet/runtime/pull/90318 (cc @jakobbotsch).

ghost commented 1 year ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch. See info in area-owners.md if you want to be subscribed.

Issue Details
### Run Information

Name | Value
-- | --
Architecture | x64
OS | Windows 10.0.18362
Queue | TigerWindows
Baseline | [71a3d36e5cdaf1e2d41957fb1d2cebff8af0e063](https://github.com/dotnet/runtime/commit/71a3d36e5cdaf1e2d41957fb1d2cebff8af0e063)
Compare | [9d53816dd6d2a69bc0b8592014f3b7a5640685c4](https://github.com/dotnet/runtime/commit/9d53816dd6d2a69bc0b8592014f3b7a5640685c4)
Diff | [Diff](https://github.com/dotnet/runtime/compare/71a3d36e5cdaf1e2d41957fb1d2cebff8af0e063...9d53816dd6d2a69bc0b8592014f3b7a5640685c4)
Configs | CompilationMode:tiered, RunKind:micro

### Regressions in System.Collections.Sort<IntStruct>

Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio
-- | -- | -- | -- | -- | -- | -- | -- | --
Array - Duration of single invocation | 5.11 μs | 8.79 μs | 1.72 | 0.51 | False | | |
List - Duration of single invocation | 5.11 μs | 8.58 μs | 1.68 | 0.53 | False | | |

### Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

```cmd
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.Sort<IntStruct>*'
```
Author: performanceautofiler[bot]
Assignees: -
Labels: `os-windows`, `tenet-performance`, `tenet-performance-benchmarks`, `arch-x64`, `area-CodeGen-coreclr`, `untriaged`, `runtime-coreclr`
Milestone: -
jakobbotsch commented 1 year ago

There's a bunch of RA diffs but I don't really see why they would cause perf diffs, maybe JCC erratum? There's a few new instances of those (but some old instances also disappear) when I check with DOTNET_JitDisasmWithAlignmentBoundaries=1.
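
For context on the JCC erratum mentioned above: on affected Intel processors, jump instructions that cross or end on a 32-byte boundary can run slower once the microcode mitigation is applied. Below is a minimal sketch of that boundary check, assuming you already have each jump's code offset and encoded size (for example, read off a disassembly that shows alignment boundaries). The function name and the sample offsets are hypothetical and are not taken from this issue's disassembly.

```python
BOUNDARY = 32  # bytes; the alignment window the erratum is described in terms of


def touches_boundary(offset: int, size: int, boundary: int = BOUNDARY) -> bool:
    """Return True if an instruction at [offset, offset + size) crosses a
    boundary or ends exactly on one.

    For a macro-fused compare-and-jump pair, pass the offset of the compare
    and the combined size of both instructions (an assumption of this sketch).
    """
    last_byte = offset + size - 1
    crosses = offset // boundary != last_byte // boundary
    ends_on = (offset + size) % boundary == 0
    return crosses or ends_on


# Hypothetical examples, not measurements from this issue:
print(touches_boundary(30, 4))  # bytes 30..33 straddle offset 32
print(touches_boundary(28, 4))  # bytes 28..31 end exactly at offset 32
print(touches_boundary(0, 4))   # bytes 0..3 sit inside one window
print(touches_boundary(32, 2))  # starts on a boundary but does not end on one
```

A check like this only flags candidates; whether a flagged jump explains a measured regression still has to be confirmed on the affected hardware.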

FWIW, the diffs seem to be triggered by #90496 kicking in @EgorBo:

image

Seems like we shouldn't be creating new locals for this case if the previous logic didn't need to (but it's not the cause of the diffs, I guess the cause is the early folding away of the BB?)

System.Collections.Sort<IntStruct>.Array(Size: 512)

Hot functions:

Diffs ### ``[System.Private.CoreLib]System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct].PickPivotAndPartition(value class System.Span`1)`` ```diff ; optimized using Dynamic PGO ; rsp based frame ; fully interruptible -; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 160864 +; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 143168 ; 8 inlinees with PGO data; 12 single block inlinees; 5 inlinees without PGO data ; Final local variable assignments ; ; V00 arg0 [V00,T16] ( 4, 8 ) byref -> rcx ld-addr-op single-def -; V01 loc0 [V01,T13] ( 17, 49.29) byref -> rdx single-def -; V02 loc1 [V02,T19] ( 7, 4.84) byref -> r8 single-def -; V03 loc2 [V03,T17] ( 10, 7.25) byref -> r10 single-def -; V04 loc3 [V04,T14] ( 9, 41.57) byref -> rcx single-def +; V01 loc0 [V01,T13] ( 17, 49.67) byref -> rdx single-def +; V02 loc1 [V02,T19] ( 7, 4.83) byref -> r8 single-def +; V03 loc2 [V03,T17] ( 10, 7.24) byref -> r10 single-def +; V04 loc3 [V04,T14] ( 9, 41.09) byref -> rcx single-def ;* V05 loc4 [V05 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op -; V06 loc5 [V06,T01] ( 12,184.21) byref -> rax -; V07 loc6 [V07,T00] ( 8,192.45) byref -> r10 +; V06 loc5 [V06,T01] ( 12,182.42) byref -> rax +; V07 loc6 [V07,T00] ( 8,194.20) byref -> r10 ;# V08 OutArgs [V08 ] ( 1, 1 ) struct ( 0) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace" -; V09 tmp1 [V09,T04] ( 2,146.57) byref -> r10 "dup spill" -; V10 tmp2 [V10,T05] ( 2,134.37) byref -> rax "dup spill" +; V09 tmp1 [V09,T04] ( 2,148.13) byref -> r10 "dup spill" +; V10 tmp2 [V10,T05] ( 2,132.35) byref -> rax "dup spill" ;* V11 tmp3 [V11 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V12 tmp4 [V12,T29] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V13 tmp5 [V13 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V14 tmp6 [V14,T30] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V15 tmp7 [V15,T23] ( 2, 4.00) byref -> r9 single-def 
"Inlining Arg" -; V16 tmp8 [V16,T20] ( 3, 4.80) int -> rax "Inlining Arg" -;* V17 tmp9 [V17 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V18 tmp10 [V18,T36] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V19 tmp11 [V19 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V20 tmp12 [V20,T37] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V21 tmp13 [V21,T24] ( 2, 4.00) byref -> r9 single-def "Inlining Arg" -; V22 tmp14 [V22,T21] ( 3, 4.80) int -> rax "Inlining Arg" -;* V23 tmp15 [V23 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V24 tmp16 [V24,T31] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V25 tmp17 [V25 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V26 tmp18 [V26,T32] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V27 tmp19 [V27,T25] ( 2, 4.00) byref -> r9 single-def "Inlining Arg" -; V28 tmp20 [V28,T22] ( 3, 4.80) int -> rax "Inlining Arg" -;* V29 tmp21 [V29 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V30 tmp22 [V30 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V31 tmp23 [V31,T11] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V33 tmp25 [V33,T12] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V34 tmp26 [V34,T03] ( 3,161.23) int -> r8 "Inlining Arg" -;* V35 tmp27 [V35,T08] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V36 tmp28 [V36 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V37 tmp29 [V37,T09] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V38 tmp30 [V38,T02] ( 3,175.88) int -> r8 "Inlining Arg" -;* V39 tmp31 [V39 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V40 tmp32 [V40 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -; V41 tmp33 [V41,T38] ( 2, 2.00) byref -> rdx single-def "field V00._reference (fldOffset=0x0)" P-INDEP -; V42 tmp34 [V42,T27] ( 3, 
3.00) int -> rcx "field V00._length (fldOffset=0x8)" P-INDEP -; V43 tmp35 [V43,T06] ( 5, 99.32) int -> r9 "field V05._value (fldOffset=0x0)" P-INDEP -;* V44 tmp36 [V44 ] ( 0, 0 ) byref -> zero-ref single-def "field V11._reference (fldOffset=0x0)" P-INDEP -;* V45 tmp37 [V45 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP -; V46 tmp38 [V46,T33] ( 2, 2.00) int -> rax "field V13._value (fldOffset=0x0)" P-INDEP -; V47 tmp39 [V47,T41] ( 2, 0.92) int -> rax "field V17._value (fldOffset=0x0)" P-INDEP -; V48 tmp40 [V48,T34] ( 2, 2.00) int -> rax "field V19._value (fldOffset=0x0)" P-INDEP -; V49 tmp41 [V49,T42] ( 2, 0.92) int -> rax "field V23._value (fldOffset=0x0)" P-INDEP -; V50 tmp42 [V50,T35] ( 2, 2.00) int -> rax "field V25._value (fldOffset=0x0)" P-INDEP -; V51 tmp43 [V51,T43] ( 2, 0.92) int -> rax "field V29._value (fldOffset=0x0)" P-INDEP -; V52 tmp44 [V52,T39] ( 2, 2.00) int -> rax "field V30._value (fldOffset=0x0)" P-INDEP -; V53 tmp45 [V53,T10] ( 2, 67.18) int -> r8 "field V32._value (fldOffset=0x0)" P-INDEP -; V54 tmp46 [V54,T07] ( 2, 73.29) int -> r8 "field V36._value (fldOffset=0x0)" P-INDEP -; V55 tmp47 [V55,T15] ( 2, 29.93) int -> r8 "field V39._value (fldOffset=0x0)" P-INDEP -; V56 tmp48 [V56,T40] ( 2, 1.93) int -> r10 "field V40._value (fldOffset=0x0)" P-INDEP -;* V57 tmp49 [V57 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref" -; V58 cse0 [V58,T26] ( 3, 3.00) int -> rax "CSE - moderate" -; V59 cse1 [V59,T28] ( 3, 3.00) int -> rax "CSE - moderate" -; V60 rat0 [V60,T18] ( 3, 6 ) long -> rax "ReplaceWithLclVar is creating a new local variable" +;* V12 tmp4 [V12 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V13 tmp5 [V13,T28] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V14 tmp6 [V14 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V15 tmp7 [V15,T29] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +;* V16 tmp8 [V16,T37] ( 0, 0 ) byref -> zero-ref single-def "Inlining Arg" +; V17 tmp9 
[V17,T20] ( 3, 4.81) int -> rax "Inlining Arg" +;* V18 tmp10 [V18 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V19 tmp11 [V19 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V20 tmp12 [V20,T38] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V21 tmp13 [V21 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V22 tmp14 [V22,T39] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V23 tmp15 [V23,T23] ( 2, 4 ) byref -> r9 single-def "Inlining Arg" +; V24 tmp16 [V24,T21] ( 3, 4.81) int -> rax "Inlining Arg" +;* V25 tmp17 [V25 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V26 tmp18 [V26 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V27 tmp19 [V27,T30] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V29 tmp21 [V29,T31] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V30 tmp22 [V30,T24] ( 2, 4 ) byref -> r9 single-def "Inlining Arg" +; V31 tmp23 [V31,T22] ( 3, 4.81) int -> rax "Inlining Arg" +;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V33 tmp25 [V33 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V34 tmp26 [V34,T11] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V35 tmp27 [V35 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V36 tmp28 [V36,T12] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V37 tmp29 [V37,T03] ( 3,159.05) int -> r8 "Inlining Arg" +;* V38 tmp30 [V38,T08] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V39 tmp31 [V39 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V40 tmp32 [V40,T09] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V41 tmp33 [V41,T02] ( 3,178.01) int -> r8 "Inlining Arg" +;* V42 tmp34 [V42 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V43 tmp35 [V43 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +; V44 tmp36 
[V44,T32] ( 2, 2 ) byref -> rdx single-def "field V00._reference (fldOffset=0x0)" P-INDEP +; V45 tmp37 [V45,T25] ( 3, 3 ) int -> rcx "field V00._length (fldOffset=0x8)" P-INDEP +; V46 tmp38 [V46,T06] ( 5, 99.41) int -> r9 "field V05._value (fldOffset=0x0)" P-INDEP +;* V47 tmp39 [V47 ] ( 0, 0 ) byref -> zero-ref single-def "field V11._reference (fldOffset=0x0)" P-INDEP +;* V48 tmp40 [V48 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP +;* V49 tmp41 [V49 ] ( 0, 0 ) int -> zero-ref "field V12._value (fldOffset=0x0)" P-INDEP +; V50 tmp42 [V50,T33] ( 2, 2 ) int -> rax "field V14._value (fldOffset=0x0)" P-INDEP +; V51 tmp43 [V51,T41] ( 2, 0.92) int -> rax "field V18._value (fldOffset=0x0)" P-INDEP +;* V52 tmp44 [V52 ] ( 0, 0 ) int -> zero-ref "field V19._value (fldOffset=0x0)" P-INDEP +; V53 tmp45 [V53,T34] ( 2, 2 ) int -> rax "field V21._value (fldOffset=0x0)" P-INDEP +; V54 tmp46 [V54,T42] ( 2, 0.92) int -> rax "field V25._value (fldOffset=0x0)" P-INDEP +;* V55 tmp47 [V55 ] ( 0, 0 ) int -> zero-ref "field V26._value (fldOffset=0x0)" P-INDEP +; V56 tmp48 [V56,T35] ( 2, 2 ) int -> rax "field V28._value (fldOffset=0x0)" P-INDEP +; V57 tmp49 [V57,T43] ( 2, 0.92) int -> rax "field V32._value (fldOffset=0x0)" P-INDEP +; V58 tmp50 [V58,T36] ( 2, 2 ) int -> rax "field V33._value (fldOffset=0x0)" P-INDEP +; V59 tmp51 [V59,T10] ( 2, 66.17) int -> r8 "field V35._value (fldOffset=0x0)" P-INDEP +; V60 tmp52 [V60,T07] ( 2, 74.06) int -> r8 "field V39._value (fldOffset=0x0)" P-INDEP +; V61 tmp53 [V61,T15] ( 2, 30.05) int -> r8 "field V42._value (fldOffset=0x0)" P-INDEP +; V62 tmp54 [V62,T40] ( 2, 1.94) int -> r10 "field V43._value (fldOffset=0x0)" P-INDEP +;* V63 tmp55 [V63 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref" +; V64 cse0 [V64,T26] ( 3, 3 ) int -> rax "CSE - moderate" +; V65 cse1 [V65,T27] ( 3, 3 ) int -> rax "CSE - moderate" +; V66 rat0 [V66,T18] ( 3, 6 ) long -> rax "ReplaceWithLclVar is creating a new local variable" ; ; Lcl frame size = 
0 @@ -562,60 +568,61 @@ G_M50248_IG02: lea r10, bword ptr [rdx+4*rax] cmp byte ptr [rdx], dl mov eax, dword ptr [r10] - mov r9, rdx - cmp dword ptr [r9], eax + cmp dword ptr [rdx], eax jl SHORT G_M50248_IG04 - ;; size=37 bbWeight=1 PerfScore 15.75 + ;; size=33 bbWeight=1 PerfScore 15.50 G_M50248_IG03: cmp dword ptr [rdx], eax jg SHORT G_M50248_IG07 - ;; size=4 bbWeight=0.40 PerfScore 1.60 + ;; size=4 bbWeight=0.40 PerfScore 1.61 G_M50248_IG04: mov eax, dword ptr [r8] mov r9, rdx cmp dword ptr [r9], eax - jl SHORT G_M50248_IG11 - ;; size=11 bbWeight=1.00 PerfScore 6.26 + jl SHORT G_M50248_IG12 + ;; size=11 bbWeight=1 PerfScore 6.25 G_M50248_IG05: cmp dword ptr [rdx], eax - jle SHORT G_M50248_IG11 - ;; size=4 bbWeight=0.40 PerfScore 1.60 + jle SHORT G_M50248_IG12 + ;; size=4 bbWeight=0.40 PerfScore 1.61 G_M50248_IG06: - jmp SHORT G_M50248_IG18 - ;; size=2 bbWeight=0.40 PerfScore 0.80 + jmp SHORT G_M50248_IG19 + ;; size=2 bbWeight=0.40 PerfScore 0.81 G_M50248_IG07: mov eax, dword ptr [rdx] mov r9d, dword ptr [r10] mov dword ptr [rdx], r9d mov dword ptr [r10], eax jmp SHORT G_M50248_IG04 - ;; size=13 bbWeight=0.46 PerfScore 3.68 + ;; size=13 bbWeight=0.46 PerfScore 3.67 G_M50248_IG08: cmp r10, rdx - jbe SHORT G_M50248_IG20 + jbe SHORT G_M50248_IG21 + ;; size=5 bbWeight=37.03 PerfScore 46.29 +G_M50248_IG09: add r10, -4 mov r8d, dword ptr [r10] cmp r9d, r8d jl SHORT G_M50248_IG08 - ;; size=17 bbWeight=36.64 PerfScore 174.06 -G_M50248_IG09: - cmp r9d, r8d - jle SHORT G_M50248_IG20 - ;; size=5 bbWeight=14.65 PerfScore 18.31 + ;; size=12 bbWeight=37.03 PerfScore 129.61 G_M50248_IG10: - jmp SHORT G_M50248_IG20 - ;; size=2 bbWeight=14.64 PerfScore 29.29 + cmp r9d, r8d + jle SHORT G_M50248_IG21 + ;; size=5 bbWeight=14.94 PerfScore 18.68 G_M50248_IG11: + jmp SHORT G_M50248_IG21 + ;; size=2 bbWeight=14.93 PerfScore 29.86 +G_M50248_IG12: mov eax, dword ptr [r8] mov r9, r10 cmp dword ptr [r9], eax - jl SHORT G_M50248_IG13 - ;; size=11 bbWeight=1.00 PerfScore 6.26 -G_M50248_IG12: - 
cmp dword ptr [r10], eax - jg SHORT G_M50248_IG19 - ;; size=5 bbWeight=0.40 PerfScore 1.60 + jl SHORT G_M50248_IG14 + ;; size=11 bbWeight=1 PerfScore 6.25 G_M50248_IG13: + cmp dword ptr [r10], eax + jg SHORT G_M50248_IG20 + ;; size=5 bbWeight=0.40 PerfScore 1.61 +G_M50248_IG14: add ecx, -2 movsxd rax, ecx lea rcx, bword ptr [rdx+4*rax] @@ -627,61 +634,61 @@ G_M50248_IG13: mov rax, rdx mov r10, rcx cmp rdx, rcx - jae SHORT G_M50248_IG22 - ;; size=35 bbWeight=1.00 PerfScore 9.01 -G_M50248_IG14: + jae SHORT G_M50248_IG23 + ;; size=35 bbWeight=1 PerfScore 9.00 +G_M50248_IG15: cmp rax, rcx jae SHORT G_M50248_IG08 - ;; size=5 bbWeight=33.63 PerfScore 42.04 -G_M50248_IG15: + ;; size=5 bbWeight=33.14 PerfScore 41.43 +G_M50248_IG16: add rax, 4 mov r8d, dword ptr [rax] cmp r9d, r8d jl SHORT G_M50248_IG08 - ;; size=12 bbWeight=33.59 PerfScore 117.57 -G_M50248_IG16: + ;; size=12 bbWeight=33.09 PerfScore 115.81 +G_M50248_IG17: cmp r9d, r8d jle SHORT G_M50248_IG08 - ;; size=5 bbWeight=13.43 PerfScore 16.79 -G_M50248_IG17: - jmp SHORT G_M50248_IG14 - ;; size=2 bbWeight=13.42 PerfScore 26.85 + ;; size=5 bbWeight=13.35 PerfScore 16.69 G_M50248_IG18: + jmp SHORT G_M50248_IG15 + ;; size=2 bbWeight=13.34 PerfScore 26.68 +G_M50248_IG19: mov eax, dword ptr [rdx] mov r9d, dword ptr [r8] mov dword ptr [rdx], r9d mov dword ptr [r8], eax - jmp SHORT G_M50248_IG11 - ;; size=13 bbWeight=0.46 PerfScore 3.68 -G_M50248_IG19: + jmp SHORT G_M50248_IG12 + ;; size=13 bbWeight=0.46 PerfScore 3.67 +G_M50248_IG20: mov eax, dword ptr [r10] mov r9d, dword ptr [r8] mov dword ptr [r10], r9d mov dword ptr [r8], eax - jmp SHORT G_M50248_IG13 - ;; size=14 bbWeight=0.46 PerfScore 3.68 -G_M50248_IG20: - cmp rax, r10 - jae SHORT G_M50248_IG22 - ;; size=5 bbWeight=14.95 PerfScore 18.68 + jmp SHORT G_M50248_IG14 + ;; size=14 bbWeight=0.46 PerfScore 3.67 G_M50248_IG21: + cmp rax, r10 + jae SHORT G_M50248_IG23 + ;; size=5 bbWeight=15.02 PerfScore 18.77 +G_M50248_IG22: mov r8d, dword ptr [rax] mov r11d, dword ptr 
[r10] mov dword ptr [rax], r11d mov dword ptr [r10], r8d - jmp SHORT G_M50248_IG14 - ;; size=14 bbWeight=14.96 PerfScore 119.70 -G_M50248_IG22: - cmp rax, rcx - je SHORT G_M50248_IG24 - ;; size=5 bbWeight=1 PerfScore 1.25 + jmp SHORT G_M50248_IG15 + ;; size=14 bbWeight=15.03 PerfScore 120.21 G_M50248_IG23: + cmp rax, rcx + je SHORT G_M50248_IG25 + ;; size=5 bbWeight=1 PerfScore 1.25 +G_M50248_IG24: mov r10d, dword ptr [rax] mov r9d, dword ptr [rcx] mov dword ptr [rax], r9d mov dword ptr [rcx], r10d - ;; size=12 bbWeight=0.97 PerfScore 5.80 -G_M50248_IG24: + ;; size=12 bbWeight=0.97 PerfScore 5.83 +G_M50248_IG25: sub rax, rdx mov rdx, rax sar rdx, 63 @@ -689,10 +696,10 @@ G_M50248_IG24: add rax, rdx sar rax, 2 ;; size=21 bbWeight=1 PerfScore 2.00 -G_M50248_IG25: +G_M50248_IG26: ret ;; size=1 bbWeight=1 PerfScore 1.00 -; Total bytes of code 255, prolog size 0, PerfScore 652.75, instruction count 93, allocated bytes for code 255 (MethodHash=e6183bb7) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:PickPivotAndPartition(System.Span`1[System.Collections.IntStruct]):int (Tier1) +; Total bytes of code 251, prolog size 0, PerfScore 652.86, instruction count 92, allocated bytes for code 251 (MethodHash=e6183bb7) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:PickPivotAndPartition(System.Span`1[System.Collections.IntStruct]):int (Tier1) ; ============================================================ ``` ### ``[System.Private.CoreLib]System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct].IntroSort(value class System.Span`1,int32)`` ```diff ; optimized using Dynamic PGO ; rsp based frame ; fully interruptible -; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 160118 +; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 140708 ; 14 inlinees with PGO data; 18 single block inlinees; 5 inlinees without PGO data ; Final local variable 
assignments ; ; V00 arg0 [V00,T12] ( 4, 8 ) byref -> rcx ld-addr-op single-def -; V01 arg1 [V01,T13] ( 6, 6.84) int -> rbx -; V02 loc0 [V02,T11] ( 13, 13.41) int -> rbp -; V03 loc1 [V03,T19] ( 3, 3.88) int -> r14 -; V04 loc2 [V04,T27] ( 7, 0.46) byref -> rax single-def -; V05 loc3 [V05,T32] ( 6, 0.40) byref -> r8 single-def -; V06 loc4 [V06,T28] ( 6, 0.43) byref -> r10 single-def -; V07 loc5 [V07,T21] ( 3, 2.88) int -> rcx +; V01 arg1 [V01,T13] ( 6, 6.82) int -> rbx +; V02 loc0 [V02,T11] ( 13, 13.39) int -> rbp +; V03 loc1 [V03,T19] ( 3, 3.87) int -> r14 +; V04 loc2 [V04,T28] ( 7, 0.46) byref -> rax single-def +; V05 loc3 [V05,T32] ( 6, 0.40) byref -> rdx single-def +; V06 loc4 [V06,T27] ( 8, 0.50) byref -> r8 single-def +; V07 loc5 [V07,T21] ( 3, 2.87) int -> rcx ; V08 OutArgs [V08 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace" ;* V09 tmp1 [V09 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V10 tmp2 [V10 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V11 tmp3 [V11 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V12 tmp4 [V12 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" -; V13 tmp5 [V13,T39] ( 5, 0.20) byref -> rsi single-def "Inlining Arg" -; V14 tmp6 [V14,T49] ( 4, 0.15) byref -> rcx single-def "Inlining Arg" -;* V15 tmp7 [V15,T55] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V16 tmp8 [V16 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V17 tmp9 [V17,T56] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V18 tmp10 [V18,T52] ( 2, 0.09) byref -> rdx single-def "Inlining Arg" -; V19 tmp11 [V19,T51] ( 3, 0.10) int -> rax "Inlining Arg" -;* V20 tmp12 [V20 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V21 tmp13 [V21,T42] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V22 tmp14 [V22 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V23 tmp15 [V23,T43] ( 0, 0 ) int -> zero-ref 
"Inline return value spill temp" -; V24 tmp16 [V24,T33] ( 2, 0.35) byref -> rdx single-def "Inlining Arg" -; V25 tmp17 [V25,T29] ( 3, 0.42) int -> rcx "Inlining Arg" -;* V26 tmp18 [V26 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V27 tmp19 [V27,T44] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V29 tmp21 [V29,T45] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V30 tmp22 [V30,T34] ( 2, 0.35) byref -> rdx single-def "Inlining Arg" -; V31 tmp23 [V31,T30] ( 3, 0.42) int -> rcx "Inlining Arg" -;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V33 tmp25 [V33,T40] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V34 tmp26 [V34 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V35 tmp27 [V35,T41] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V36 tmp28 [V36,T35] ( 2, 0.35) byref -> r10 single-def "Inlining Arg" -; V37 tmp29 [V37,T31] ( 3, 0.42) int -> rdx "Inlining Arg" -;* V38 tmp30 [V38 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V39 tmp31 [V39 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V40 tmp32 [V40 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -; V41 tmp33 [V41,T08] ( 5, 44.73) int -> r8 "Inline stloc first use temp" -;* V42 tmp34 [V42 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V43 tmp35 [V43 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline stloc first use temp" -; V44 tmp36 [V44,T00] ( 8,190.09) int -> r8 "Inline stloc first use temp" -; V45 tmp37 [V45,T03] ( 2,101.37) byref -> r9 "impAppendStmt" +; V13 tmp5 [V13,T36] ( 7, 0.25) byref -> rsi single-def "Inlining Arg" +;* V14 tmp6 [V14 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +; V15 tmp7 [V15,T48] ( 4, 0.15) byref -> rcx single-def "Inlining Arg" +;* V16 tmp8 [V16,T53] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V17 tmp9 [V17 ] ( 0, 0 ) 
struct ( 8) zero-ref "Inlining Arg" +;* V18 tmp10 [V18,T54] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +;* V19 tmp11 [V19,T56] ( 0, 0 ) byref -> zero-ref single-def "Inlining Arg" +; V20 tmp12 [V20,T50] ( 3, 0.10) int -> rax "Inlining Arg" +;* V21 tmp13 [V21 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V22 tmp14 [V22 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V23 tmp15 [V23,T41] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V24 tmp16 [V24 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V25 tmp17 [V25,T42] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V26 tmp18 [V26,T33] ( 2, 0.34) byref -> r10 single-def "Inlining Arg" +; V27 tmp19 [V27,T29] ( 3, 0.41) int -> rcx "Inlining Arg" +;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V29 tmp21 [V29 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V30 tmp22 [V30,T43] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V31 tmp23 [V31 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V32 tmp24 [V32,T44] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V33 tmp25 [V33,T34] ( 2, 0.34) byref -> r10 single-def "Inlining Arg" +; V34 tmp26 [V34,T30] ( 3, 0.41) int -> rcx "Inlining Arg" +;* V35 tmp27 [V35 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V36 tmp28 [V36 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V37 tmp29 [V37,T39] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V38 tmp30 [V38 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V39 tmp31 [V39,T40] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V40 tmp32 [V40,T35] ( 2, 0.34) byref -> r8 single-def "Inlining Arg" +; V41 tmp33 [V41,T31] ( 3, 0.41) int -> r10 "Inlining Arg" +;* V42 tmp34 [V42 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V43 tmp35 [V43 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V44 tmp36 
[V44 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V45 tmp37 [V45,T08] ( 5, 44.78) int -> rdx "Inline stloc first use temp" ;* V46 tmp38 [V46 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V47 tmp39 [V47 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V48 tmp40 [V48 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V49 tmp41 [V49 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V50 tmp42 [V50,T06] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V51 tmp43 [V51 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -;* V52 tmp44 [V52 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V53 tmp45 [V53,T07] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V54 tmp46 [V54,T01] ( 3,156.06) int -> r9 "Inlining Arg" -;* V55 tmp47 [V55 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V56 tmp48 [V56 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V57 tmp49 [V57 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -;* V58 tmp50 [V58 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V59 tmp51 [V59 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -; V60 tmp52 [V60,T14] ( 3, 7.75) int -> rbp "Inlining Arg" -;* V61 tmp53 [V61 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V62 tmp54 [V62 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -; V63 tmp55 [V63,T15] ( 9, 5.07) byref -> rsi single-def "field V00._reference (fldOffset=0x0)" P-INDEP -; V64 tmp56 [V64,T16] ( 8, 5.87) int -> rdi "field V00._length (fldOffset=0x8)" P-INDEP -;* V65 tmp57 [V65 ] ( 0, 0 ) byref -> zero-ref "field V09._reference (fldOffset=0x0)" P-INDEP -;* V66 tmp58 [V66 ] ( 0, 0 ) int -> zero-ref "field V09._length (fldOffset=0x8)" P-INDEP -;* V67 tmp59 [V67 ] ( 0, 0 ) byref -> zero-ref "field V10._reference (fldOffset=0x0)" P-INDEP -;* V68 tmp60 [V68 ] ( 0, 0 ) int -> zero-ref "field V10._length (fldOffset=0x8)" P-INDEP -;* V69 tmp61 [V69 ] ( 0, 0 ) byref -> zero-ref "field 
V11._reference (fldOffset=0x0)" P-INDEP -;* V70 tmp62 [V70 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP -;* V71 tmp63 [V71 ] ( 0, 0 ) byref -> zero-ref "field V12._reference (fldOffset=0x0)" P-INDEP -;* V72 tmp64 [V72 ] ( 0, 0 ) int -> zero-ref "field V12._length (fldOffset=0x8)" P-INDEP -; V73 tmp65 [V73,T57] ( 2, 0.04) int -> rax "field V16._value (fldOffset=0x0)" P-INDEP -; V74 tmp66 [V74,T58] ( 2, 0.03) int -> rax "field V20._value (fldOffset=0x0)" P-INDEP -; V75 tmp67 [V75,T46] ( 2, 0.17) int -> rcx "field V22._value (fldOffset=0x0)" P-INDEP -; V76 tmp68 [V76,T53] ( 2, 0.08) int -> rdx "field V26._value (fldOffset=0x0)" P-INDEP -; V77 tmp69 [V77,T47] ( 2, 0.17) int -> rcx "field V28._value (fldOffset=0x0)" P-INDEP -; V78 tmp70 [V78,T54] ( 2, 0.08) int -> rdx "field V32._value (fldOffset=0x0)" P-INDEP -; V79 tmp71 [V79,T48] ( 2, 0.17) int -> rdx "field V34._value (fldOffset=0x0)" P-INDEP -; V80 tmp72 [V80,T50] ( 2, 0.12) int -> rcx "field V38._value (fldOffset=0x0)" P-INDEP -; V81 tmp73 [V81,T23] ( 2, 1.74) byref -> rax single-def "field V39._reference (fldOffset=0x0)" P-INDEP -; V82 tmp74 [V82,T24] ( 2, 1.74) int -> rcx "field V39._length (fldOffset=0x8)" P-INDEP -; V83 tmp75 [V83,T02] ( 6,106.00) byref -> rax single-def "field V42._reference (fldOffset=0x0)" P-INDEP -; V84 tmp76 [V84,T25] ( 2, 1.74) int -> rcx "field V42._length (fldOffset=0x8)" P-INDEP -; V85 tmp77 [V85,T04] ( 4, 67.44) int -> r10 "field V43._value (fldOffset=0x0)" P-INDEP -;* V86 tmp78 [V86 ] ( 0, 0 ) byref -> zero-ref "field V46._reference (fldOffset=0x0)" P-INDEP -;* V87 tmp79 [V87 ] ( 0, 0 ) int -> zero-ref "field V46._length (fldOffset=0x8)" P-INDEP -;* V88 tmp80 [V88 ] ( 0, 0 ) byref -> zero-ref "field V47._reference (fldOffset=0x0)" P-INDEP -;* V89 tmp81 [V89 ] ( 0, 0 ) int -> zero-ref "field V47._length (fldOffset=0x8)" P-INDEP -;* V90 tmp82 [V90 ] ( 0, 0 ) byref -> zero-ref "field V48._reference (fldOffset=0x0)" P-INDEP -;* V91 tmp83 [V91 ] ( 0, 0 ) int -> 
zero-ref "field V48._length (fldOffset=0x8)" P-INDEP -;* V92 tmp84 [V92 ] ( 0, 0 ) byref -> zero-ref "field V49._reference (fldOffset=0x0)" P-INDEP -;* V93 tmp85 [V93 ] ( 0, 0 ) int -> zero-ref "field V49._length (fldOffset=0x8)" P-INDEP -; V94 tmp86 [V94,T05] ( 2, 65.03) int -> r9 "field V52._value (fldOffset=0x0)" P-INDEP -;* V95 tmp87 [V95 ] ( 0, 0 ) byref -> zero-ref "field V55._reference (fldOffset=0x0)" P-INDEP -;* V96 tmp88 [V96 ] ( 0, 0 ) int -> zero-ref "field V55._length (fldOffset=0x8)" P-INDEP -; V97 tmp89 [V97,T59] ( 2, 0 ) byref -> rcx single-def "field V56._reference (fldOffset=0x0)" P-INDEP -; V98 tmp90 [V98,T60] ( 2, 0 ) int -> rax "field V56._length (fldOffset=0x8)" P-INDEP -;* V99 tmp91 [V99,T26] ( 0, 0 ) byref -> zero-ref "field V58._reference (fldOffset=0x0)" P-INDEP -; V100 tmp92 [V100,T22] ( 2, 1.92) int -> rcx "field V58._length (fldOffset=0x8)" P-INDEP -; V101 tmp93 [V101,T17] ( 2, 3.92) byref -> rcx "field V61._reference (fldOffset=0x0)" P-INDEP -; V102 tmp94 [V102,T18] ( 2, 3.92) int -> rbp "field V61._length (fldOffset=0x8)" P-INDEP -;* V103 tmp95 [V103 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref" -; V104 tmp96 [V104 ] ( 9, 17.51) struct (16) [rsp+0x20] do-not-enreg[XSF] must-init addr-exposed "by-value struct argument" -; V105 cse0 [V105,T36] ( 8, 0.31) int -> rdx "CSE - conservative" -; V106 cse1 [V106,T37] ( 4, 0.27) int -> rcx "CSE - conservative" -; V107 cse2 [V107,T38] ( 4, 0.25) int -> rdx "CSE - conservative" -; V108 cse3 [V108,T10] ( 3, 14.73) int -> rcx "CSE - moderate" -; V109 cse4 [V109,T09] ( 3, 30.87) int -> rdx "CSE - aggressive" -; V110 cse5 [V110,T20] ( 3, 3.88) long -> rcx "CSE - moderate" +;* V47 tmp39 [V47 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline stloc first use temp" +; V48 tmp40 [V48,T00] ( 8,187.48) int -> rdx "Inline stloc first use temp" +; V49 tmp41 [V49,T03] ( 2, 98.91) byref -> r9 "impAppendStmt" +;* V50 tmp42 [V50 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V51 tmp43 
[V51 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V52 tmp44 [V52 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V53 tmp45 [V53 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V54 tmp46 [V54,T06] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V55 tmp47 [V55 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +;* V56 tmp48 [V56 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V57 tmp49 [V57,T07] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V58 tmp50 [V58,T01] ( 3,155.78) int -> r9 "Inlining Arg" +;* V59 tmp51 [V59 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V60 tmp52 [V60 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V61 tmp53 [V61 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +;* V62 tmp54 [V62 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V63 tmp55 [V63 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V64 tmp56 [V64,T14] ( 3, 7.74) int -> rbp "Inlining Arg" +;* V65 tmp57 [V65 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V66 tmp58 [V66 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V67 tmp59 [V67,T15] ( 9, 5.06) byref -> rsi single-def "field V00._reference (fldOffset=0x0)" P-INDEP +; V68 tmp60 [V68,T16] ( 8, 5.87) int -> rdi "field V00._length (fldOffset=0x8)" P-INDEP +;* V69 tmp61 [V69 ] ( 0, 0 ) byref -> zero-ref "field V09._reference (fldOffset=0x0)" P-INDEP +;* V70 tmp62 [V70 ] ( 0, 0 ) int -> zero-ref "field V09._length (fldOffset=0x8)" P-INDEP +;* V71 tmp63 [V71 ] ( 0, 0 ) byref -> zero-ref "field V10._reference (fldOffset=0x0)" P-INDEP +;* V72 tmp64 [V72 ] ( 0, 0 ) int -> zero-ref "field V10._length (fldOffset=0x8)" P-INDEP +;* V73 tmp65 [V73 ] ( 0, 0 ) byref -> zero-ref "field V11._reference (fldOffset=0x0)" P-INDEP +;* V74 tmp66 [V74 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP +;* V75 tmp67 [V75 ] ( 0, 0 ) byref -> zero-ref "field V12._reference (fldOffset=0x0)" 
P-INDEP +;* V76 tmp68 [V76 ] ( 0, 0 ) int -> zero-ref "field V12._length (fldOffset=0x8)" P-INDEP +;* V77 tmp69 [V77 ] ( 0, 0 ) int -> zero-ref "field V14._value (fldOffset=0x0)" P-INDEP +; V78 tmp70 [V78,T55] ( 2, 0.04) int -> rax "field V17._value (fldOffset=0x0)" P-INDEP +; V79 tmp71 [V79,T57] ( 2, 0.03) int -> rax "field V21._value (fldOffset=0x0)" P-INDEP +;* V80 tmp72 [V80 ] ( 0, 0 ) int -> zero-ref "field V22._value (fldOffset=0x0)" P-INDEP +; V81 tmp73 [V81,T45] ( 2, 0.17) int -> rcx "field V24._value (fldOffset=0x0)" P-INDEP +; V82 tmp74 [V82,T51] ( 2, 0.08) int -> rcx "field V28._value (fldOffset=0x0)" P-INDEP +;* V83 tmp75 [V83 ] ( 0, 0 ) int -> zero-ref "field V29._value (fldOffset=0x0)" P-INDEP +; V84 tmp76 [V84,T46] ( 2, 0.17) int -> rcx "field V31._value (fldOffset=0x0)" P-INDEP +; V85 tmp77 [V85,T52] ( 2, 0.08) int -> r10 "field V35._value (fldOffset=0x0)" P-INDEP +;* V86 tmp78 [V86 ] ( 0, 0 ) int -> zero-ref "field V36._value (fldOffset=0x0)" P-INDEP +; V87 tmp79 [V87,T47] ( 2, 0.17) int -> r10 "field V38._value (fldOffset=0x0)" P-INDEP +; V88 tmp80 [V88,T49] ( 2, 0.12) int -> rcx "field V42._value (fldOffset=0x0)" P-INDEP +; V89 tmp81 [V89,T23] ( 2, 1.74) byref -> rax single-def "field V43._reference (fldOffset=0x0)" P-INDEP +; V90 tmp82 [V90,T24] ( 2, 1.74) int -> rcx "field V43._length (fldOffset=0x8)" P-INDEP +; V91 tmp83 [V91,T02] ( 6,104.69) byref -> rax single-def "field V46._reference (fldOffset=0x0)" P-INDEP +; V92 tmp84 [V92,T25] ( 2, 1.74) int -> rcx "field V46._length (fldOffset=0x8)" P-INDEP +; V93 tmp85 [V93,T04] ( 4, 67.43) int -> r10 "field V47._value (fldOffset=0x0)" P-INDEP +;* V94 tmp86 [V94 ] ( 0, 0 ) byref -> zero-ref "field V50._reference (fldOffset=0x0)" P-INDEP +;* V95 tmp87 [V95 ] ( 0, 0 ) int -> zero-ref "field V50._length (fldOffset=0x8)" P-INDEP +;* V96 tmp88 [V96 ] ( 0, 0 ) byref -> zero-ref "field V51._reference (fldOffset=0x0)" P-INDEP +;* V97 tmp89 [V97 ] ( 0, 0 ) int -> zero-ref "field V51._length (fldOffset=0x8)" 
P-INDEP +;* V98 tmp90 [V98 ] ( 0, 0 ) byref -> zero-ref "field V52._reference (fldOffset=0x0)" P-INDEP +;* V99 tmp91 [V99 ] ( 0, 0 ) int -> zero-ref "field V52._length (fldOffset=0x8)" P-INDEP +;* V100 tmp92 [V100 ] ( 0, 0 ) byref -> zero-ref "field V53._reference (fldOffset=0x0)" P-INDEP +;* V101 tmp93 [V101 ] ( 0, 0 ) int -> zero-ref "field V53._length (fldOffset=0x8)" P-INDEP +; V102 tmp94 [V102,T05] ( 2, 64.82) int -> r9 "field V56._value (fldOffset=0x0)" P-INDEP +;* V103 tmp95 [V103 ] ( 0, 0 ) byref -> zero-ref "field V59._reference (fldOffset=0x0)" P-INDEP +;* V104 tmp96 [V104 ] ( 0, 0 ) int -> zero-ref "field V59._length (fldOffset=0x8)" P-INDEP +; V105 tmp97 [V105,T58] ( 2, 0 ) byref -> rcx single-def "field V60._reference (fldOffset=0x0)" P-INDEP +; V106 tmp98 [V106,T59] ( 2, 0 ) int -> rax "field V60._length (fldOffset=0x8)" P-INDEP +;* V107 tmp99 [V107,T26] ( 0, 0 ) byref -> zero-ref "field V62._reference (fldOffset=0x0)" P-INDEP +; V108 tmp100 [V108,T22] ( 2, 1.91) int -> rcx "field V62._length (fldOffset=0x8)" P-INDEP +; V109 tmp101 [V109,T17] ( 2, 3.91) byref -> rcx "field V65._reference (fldOffset=0x0)" P-INDEP +; V110 tmp102 [V110,T18] ( 2, 3.91) int -> rbp "field V65._length (fldOffset=0x8)" P-INDEP +;* V111 tmp103 [V111 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref" +; V112 tmp104 [V112 ] ( 9, 17.47) struct (16) [rsp+0x20] do-not-enreg[XSF] must-init addr-exposed "by-value struct argument" +; V113 cse0 [V113,T37] ( 4, 0.27) int -> rcx "CSE - conservative" +; V114 cse1 [V114,T38] ( 4, 0.25) int -> r10 "CSE - conservative" +; V115 cse2 [V115,T10] ( 3, 14.80) int -> rcx "CSE - moderate" +; V116 cse3 [V116,T09] ( 3, 30.85) int -> r8 "CSE - aggressive" +; V117 cse4 [V117,T20] ( 3, 3.87) long -> rcx "CSE - moderate" ; ; Lcl frame size = 48 @@ -810,7 +817,7 @@ G_M61030_IG02: G_M61030_IG03: cmp ebp, 16 jle SHORT G_M61030_IG07 - ;; size=5 bbWeight=1.94 PerfScore 2.42 + ;; size=5 bbWeight=1.93 PerfScore 2.42 G_M61030_IG04: test ebx, ebx je 
G_M61030_IG31
@@ -830,7 +837,7 @@ G_M61030_IG04:
        mov eax, edi
        cmp rdx, rax
        ja G_M61030_IG32
-       ;; size=65 bbWeight=0.96 PerfScore 11.03
+       ;; size=65 bbWeight=0.96 PerfScore 10.99
 G_M61030_IG05:
        lea rcx, bword ptr [rsi+4*rcx]
        mov bword ptr [rsp+0x20], rcx
@@ -841,7 +848,7 @@ G_M61030_IG05:
        mov ebp, r14d
        cmp ebp, 1
        jg SHORT G_M61030_IG03
-       ;; size=34 bbWeight=1.96 PerfScore 15.18
+       ;; size=34 bbWeight=1.96 PerfScore 15.16
 G_M61030_IG06:
        jmp G_M61030_IG20
        align [0 bytes for IG11]
@@ -859,47 +866,47 @@ G_M61030_IG09:
        ja G_M61030_IG32
        mov rax, rsi
        mov ecx, ebp
-       xor r8d, r8d
+       xor edx, edx
        dec ecx
        test ecx, ecx
        jle SHORT G_M61030_IG20
-       ;; size=22 bbWeight=0.87 PerfScore 3.05
+       ;; size=21 bbWeight=0.87 PerfScore 3.05
 G_M61030_IG10:
-       lea edx, [r8+0x01]
-       movsxd r10, edx
+       lea r8d, [rdx+0x01]
+       movsxd r10, r8d
        mov r10d, dword ptr [rax+4*r10]
        jmp SHORT G_M61030_IG12
-       ;; size=13 bbWeight=8.94 PerfScore 42.45
+       ;; size=13 bbWeight=8.89 PerfScore 42.24
 G_M61030_IG11:
-       lea r9d, [r8+0x01]
+       lea r9d, [rdx+0x01]
        movsxd r9, r9d
        lea r9, bword ptr [rax+4*r9]
-       movsxd r11, r8d
+       movsxd r11, edx
        mov r11d, dword ptr [rax+4*r11]
        mov dword ptr [r9], r11d
-       dec r8d
-       ;; size=24 bbWeight=25.34 PerfScore 120.37
+       dec edx
+       ;; size=23 bbWeight=24.73 PerfScore 117.46
 G_M61030_IG12:
-       test r8d, r8d
+       test edx, edx
        jl SHORT G_M61030_IG15
-       ;; size=5 bbWeight=34.28 PerfScore 42.85
+       ;; size=4 bbWeight=34.21 PerfScore 42.76
 G_M61030_IG13:
-       movsxd r9, r8d
+       movsxd r9, edx
        mov r9d, dword ptr [rax+4*r9]
        cmp r10d, r9d
        jl SHORT G_M61030_IG11
-       ;; size=12 bbWeight=32.52 PerfScore 113.81
+       ;; size=12 bbWeight=32.41 PerfScore 113.44
 G_M61030_IG14:
        cmp r10d, r9d
-       ;; size=3 bbWeight=13.00 PerfScore 3.25
+       ;; size=3 bbWeight=13.07 PerfScore 3.27
 G_M61030_IG15:
-       inc r8d
-       movsxd r8, r8d
-       mov dword ptr [rax+4*r8], r10d
-       mov r8d, edx
-       cmp r8d, ecx
+       inc edx
+       movsxd rdx, edx
+       mov dword ptr [rax+4*rdx], r10d
+       mov edx, r8d
+       cmp edx, ecx
        jl SHORT G_M61030_IG10
-       ;; size=18 bbWeight=12.99 PerfScore 38.97
+       ;; size=16 bbWeight=13.06 PerfScore 39.18
 G_M61030_IG16:
        jmp SHORT G_M61030_IG20
        ;; size=2 bbWeight=0.79 PerfScore 1.59
@@ -907,21 +914,19 @@ G_M61030_IG17:
        lea rcx, bword ptr [rsi+0x04]
        cmp byte ptr [rsi], sil
        mov eax, dword ptr [rcx]
-       mov rdx, rsi
-       mov edx, dword ptr [rdx]
-       cmp edx, eax
+       cmp dword ptr [rsi], eax
        jl SHORT G_M61030_IG20
-       ;; size=18 bbWeight=0.02 PerfScore 0.19
+       ;; size=13 bbWeight=0.02 PerfScore 0.21
 G_M61030_IG18:
-       cmp edx, eax
+       cmp dword ptr [rsi], eax
        jle SHORT G_M61030_IG20
-       ;; size=4 bbWeight=0.01 PerfScore 0.01
+       ;; size=4 bbWeight=0.01 PerfScore 0.03
 G_M61030_IG19:
-       mov eax, edx
+       mov eax, dword ptr [rsi]
        mov edx, dword ptr [rcx]
        mov dword ptr [rsi], edx
        mov dword ptr [rcx], eax
-       ;; size=8 bbWeight=0.02 PerfScore 0.06
+       ;; size=8 bbWeight=0.02 PerfScore 0.09
 G_M61030_IG20:
        add rsp, 48
        pop rbx
@@ -935,61 +940,61 @@ G_M61030_IG21:
        cmp edi, 2
        jbe G_M61030_IG33
        lea rax, bword ptr [rsi+0x08]
-       lea r8, bword ptr [rsi+0x04]
-       mov r10, rsi
-       cmp byte ptr [r10], r10b
-       mov ecx, dword ptr [r8]
-       mov rdx, r10
-       mov edx, dword ptr [rdx]
-       cmp edx, ecx
+       lea rdx, bword ptr [rsi+0x04]
+       mov r8, rsi
+       cmp byte ptr [r8], r8b
+       mov ecx, dword ptr [rdx]
+       mov r10, r8
+       cmp dword ptr [r10], ecx
        jl SHORT G_M61030_IG23
-       ;; size=35 bbWeight=0.09 PerfScore 0.95
+       ;; size=33 bbWeight=0.09 PerfScore 1.01
 G_M61030_IG22:
-       cmp edx, ecx
+       cmp dword ptr [r8], ecx
        jg SHORT G_M61030_IG26
-       ;; size=4 bbWeight=0.03 PerfScore 0.04
+       ;; size=5 bbWeight=0.03 PerfScore 0.14
 G_M61030_IG23:
        mov ecx, dword ptr [rax]
-       mov rdx, r10
-       mov edx, dword ptr [rdx]
-       cmp edx, ecx
+       mov r10, r8
+       mov r10d, dword ptr [r10]
+       cmp r10d, ecx
        jl SHORT G_M61030_IG28
-       ;; size=11 bbWeight=0.09 PerfScore 0.48
+       ;; size=13 bbWeight=0.09 PerfScore 0.47
 G_M61030_IG24:
-       cmp edx, ecx
+       cmp r10d, ecx
        jle SHORT G_M61030_IG28
-       ;; size=4 bbWeight=0.03 PerfScore 0.04
+       ;; size=5 bbWeight=0.03 PerfScore 0.04
 G_M61030_IG25:
        jmp SHORT G_M61030_IG27
        ;; size=2 bbWeight=0.03 PerfScore 0.07
 G_M61030_IG26:
        mov ecx, dword ptr [r8]
-       mov dword ptr [r10], ecx
-       mov dword ptr [r8], edx
+       mov r10d, dword ptr [rdx]
+       mov dword ptr [r8], r10d
+       mov dword ptr [rdx], ecx
        jmp SHORT G_M61030_IG23
-       ;; size=11 bbWeight=0.04 PerfScore 0.24
+       ;; size=13 bbWeight=0.04 PerfScore 0.32
 G_M61030_IG27:
        mov ecx, dword ptr [rax]
-       mov dword ptr [r10], ecx
-       mov dword ptr [rax], edx
-       ;; size=7 bbWeight=0.04 PerfScore 0.16
+       mov dword ptr [r8], ecx
+       mov dword ptr [rax], r10d
+       ;; size=8 bbWeight=0.04 PerfScore 0.16
 G_M61030_IG28:
-       mov edx, dword ptr [rax]
-       mov r10, r8
-       mov ecx, dword ptr [r10]
-       cmp ecx, edx
+       mov r10d, dword ptr [rax]
+       mov r8, rdx
+       mov ecx, dword ptr [r8]
+       cmp ecx, r10d
        jl SHORT G_M61030_IG20
-       ;; size=12 bbWeight=0.09 PerfScore 0.48
+       ;; size=14 bbWeight=0.09 PerfScore 0.47
 G_M61030_IG29:
-       cmp ecx, edx
+       cmp ecx, r10d
        jle SHORT G_M61030_IG20
-       ;; size=4 bbWeight=0.03 PerfScore 0.04
+       ;; size=5 bbWeight=0.03 PerfScore 0.04
 G_M61030_IG30:
-       mov edx, dword ptr [rax]
-       mov dword ptr [r8], edx
+       mov r8d, dword ptr [rax]
+       mov dword ptr [rdx], r8d
        mov dword ptr [rax], ecx
        jmp SHORT G_M61030_IG20
-       ;; size=9 bbWeight=0.06 PerfScore 0.36
+       ;; size=10 bbWeight=0.06 PerfScore 0.36
 G_M61030_IG31:
        cmp ebp, edi
        ja SHORT G_M61030_IG32
@@ -1010,6 +1015,6 @@ G_M61030_IG33:
        int3
        ;; size=6 bbWeight=0 PerfScore 0.00

-; Total bytes of code 445, prolog size 19, PerfScore 462.03, instruction count 154, allocated bytes for code 445 (MethodHash=a0751199) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:IntroSort(System.Span`1[System.Collections.IntStruct],int) (Tier1)
+; Total bytes of code 444, prolog size 19, PerfScore 458.80, instruction count 152, allocated bytes for code 444 (MethodHash=a0751199) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:IntroSort(System.Span`1[System.Collections.IntStruct],int) (Tier1)
 ; ============================================================
```
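
The `Total bytes of code` summary lines of the `IntroSort` diff above give baseline and compare totals for code size, PerfScore, and instruction count. A minimal sketch of the relative change between them follows, assuming only those three pairs of numbers; the helper name `relative_change` is illustrative and is not part of any dotnet tooling.

```python
def relative_change(baseline: float, compare: float) -> float:
    """Return the change from baseline to compare as a percentage of baseline."""
    return (compare - baseline) / baseline * 100.0


# (baseline, compare) totals copied from the two summary lines of the IntroSort diff.
introsort_totals = {
    "code bytes": (445, 444),
    "PerfScore": (462.03, 458.80),
    "instruction count": (154, 152),
}

for name, (baseline, compare) in introsort_totals.items():
    change = relative_change(baseline, compare)
    print(f"{name}: {baseline} -> {compare} ({change:+.2f}%)")
```

The same helper can be applied to any other baseline and compare pair that appears in the listings, for example the per-block `PerfScore` values.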

System.Collections.Sort<IntStruct>.List(Size: 512)

Hot functions:

Diffs

### ``[System.Private.CoreLib]System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct].PickPivotAndPartition(value class System.Span`1)``

```diff
 ; optimized using Dynamic PGO
 ; rsp based frame
 ; fully interruptible
-; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 131872
+; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 156096
 ; 8 inlinees with PGO data; 12 single block inlinees; 5 inlinees without PGO data
 ; Final local variable assignments
 ;
 ; V00 arg0 [V00,T16] ( 4, 8 ) byref -> rcx ld-addr-op single-def
-; V01 loc0 [V01,T13] ( 17, 49.30) byref -> rdx single-def
-; V02 loc1 [V02,T19] ( 7, 4.92) byref -> r8 single-def
-; V03 loc2 [V03,T17] ( 10, 7.35) byref -> r10 single-def
-; V04 loc3 [V04,T14] ( 9, 41.07) byref -> rcx single-def
+; V01 loc0 [V01,T13] ( 17, 48.99) byref -> rdx single-def
+; V02 loc1 [V02,T19] ( 7, 4.82) byref -> r8 single-def
+; V03 loc2 [V03,T17] ( 10, 7.21) byref -> r10 single-def
+; V04 loc3 [V04,T14] ( 9, 40.71) byref -> rcx single-def
 ;* V05 loc4 [V05 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op
-; V06 loc5 [V06,T01] ( 12,181.14) byref -> rax
-; V07 loc6 [V07,T00] ( 8,191.16) byref -> r10
+; V06 loc5 [V06,T01] ( 12,179.55) byref -> rax
+; V07 loc6 [V07,T00] ( 8,191.13) byref -> r10
 ;# V08 OutArgs [V08 ] ( 1, 1 ) struct ( 0) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-; V09 tmp1 [V09,T04] ( 2,145.98) byref -> r10 "dup spill"
-; V10 tmp2 [V10,T05] ( 2,131.98) byref -> rax "dup spill"
+; V09 tmp1 [V09,T04] ( 2,145.55) byref -> r10 "dup spill"
+; V10 tmp2 [V10,T05] ( 2,129.68) byref -> rax "dup spill"
 ;* V11 tmp3 [V11 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg"
-;* V12 tmp4 [V12,T29] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
-;* V13 tmp5 [V13 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
-;* V14 tmp6 [V14,T30] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V15 tmp7 [V15,T23] ( 2, 4.06) byref -> r9 single-def "Inlining Arg"
-; V16 tmp8 [V16,T20] ( 3, 4.86) int -> rax "Inlining Arg"
-;* V17 tmp9 [V17 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-;* V18 tmp10 [V18,T36] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
-;* V19 tmp11 [V19 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
-;* V20 tmp12 [V20,T37] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V21 tmp13 [V21,T24] ( 2, 4.06) byref -> r9 single-def "Inlining Arg"
-; V22 tmp14 [V22,T21] ( 3, 4.86) int -> rax "Inlining Arg"
-;* V23 tmp15 [V23 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-;* V24 tmp16 [V24,T31] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
-;* V25 tmp17 [V25 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
-;* V26 tmp18 [V26,T32] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V27 tmp19 [V27,T25] ( 2, 4.06) byref -> r9 single-def "Inlining Arg"
-; V28 tmp20 [V28,T22] ( 3, 4.86) int -> rax "Inlining Arg"
-;* V29 tmp21 [V29 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-;* V30 tmp22 [V30 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-;* V31 tmp23 [V31,T11] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
-;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
-;* V33 tmp25 [V33,T12] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V34 tmp26 [V34,T03] ( 3,157.82) int -> r8 "Inlining Arg"
-;* V35 tmp27 [V35,T08] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
-;* V36 tmp28 [V36 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
-;* V37 tmp29 [V37,T09] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
-; V38 tmp30 [V38,T02] ( 3,174.57) int -> r8 "Inlining Arg"
-;* V39 tmp31 [V39 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-;* V40 tmp32 [V40 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
-; V41 tmp33 [V41,T38] ( 2, 2.02) byref -> rdx single-def "field V00._reference (fldOffset=0x0)" P-INDEP
-; V42 tmp34 [V42,T27] ( 3, 3.03) int -> rcx "field V00._length (fldOffset=0x8)" P-INDEP
-; V43 tmp35 [V43,T06] ( 5, 97.72) int -> r9 "field V05._value (fldOffset=0x0)" P-INDEP
-;* V44 tmp36 [V44 ] ( 0, 0 ) byref -> zero-ref single-def "field V11._reference (fldOffset=0x0)" P-INDEP
-;* V45 tmp37 [V45 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP
-; V46 tmp38 [V46,T33] ( 2, 2.03) int -> rax "field V13._value (fldOffset=0x0)" P-INDEP
-; V47 tmp39 [V47,T41] ( 2, 0.93) int -> rax "field V17._value (fldOffset=0x0)" P-INDEP
-; V48 tmp40 [V48,T34] ( 2, 2.03) int -> rax "field V19._value (fldOffset=0x0)" P-INDEP
-; V49 tmp41 [V49,T42] ( 2, 0.93) int -> rax "field V23._value (fldOffset=0x0)" P-INDEP
-; V50 tmp42 [V50,T35] ( 2, 2.03) int -> rax "field V25._value (fldOffset=0x0)" P-INDEP
-; V51 tmp43 [V51,T43] ( 2, 0.93) int -> rax "field V29._value (fldOffset=0x0)" P-INDEP
-; V52 tmp44 [V52,T39] ( 2, 2.03) int -> rax "field V30._value (fldOffset=0x0)" P-INDEP
-; V53 tmp45 [V53,T10] ( 2, 65.99) int -> r8 "field V32._value (fldOffset=0x0)" P-INDEP
-; V54 tmp46 [V54,T07] ( 2, 72.99) int -> r8 "field V36._value (fldOffset=0x0)" P-INDEP
-; V55 tmp47 [V55,T15] ( 2, 29.44) int -> r8 "field V39._value (fldOffset=0x0)" P-INDEP
-; V56 tmp48 [V56,T40] ( 2, 1.96) int -> r10 "field V40._value (fldOffset=0x0)" P-INDEP
-;* V57 tmp49 [V57 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref"
-; V58 cse0 [V58,T26] ( 3, 3.05) int -> rax "CSE - moderate"
-; V59 cse1 [V59,T28] ( 3, 3.05) int -> rax "CSE - moderate"
-; V60 rat0 [V60,T18] ( 3, 6 ) long -> rax "ReplaceWithLclVar is creating a new local variable"
+;* V12 tmp4 [V12 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects"
+;* V13 tmp5 [V13,T28] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
+;* V14 tmp6 [V14 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
+;* V15 tmp7 [V15,T29] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
+;* V16 tmp8 [V16,T37] ( 0, 0 ) byref -> zero-ref single-def "Inlining Arg"
+; V17 tmp9 [V17,T20] ( 3, 4.78) int -> rax "Inlining Arg"
+;* V18 tmp10 [V18 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+;* V19 tmp11 [V19 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects"
+;* V20 tmp12 [V20,T38] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
+;* V21 tmp13 [V21 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
+;* V22 tmp14 [V22,T39] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
+; V23 tmp15 [V23,T23] ( 2, 4 ) byref -> r9 single-def "Inlining Arg"
+; V24 tmp16 [V24,T21] ( 3, 4.78) int -> rax "Inlining Arg"
+;* V25 tmp17 [V25 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+;* V26 tmp18 [V26 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects"
+;* V27 tmp19 [V27,T30] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
+;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
+;* V29 tmp21 [V29,T31] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
+; V30 tmp22 [V30,T24] ( 2, 4 ) byref -> r9 single-def "Inlining Arg"
+; V31 tmp23 [V31,T22] ( 3, 4.78) int -> rax "Inlining Arg"
+;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+;* V33 tmp25 [V33 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+;* V34 tmp26 [V34,T11] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
+;* V35 tmp27 [V35 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
+;* V36 tmp28 [V36,T12] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
+; V37 tmp29 [V37,T03] ( 3,155.23) int -> r8 "Inlining Arg"
+;* V38 tmp30 [V38,T08] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
+;* V39 tmp31 [V39 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg"
+;* V40 tmp32 [V40,T09] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
+; V41 tmp33 [V41,T02] ( 3,174.23) int -> r8 "Inlining Arg"
+;* V42 tmp34 [V42 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+;* V43 tmp35 [V43 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp"
+; V44 tmp36 [V44,T32] ( 2, 2 ) byref -> rdx single-def "field V00._reference (fldOffset=0x0)" P-INDEP
+; V45 tmp37 [V45,T25] ( 3, 3 ) int -> rcx "field V00._length (fldOffset=0x8)" P-INDEP
+; V46 tmp38 [V46,T06] ( 5, 96.92) int -> r9 "field V05._value (fldOffset=0x0)" P-INDEP
+;* V47 tmp39 [V47 ] ( 0, 0 ) byref -> zero-ref single-def "field V11._reference (fldOffset=0x0)" P-INDEP
+;* V48 tmp40 [V48 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP
+;* V49 tmp41 [V49 ] ( 0, 0 ) int -> zero-ref "field V12._value (fldOffset=0x0)" P-INDEP
+; V50 tmp42 [V50,T33] ( 2, 2 ) int -> rax "field V14._value (fldOffset=0x0)" P-INDEP
+; V51 tmp43 [V51,T41] ( 2, 0.91) int -> rax "field V18._value (fldOffset=0x0)" P-INDEP
+;* V52 tmp44 [V52 ] ( 0, 0 ) int -> zero-ref "field V19._value (fldOffset=0x0)" P-INDEP
+; V53 tmp45 [V53,T34] ( 2, 2 ) int -> rax "field V21._value (fldOffset=0x0)" P-INDEP
+; V54 tmp46 [V54,T42] ( 2, 0.91) int -> rax "field V25._value (fldOffset=0x0)" P-INDEP
+;* V55 tmp47 [V55 ] ( 0, 0 ) int -> zero-ref "field V26._value (fldOffset=0x0)" P-INDEP
+; V56 tmp48 [V56,T35] ( 2, 2 ) int -> rax "field V28._value (fldOffset=0x0)" P-INDEP
+; V57 tmp49 [V57,T43] ( 2, 0.91) int -> rax "field V32._value (fldOffset=0x0)" P-INDEP
+; V58 tmp50 [V58,T36] ( 2, 2 ) int -> rax "field V33._value (fldOffset=0x0)" P-INDEP
+; V59 tmp51 [V59,T10] ( 2, 64.84) int -> r8 "field V35._value (fldOffset=0x0)" P-INDEP
+; V60 tmp52 [V60,T07] ( 2, 72.78) int -> r8 "field V39._value (fldOffset=0x0)" P-INDEP
+; V61 tmp53 [V61,T15] ( 2, 29.73) int -> r8 "field V42._value (fldOffset=0x0)" P-INDEP
+; V62 tmp54 [V62,T40] ( 2, 1.94) int -> r10 "field V43._value (fldOffset=0x0)" P-INDEP
+;* V63 tmp55 [V63 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref"
+; V64 cse0 [V64,T26] ( 3, 3 ) int -> rax "CSE - moderate"
+; V65 cse1 [V65,T27] ( 3, 3 ) int -> rax "CSE - moderate"
+; V66 rat0 [V66,T18] ( 3, 6 ) long -> rax "ReplaceWithLclVar is creating a new local variable"
 ;
 ; Lcl frame size = 0
@@ -562,34 +568,33 @@ G_M50248_IG02:
        lea r10, bword ptr [rdx+4*rax]
        cmp byte ptr [rdx], dl
        mov eax, dword ptr [r10]
-       mov r9, rdx
-       cmp dword ptr [r9], eax
+       cmp dword ptr [rdx], eax
        jl SHORT G_M50248_IG04
-       ;; size=37 bbWeight=1 PerfScore 15.75
+       ;; size=33 bbWeight=1 PerfScore 15.50
 G_M50248_IG03:
        cmp dword ptr [rdx], eax
        jg SHORT G_M50248_IG07
-       ;; size=4 bbWeight=0.40 PerfScore 1.59
+       ;; size=4 bbWeight=0.39 PerfScore 1.56
 G_M50248_IG04:
        mov eax, dword ptr [r8]
        mov r9, rdx
        cmp dword ptr [r9], eax
        jl SHORT G_M50248_IG11
-       ;; size=11 bbWeight=1.02 PerfScore 6.35
+       ;; size=11 bbWeight=1 PerfScore 6.25
 G_M50248_IG05:
        cmp dword ptr [rdx], eax
        jle SHORT G_M50248_IG11
-       ;; size=4 bbWeight=0.40 PerfScore 1.59
+       ;; size=4 bbWeight=0.39 PerfScore 1.56
 G_M50248_IG06:
        jmp SHORT G_M50248_IG18
-       ;; size=2 bbWeight=0.40 PerfScore 0.80
+       ;; size=2 bbWeight=0.39 PerfScore 0.78
 G_M50248_IG07:
        mov eax, dword ptr [rdx]
        mov r9d, dword ptr [r10]
        mov dword ptr [rdx], r9d
        mov dword ptr [r10], eax
        jmp SHORT G_M50248_IG04
-       ;; size=13 bbWeight=0.47 PerfScore 3.73
+       ;; size=13 bbWeight=0.46 PerfScore 3.65
 G_M50248_IG08:
        cmp r10, rdx
        jbe SHORT G_M50248_IG20
@@ -597,24 +602,24 @@ G_M50248_IG08:
        mov r8d, dword ptr [r10]
        cmp r9d, r8d
        jl SHORT G_M50248_IG08
-       ;; size=17 bbWeight=36.50 PerfScore 173.36
+       ;; size=17 bbWeight=36.39 PerfScore 172.85
 G_M50248_IG09:
        cmp r9d, r8d
        jle SHORT G_M50248_IG20
-       ;; size=5 bbWeight=14.29 PerfScore 17.87
+       ;; size=5 bbWeight=14.34 PerfScore 17.92
 G_M50248_IG10:
        jmp SHORT G_M50248_IG20
-       ;; size=2 bbWeight=14.29 PerfScore 28.57
+       ;; size=2 bbWeight=14.33 PerfScore 28.66
 G_M50248_IG11:
        mov eax, dword ptr [r8]
        mov r9, r10
        cmp dword ptr [r9], eax
        jl SHORT G_M50248_IG13
-       ;; size=11 bbWeight=1.02 PerfScore 6.35
+       ;; size=11 bbWeight=1 PerfScore 6.25
 G_M50248_IG12:
        cmp dword ptr [r10], eax
        jg SHORT G_M50248_IG19
-       ;; size=5 bbWeight=0.40 PerfScore 1.59
+       ;; size=5 bbWeight=0.39 PerfScore 1.56
 G_M50248_IG13:
        add ecx, -2
        movsxd rax, ecx
@@ -628,49 +633,49 @@ G_M50248_IG13:
        mov r10, rcx
        cmp rdx, rcx
        jae SHORT G_M50248_IG22
-       ;; size=35 bbWeight=1.02 PerfScore 9.14
+       ;; size=35 bbWeight=1 PerfScore 9.00
 G_M50248_IG14:
        cmp rax, rcx
        jae SHORT G_M50248_IG08
-       ;; size=5 bbWeight=33.03 PerfScore 41.29
+       ;; size=5 bbWeight=32.77 PerfScore 40.97
 G_M50248_IG15:
        add rax, 4
        mov r8d, dword ptr [rax]
        cmp r9d, r8d
        jl SHORT G_M50248_IG08
-       ;; size=12 bbWeight=32.99 PerfScore 115.48
+       ;; size=12 bbWeight=32.42 PerfScore 113.47
 G_M50248_IG16:
        cmp r9d, r8d
        jle SHORT G_M50248_IG08
-       ;; size=5 bbWeight=12.92 PerfScore 16.15
+       ;; size=5 bbWeight=12.77 PerfScore 15.97
 G_M50248_IG17:
        jmp SHORT G_M50248_IG14
-       ;; size=2 bbWeight=12.92 PerfScore 25.83
+       ;; size=2 bbWeight=12.77 PerfScore 25.54
 G_M50248_IG18:
        mov eax, dword ptr [rdx]
        mov r9d, dword ptr [r8]
        mov dword ptr [rdx], r9d
        mov dword ptr [r8], eax
        jmp SHORT G_M50248_IG11
-       ;; size=13 bbWeight=0.47 PerfScore 3.73
+       ;; size=13 bbWeight=0.46 PerfScore 3.65
 G_M50248_IG19:
        mov eax, dword ptr [r10]
        mov r9d, dword ptr [r8]
        mov dword ptr [r10], r9d
        mov dword ptr [r8], eax
        jmp SHORT G_M50248_IG13
-       ;; size=14 bbWeight=0.47 PerfScore 3.73
+       ;; size=14 bbWeight=0.46 PerfScore 3.65
 G_M50248_IG20:
        cmp rax, r10
        jae SHORT G_M50248_IG22
-       ;; size=5 bbWeight=14.72 PerfScore 18.40
+       ;; size=5 bbWeight=14.85 PerfScore 18.56
 G_M50248_IG21:
        mov r8d, dword ptr [rax]
        mov r11d, dword ptr [r10]
        mov dword ptr [rax], r11d
        mov dword ptr [r10], r8d
        jmp SHORT G_M50248_IG14
-       ;; size=14 bbWeight=14.72 PerfScore 117.75
+       ;; size=14 bbWeight=14.86 PerfScore 118.92
 G_M50248_IG22:
        cmp rax, rcx
        je SHORT G_M50248_IG24
@@ -680,258 +685,19 @@ G_M50248_IG23:
        mov r9d, dword ptr [rcx]
        mov dword ptr [rax], r9d
        mov dword ptr [rcx], r10d
-       ;; size=12 bbWeight=0.98 PerfScore 5.87
+       ;; size=12 bbWeight=0.97 PerfScore 5.82
 G_M50248_IG24:
        sub rax, rdx
-       mov rcx, rax
-       sar rcx, 63
-       and rcx, 3
-       add rax, rcx
+       mov rdx, rax
+       sar rdx, 63
+       and rdx, 3
+       add rax, rdx
        sar rax, 2
        ;; size=21 bbWeight=1 PerfScore 2.00
 G_M50248_IG25:
        ret
        ;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 255, prolog size 0, PerfScore 644.69, instruction count 93, allocated bytes for code 255 (MethodHash=e6183bb7) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:PickPivotAndPartition(System.Span`1[System.Collections.IntStruct]):int (Tier1)
-; ============================================================
-
-; Assembly listing for method System.Collections.Generic.GenericArraySortHelper`1[BenchmarkDotNet.Reports.Measurement]:PickPivotAndPartition(System.Span`1[BenchmarkDotNet.Reports.Measurement]):int (Instrumented Tier0)
-; Emitting BLENDED_CODE for X64 with AVX - Windows
-; Instrumented Tier0 code
-; rbp based frame
-; fully interruptible
-; Final local variable assignments
-;
-; V00 arg0 [V00 ] ( 1, 1 ) byref -> [rbp+0x10] do-not-enreg[] ld-addr-op
-; V01 loc0 [V01 ] ( 1, 1 ) byref -> [rbp-0x40] do-not-enreg[] must-init
-; V02 loc1 [V02 ] ( 1, 1 ) byref -> [rbp-0x48] do-not-enreg[] must-init
-; V03 loc2 [V03 ] ( 1, 1 ) byref -> [rbp-0x50] do-not-enreg[] must-init
-; V04 loc3 [V04 ] ( 1, 1 ) byref -> [rbp-0x58] do-not-enreg[] must-init
-; V05 loc4 [V05 ] ( 1, 1 ) struct (32) [rbp-0x78] do-not-enreg[XS] addr-exposed ld-addr-op
-; V06 loc5 [V06 ] ( 1, 1 ) byref -> [rbp-0x80] do-not-enreg[] must-init
-; V07 loc6 [V07 ] ( 1, 1 ) byref -> [rbp-0x88] do-not-enreg[] must-init
-; V08 OutArgs [V08 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-; V09 tmp1 [V09 ] ( 1, 1 ) byref -> [rbp-0x90] do-not-enreg[] must-init "DUP instruction"
-; V10 tmp2 [V10 ] ( 1, 1 ) byref -> [rbp-0x98] do-not-enreg[] must-init "DUP instruction"
-; V11 tmp3 [V11 ] ( 1, 1 ) int -> [rbp-0xA0] do-not-enreg[X] addr-exposed "patchpoint counter"
-; V12 tmp4 [V12 ] ( 1, 1 ) struct (16) [rbp-0xB0] do-not-enreg[XS] must-init addr-exposed "by-value struct argument"
-; V13 tmp5 [V13 ] ( 1, 1 ) long -> [rbp-0xB8] "ReplaceWithLclVar is creating a new local variable"
-;
-; Lcl
frame size = 224 - -G_M40648_IG01: - push rbp - sub rsp, 224 - vzeroupper - lea rbp, [rsp+0xE0] - vxorps xmm4, xmm4, xmm4 - vmovdqa xmmword ptr [rbp-0xB0], xmm4 - mov rax, -96 - vmovdqa xmmword ptr [rbp+rax-0x40], xmm4 - vmovdqa xmmword ptr [rbp+rax-0x30], xmm4 - vmovdqa xmmword ptr [rbp+rax-0x20], xmm4 - add rax, 48 - jne SHORT -5 instr - mov qword ptr [rbp-0x40], rax - mov bword ptr [rbp+0x10], rcx - ;; size=73 bbWeight=1 PerfScore 14.58 -G_M40648_IG02: - mov rcx, bword ptr [rbp+0x10] - ;; size=4 bbWeight=1 PerfScore 1.00 -G_M40648_IG03: - vmovdqu xmm0, xmmword ptr [rcx] - vmovdqu xmmword ptr [rbp-0xB0], xmm0 - ;; size=12 bbWeight=1 PerfScore 5.00 -G_M40648_IG04: - lea rcx, [rbp-0xB0] - call [] - mov bword ptr [rbp-0x40], rax - mov rcx, bword ptr [rbp+0x10] - mov ecx, dword ptr [rcx+0x08] - dec ecx - movsxd rcx, ecx - shl rcx, 5 - add rcx, bword ptr [rbp-0x40] - mov bword ptr [rbp-0x48], rcx - mov rcx, bword ptr [rbp+0x10] - mov ecx, dword ptr [rcx+0x08] - dec ecx - sar ecx, 1 - movsxd rcx, ecx - shl rcx, 5 - add rcx, bword ptr [rbp-0x40] - mov bword ptr [rbp-0x50], rcx - mov rcx, bword ptr [rbp-0x40] - mov rdx, bword ptr [rbp-0x50] - call [] - mov rcx, bword ptr [rbp-0x40] - mov rdx, bword ptr [rbp-0x48] - call [] - mov rcx, bword ptr [rbp-0x50] - mov rdx, bword ptr [rbp-0x48] - call [] - mov rcx, bword ptr [rbp+0x10] - mov ecx, dword ptr [rcx+0x08] - add ecx, -2 - movsxd rcx, ecx - shl rcx, 5 - add rcx, bword ptr [rbp-0x40] - mov bword ptr [rbp-0x58], rcx - mov rcx, bword ptr [rbp-0x50] - vmovdqu ymm0, ymmword ptr [rcx] - vmovdqu ymmword ptr [rbp-0x78], ymm0 - mov rcx, bword ptr [rbp-0x50] - mov rdx, bword ptr [rbp-0x58] - call [] - mov rcx, bword ptr [rbp-0x40] - mov bword ptr [rbp-0x80], rcx - mov rcx, bword ptr [rbp-0x58] - mov bword ptr [rbp-0x88], rcx - mov dword ptr [rbp-0xA0], 0x3E8 - jmp G_M40648_IG15 - ;; size=195 bbWeight=1 PerfScore 60.00 -G_M40648_IG05: - ;; size=0 bbWeight=1 PerfScore 0.00 -G_M40648_IG06: - mov ecx, dword ptr [rbp-0xA0] - dec ecx - 
mov dword ptr [rbp-0xA0], ecx - cmp dword ptr [rbp-0xA0], 0 - jg SHORT G_M40648_IG08 - ;; size=23 bbWeight=1 PerfScore 5.25 -G_M40648_IG07: - lea rcx, [rbp-0xA0] - mov edx, 181 - call CORINFO_HELP_PATCHPOINT - ;; size=17 bbWeight=0.01 PerfScore 0.02 -G_M40648_IG08: - mov rcx, bword ptr [rbp-0x80] - cmp rcx, bword ptr [rbp-0x58] - jae SHORT G_M40648_IG10 - mov rcx, bword ptr [rbp-0x80] - add rcx, 32 - mov bword ptr [rbp-0x98], rcx - mov rcx, bword ptr [rbp-0x98] - mov bword ptr [rbp-0x80], rcx - lea rcx, [rbp-0x78] - mov rdx, bword ptr [rbp-0x98] - call [] - test eax, eax - jne G_M40648_IG24 - ;; size=61 bbWeight=1 PerfScore 14.00 -G_M40648_IG09: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - ;; size=15 bbWeight=0.50 PerfScore 0.62 -G_M40648_IG10: - mov ecx, dword ptr [rbp-0xA0] - dec ecx - mov dword ptr [rbp-0xA0], ecx - cmp dword ptr [rbp-0xA0], 0 - jg SHORT G_M40648_IG12 - ;; size=23 bbWeight=1 PerfScore 5.25 -G_M40648_IG11: - lea rcx, [rbp-0xA0] - mov edx, 211 - call CORINFO_HELP_PATCHPOINT - ;; size=17 bbWeight=0.01 PerfScore 0.02 -G_M40648_IG12: - mov rcx, bword ptr [rbp-0x88] - cmp rcx, bword ptr [rbp-0x40] - jbe G_M40648_IG23 - mov ecx, 32 - neg rcx - add rcx, bword ptr [rbp-0x88] - mov bword ptr [rbp-0x90], rcx - mov rcx, bword ptr [rbp-0x90] - mov bword ptr [rbp-0x88], rcx - lea rcx, [rbp-0x78] - mov rdx, bword ptr [rbp-0x90] - call [] - test eax, eax - jne G_M40648_IG22 - ;; size=78 bbWeight=1 PerfScore 15.25 -G_M40648_IG13: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - ;; size=15 bbWeight=0.50 PerfScore 0.62 -G_M40648_IG14: - mov rcx, bword ptr [rbp-0x80] - cmp rcx, bword ptr [rbp-0x88] - jae G_M40648_IG21 - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - mov rcx, bword ptr [rbp-0x80] - mov rdx, bword ptr [rbp-0x88] - call [] - ;; size=49 bbWeight=1 PerfScore 10.25 -G_M40648_IG15: - mov ecx, dword ptr [rbp-0xA0] - dec ecx - mov dword ptr [rbp-0xA0], ecx - cmp dword ptr [rbp-0xA0], 0 - jg SHORT G_M40648_IG17 - ;; size=23 
bbWeight=1 PerfScore 5.25 -G_M40648_IG16: - lea rcx, [rbp-0xA0] - mov edx, 261 - call CORINFO_HELP_PATCHPOINT - ;; size=17 bbWeight=0.01 PerfScore 0.02 -G_M40648_IG17: - mov rcx, bword ptr [rbp-0x80] - cmp rcx, bword ptr [rbp-0x88] - jb G_M40648_IG05 - ;; size=17 bbWeight=1 PerfScore 4.00 -G_M40648_IG18: - mov rcx, bword ptr [rbp-0x80] - cmp rcx, bword ptr [rbp-0x58] - je SHORT G_M40648_IG19 - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - mov rcx, bword ptr [rbp-0x80] - mov rdx, bword ptr [rbp-0x58] - call [] - ;; size=39 bbWeight=1 PerfScore 10.25 -G_M40648_IG19: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - mov rax, bword ptr [rbp-0x80] - sub rax, qword ptr [rbp-0x40] - mov qword ptr [rbp-0xB8], rax - mov rax, qword ptr [rbp-0xB8] - sar rax, 63 - and rax, 31 - add rax, qword ptr [rbp-0xB8] - sar rax, 5 - ;; size=56 bbWeight=1 PerfScore 9.50 -G_M40648_IG20: - vzeroupper - add rsp, 224 - pop rbp - ret - ;; size=12 bbWeight=1 PerfScore 2.75 -G_M40648_IG21: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - jmp SHORT G_M40648_IG18 - ;; size=17 bbWeight=0.50 PerfScore 1.62 -G_M40648_IG22: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - jmp G_M40648_IG10 - ;; size=20 bbWeight=0.50 PerfScore 1.62 -G_M40648_IG23: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - jmp G_M40648_IG14 - ;; size=20 bbWeight=0.50 PerfScore 1.62 -G_M40648_IG24: - mov rcx, 0xD1FFAB1E - call CORINFO_HELP_COUNTPROFILE32 - jmp G_M40648_IG06 - ;; size=20 bbWeight=0.50 PerfScore 1.62 - -; Total bytes of code 823, prolog size 73, PerfScore 252.44, instruction count 164, allocated bytes for code 823 (MethodHash=f0416137) for method System.Collections.Generic.GenericArraySortHelper`1[BenchmarkDotNet.Reports.Measurement]:PickPivotAndPartition(System.Span`1[BenchmarkDotNet.Reports.Measurement]):int (Instrumented Tier0) +; Total bytes of code 251, prolog size 0, PerfScore 641.43, instruction count 92, allocated bytes for code 251 (MethodHash=e6183bb7) for 
method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:PickPivotAndPartition(System.Span`1[System.Collections.IntStruct]):int (Tier1) ; ============================================================ ``` ### ``[System.Private.CoreLib]System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct].IntroSort(value class System.Span`1,int32)`` ```diff ; optimized using Dynamic PGO ; rsp based frame ; fully interruptible -; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 129624 +; with Dynamic PGO: edge weights are invalid, and fgCalledCount is 150820 ; 14 inlinees with PGO data; 18 single block inlinees; 5 inlinees without PGO data ; Final local variable assignments ; ; V00 arg0 [V00,T12] ( 4, 8 ) byref -> rcx ld-addr-op single-def -; V01 arg1 [V01,T13] ( 6, 6.88) int -> rbx -; V02 loc0 [V02,T11] ( 13, 13.48) int -> rbp -; V03 loc1 [V03,T19] ( 3, 3.91) int -> r14 -; V04 loc2 [V04,T27] ( 7, 0.46) byref -> rax single-def -; V05 loc3 [V05,T32] ( 6, 0.40) byref -> r8 single-def -; V06 loc4 [V06,T28] ( 6, 0.43) byref -> r10 single-def -; V07 loc5 [V07,T21] ( 3, 2.91) int -> rcx +; V01 arg1 [V01,T13] ( 6, 6.94) int -> rbx +; V02 loc0 [V02,T11] ( 13, 13.56) int -> rbp +; V03 loc1 [V03,T19] ( 3, 3.95) int -> r14 +; V04 loc2 [V04,T28] ( 7, 0.46) byref -> rax single-def +; V05 loc3 [V05,T32] ( 6, 0.40) byref -> rdx single-def +; V06 loc4 [V06,T27] ( 8, 0.50) byref -> r8 single-def +; V07 loc5 [V07,T21] ( 3, 2.95) int -> rcx ; V08 OutArgs [V08 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace" ;* V09 tmp1 [V09 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V10 tmp2 [V10 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V11 tmp3 [V11 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" ;* V12 tmp4 [V12 ] ( 0, 0 ) struct (16) zero-ref "spilled call-like call argument" -; V13 tmp5 [V13,T39] ( 5, 0.20) byref -> rsi single-def 
"Inlining Arg" -; V14 tmp6 [V14,T49] ( 4, 0.15) byref -> rcx single-def "Inlining Arg" -;* V15 tmp7 [V15,T55] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V16 tmp8 [V16 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V17 tmp9 [V17,T56] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V18 tmp10 [V18,T52] ( 2, 0.09) byref -> rdx single-def "Inlining Arg" -; V19 tmp11 [V19,T51] ( 3, 0.10) int -> rax "Inlining Arg" -;* V20 tmp12 [V20 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V21 tmp13 [V21,T42] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V22 tmp14 [V22 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V23 tmp15 [V23,T43] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V24 tmp16 [V24,T33] ( 2, 0.35) byref -> rdx single-def "Inlining Arg" -; V25 tmp17 [V25,T29] ( 3, 0.42) int -> rcx "Inlining Arg" -;* V26 tmp18 [V26 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V27 tmp19 [V27,T44] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V29 tmp21 [V29,T45] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V30 tmp22 [V30,T34] ( 2, 0.35) byref -> rdx single-def "Inlining Arg" -; V31 tmp23 [V31,T30] ( 3, 0.42) int -> rcx "Inlining Arg" -;* V32 tmp24 [V32 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V33 tmp25 [V33,T40] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V34 tmp26 [V34 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V35 tmp27 [V35,T41] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V36 tmp28 [V36,T35] ( 2, 0.35) byref -> r10 single-def "Inlining Arg" -; V37 tmp29 [V37,T31] ( 3, 0.42) int -> rdx "Inlining Arg" -;* V38 tmp30 [V38 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" -;* V39 tmp31 [V39 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V40 tmp32 [V40 ] ( 0, 0 ) byref -> zero-ref 
"Inlining Arg" -; V41 tmp33 [V41,T08] ( 5, 44.65) int -> r8 "Inline stloc first use temp" -;* V42 tmp34 [V42 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V43 tmp35 [V43 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline stloc first use temp" -; V44 tmp36 [V44,T00] ( 8,194.23) int -> r8 "Inline stloc first use temp" -; V45 tmp37 [V45,T03] ( 2,104.01) byref -> r9 "impAppendStmt" +; V13 tmp5 [V13,T36] ( 7, 0.25) byref -> rsi single-def "Inlining Arg" +;* V14 tmp6 [V14 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +; V15 tmp7 [V15,T48] ( 4, 0.15) byref -> rcx single-def "Inlining Arg" +;* V16 tmp8 [V16,T53] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V17 tmp9 [V17 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V18 tmp10 [V18,T54] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +;* V19 tmp11 [V19,T56] ( 0, 0 ) byref -> zero-ref single-def "Inlining Arg" +; V20 tmp12 [V20,T50] ( 3, 0.10) int -> rax "Inlining Arg" +;* V21 tmp13 [V21 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V22 tmp14 [V22 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V23 tmp15 [V23,T41] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V24 tmp16 [V24 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V25 tmp17 [V25,T42] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V26 tmp18 [V26,T33] ( 2, 0.35) byref -> r10 single-def "Inlining Arg" +; V27 tmp19 [V27,T29] ( 3, 0.41) int -> rcx "Inlining Arg" +;* V28 tmp20 [V28 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V29 tmp21 [V29 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V30 tmp22 [V30,T46] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V31 tmp23 [V31 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V32 tmp24 [V32,T47] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V33 tmp25 [V33,T34] ( 2, 0.35) byref -> r10 single-def "Inlining Arg" +; V34 tmp26 [V34,T30] ( 3, 0.41) int 
-> rcx "Inlining Arg" +;* V35 tmp27 [V35 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V36 tmp28 [V36 ] ( 0, 0 ) struct ( 8) zero-ref "spilling side-effects" +;* V37 tmp29 [V37,T39] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V38 tmp30 [V38 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V39 tmp31 [V39,T40] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V40 tmp32 [V40,T35] ( 2, 0.35) byref -> r8 single-def "Inlining Arg" +; V41 tmp33 [V41,T31] ( 3, 0.41) int -> r10 "Inlining Arg" +;* V42 tmp34 [V42 ] ( 0, 0 ) struct ( 8) zero-ref "Inline stloc first use temp" +;* V43 tmp35 [V43 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V44 tmp36 [V44 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V45 tmp37 [V45,T08] ( 5, 44.35) int -> rdx "Inline stloc first use temp" ;* V46 tmp38 [V46 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V47 tmp39 [V47 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V48 tmp40 [V48 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V49 tmp41 [V49 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V50 tmp42 [V50,T06] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" -;* V51 tmp43 [V51 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -;* V52 tmp44 [V52 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" -;* V53 tmp45 [V53,T07] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" -; V54 tmp46 [V54,T01] ( 3,158.42) int -> r9 "Inlining Arg" -;* V55 tmp47 [V55 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" -;* V56 tmp48 [V56 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V57 tmp49 [V57 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -;* V58 tmp50 [V58 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" -;* V59 tmp51 [V59 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -; V60 tmp52 [V60,T14] ( 3, 7.82) int -> rbp "Inlining Arg" -;* V61 tmp53 [V61 ] ( 0, 0 ) struct (16) zero-ref 
ld-addr-op "NewObj constructor temp" -;* V62 tmp54 [V62 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" -; V63 tmp55 [V63,T15] ( 9, 5.10) byref -> rsi single-def "field V00._reference (fldOffset=0x0)" P-INDEP -; V64 tmp56 [V64,T16] ( 8, 5.90) int -> rdi "field V00._length (fldOffset=0x8)" P-INDEP -;* V65 tmp57 [V65 ] ( 0, 0 ) byref -> zero-ref "field V09._reference (fldOffset=0x0)" P-INDEP -;* V66 tmp58 [V66 ] ( 0, 0 ) int -> zero-ref "field V09._length (fldOffset=0x8)" P-INDEP -;* V67 tmp59 [V67 ] ( 0, 0 ) byref -> zero-ref "field V10._reference (fldOffset=0x0)" P-INDEP -;* V68 tmp60 [V68 ] ( 0, 0 ) int -> zero-ref "field V10._length (fldOffset=0x8)" P-INDEP -;* V69 tmp61 [V69 ] ( 0, 0 ) byref -> zero-ref "field V11._reference (fldOffset=0x0)" P-INDEP -;* V70 tmp62 [V70 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP -;* V71 tmp63 [V71 ] ( 0, 0 ) byref -> zero-ref "field V12._reference (fldOffset=0x0)" P-INDEP -;* V72 tmp64 [V72 ] ( 0, 0 ) int -> zero-ref "field V12._length (fldOffset=0x8)" P-INDEP -; V73 tmp65 [V73,T57] ( 2, 0.04) int -> rax "field V16._value (fldOffset=0x0)" P-INDEP -; V74 tmp66 [V74,T58] ( 2, 0.03) int -> rax "field V20._value (fldOffset=0x0)" P-INDEP -; V75 tmp67 [V75,T46] ( 2, 0.18) int -> rcx "field V22._value (fldOffset=0x0)" P-INDEP -; V76 tmp68 [V76,T53] ( 2, 0.08) int -> rdx "field V26._value (fldOffset=0x0)" P-INDEP -; V77 tmp69 [V77,T47] ( 2, 0.18) int -> rcx "field V28._value (fldOffset=0x0)" P-INDEP -; V78 tmp70 [V78,T54] ( 2, 0.08) int -> rdx "field V32._value (fldOffset=0x0)" P-INDEP -; V79 tmp71 [V79,T48] ( 2, 0.18) int -> rdx "field V34._value (fldOffset=0x0)" P-INDEP -; V80 tmp72 [V80,T50] ( 2, 0.12) int -> rcx "field V38._value (fldOffset=0x0)" P-INDEP -; V81 tmp73 [V81,T23] ( 2, 1.74) byref -> rax single-def "field V39._reference (fldOffset=0x0)" P-INDEP -; V82 tmp74 [V82,T24] ( 2, 1.74) int -> rcx "field V39._length (fldOffset=0x8)" P-INDEP -; V83 tmp75 [V83,T02] ( 6,107.98) byref -> rax single-def "field 
V42._reference (fldOffset=0x0)" P-INDEP -; V84 tmp76 [V84,T25] ( 2, 1.74) int -> rcx "field V42._length (fldOffset=0x8)" P-INDEP -; V85 tmp77 [V85,T04] ( 4, 67.88) int -> r10 "field V43._value (fldOffset=0x0)" P-INDEP -;* V86 tmp78 [V86 ] ( 0, 0 ) byref -> zero-ref "field V46._reference (fldOffset=0x0)" P-INDEP -;* V87 tmp79 [V87 ] ( 0, 0 ) int -> zero-ref "field V46._length (fldOffset=0x8)" P-INDEP -;* V88 tmp80 [V88 ] ( 0, 0 ) byref -> zero-ref "field V47._reference (fldOffset=0x0)" P-INDEP -;* V89 tmp81 [V89 ] ( 0, 0 ) int -> zero-ref "field V47._length (fldOffset=0x8)" P-INDEP -;* V90 tmp82 [V90 ] ( 0, 0 ) byref -> zero-ref "field V48._reference (fldOffset=0x0)" P-INDEP -;* V91 tmp83 [V91 ] ( 0, 0 ) int -> zero-ref "field V48._length (fldOffset=0x8)" P-INDEP -;* V92 tmp84 [V92 ] ( 0, 0 ) byref -> zero-ref "field V49._reference (fldOffset=0x0)" P-INDEP -;* V93 tmp85 [V93 ] ( 0, 0 ) int -> zero-ref "field V49._length (fldOffset=0x8)" P-INDEP -; V94 tmp86 [V94,T05] ( 2, 66.43) int -> r9 "field V52._value (fldOffset=0x0)" P-INDEP -;* V95 tmp87 [V95 ] ( 0, 0 ) byref -> zero-ref "field V55._reference (fldOffset=0x0)" P-INDEP -;* V96 tmp88 [V96 ] ( 0, 0 ) int -> zero-ref "field V55._length (fldOffset=0x8)" P-INDEP -; V97 tmp89 [V97,T59] ( 2, 0 ) byref -> rcx single-def "field V56._reference (fldOffset=0x0)" P-INDEP -; V98 tmp90 [V98,T60] ( 2, 0 ) int -> rax "field V56._length (fldOffset=0x8)" P-INDEP -;* V99 tmp91 [V99,T26] ( 0, 0 ) byref -> zero-ref "field V58._reference (fldOffset=0x0)" P-INDEP -; V100 tmp92 [V100,T22] ( 2, 1.94) int -> rcx "field V58._length (fldOffset=0x8)" P-INDEP -; V101 tmp93 [V101,T17] ( 2, 3.94) byref -> rcx "field V61._reference (fldOffset=0x0)" P-INDEP -; V102 tmp94 [V102,T18] ( 2, 3.94) int -> rbp "field V61._length (fldOffset=0x8)" P-INDEP -;* V103 tmp95 [V103 ] ( 0, 0 ) struct (16) zero-ref "Promoted implicit byref" -; V104 tmp96 [V104 ] ( 9, 17.65) struct (16) [rsp+0x20] do-not-enreg[XSF] must-init addr-exposed "by-value struct 
argument" -; V105 cse0 [V105,T36] ( 8, 0.32) int -> rdx "CSE - conservative" -; V106 cse1 [V106,T37] ( 4, 0.27) int -> rcx "CSE - conservative" -; V107 cse2 [V107,T38] ( 4, 0.25) int -> rdx "CSE - conservative" -; V108 cse3 [V108,T10] ( 3, 14.52) int -> rcx "CSE - moderate" -; V109 cse4 [V109,T09] ( 3, 31.00) int -> rdx "CSE - aggressive" -; V110 cse5 [V110,T20] ( 3, 3.91) long -> rcx "CSE - moderate" +;* V47 tmp39 [V47 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline stloc first use temp" +; V48 tmp40 [V48,T00] ( 8,192.61) int -> rdx "Inline stloc first use temp" +; V49 tmp41 [V49,T03] ( 2,103.16) byref -> r9 "impAppendStmt" +;* V50 tmp42 [V50 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V51 tmp43 [V51 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V52 tmp44 [V52 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V53 tmp45 [V53 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V54 tmp46 [V54,T06] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp" +;* V55 tmp47 [V55 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +;* V56 tmp48 [V56 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" +;* V57 tmp49 [V57,T07] ( 0, 0 ) int -> zero-ref "Inline return value spill temp" +; V58 tmp50 [V58,T01] ( 3,157.41) int -> r9 "Inlining Arg" +;* V59 tmp51 [V59 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg" +;* V60 tmp52 [V60 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V61 tmp53 [V61 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +;* V62 tmp54 [V62 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V63 tmp55 [V63 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V64 tmp56 [V64,T14] ( 3, 7.91) int -> rbp "Inlining Arg" +;* V65 tmp57 [V65 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "NewObj constructor temp" +;* V66 tmp58 [V66 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg" +; V67 tmp59 [V67,T15] ( 9, 5.12) byref -> rsi single-def "field V00._reference (fldOffset=0x0)" P-INDEP +; V68 
tmp60 [V68,T16] ( 8, 5.93) int -> rdi "field V00._length (fldOffset=0x8)" P-INDEP +;* V69 tmp61 [V69 ] ( 0, 0 ) byref -> zero-ref "field V09._reference (fldOffset=0x0)" P-INDEP +;* V70 tmp62 [V70 ] ( 0, 0 ) int -> zero-ref "field V09._length (fldOffset=0x8)" P-INDEP +;* V71 tmp63 [V71 ] ( 0, 0 ) byref -> zero-ref "field V10._reference (fldOffset=0x0)" P-INDEP +;* V72 tmp64 [V72 ] ( 0, 0 ) int -> zero-ref "field V10._length (fldOffset=0x8)" P-INDEP +;* V73 tmp65 [V73 ] ( 0, 0 ) byref -> zero-ref "field V11._reference (fldOffset=0x0)" P-INDEP +;* V74 tmp66 [V74 ] ( 0, 0 ) int -> zero-ref "field V11._length (fldOffset=0x8)" P-INDEP +;* V75 tmp67 [V75 ] ( 0, 0 ) byref -> zero-ref "field V12._reference (fldOffset=0x0)" P-INDEP +;* V76 tmp68 [V76 ] ( 0, 0 ) int -> zero-ref "field V12._length (fldOffset=0x8)" P-INDEP +;* V77 tmp69 [V77 ] ( 0, 0 ) int -> zero-ref "field V14._value (fldOffset=0x0)" P-INDEP +; V78 tmp70 [V78,T55] ( 2, 0.04) int -> rax "field V17._value (fldOffset=0x0)" P-INDEP +; V79 tmp71 [V79,T57] ( 2, 0.03) int -> rax "field V21._value (fldOffset=0x0)" P-INDEP +;* V80 tmp72 [V80 ] ( 0, 0 ) int -> zero-ref "field V22._value (fldOffset=0x0)" P-INDEP +; V81 tmp73 [V81,T43] ( 2, 0.17) int -> rcx "field V24._value (fldOffset=0x0)" P-INDEP +; V82 tmp74 [V82,T51] ( 2, 0.08) int -> rcx "field V28._value (fldOffset=0x0)" P-INDEP +;* V83 tmp75 [V83 ] ( 0, 0 ) int -> zero-ref "field V29._value (fldOffset=0x0)" P-INDEP +; V84 tmp76 [V84,T44] ( 2, 0.17) int -> rcx "field V31._value (fldOffset=0x0)" P-INDEP +; V85 tmp77 [V85,T52] ( 2, 0.08) int -> r10 "field V35._value (fldOffset=0x0)" P-INDEP +;* V86 tmp78 [V86 ] ( 0, 0 ) int -> zero-ref "field V36._value (fldOffset=0x0)" P-INDEP +; V87 tmp79 [V87,T45] ( 2, 0.17) int -> r10 "field V38._value (fldOffset=0x0)" P-INDEP +; V88 tmp80 [V88,T49] ( 2, 0.12) int -> rcx "field V42._value (fldOffset=0x0)" P-INDEP +; V89 tmp81 [V89,T23] ( 2, 1.74) byref -> rax single-def "field V43._reference (fldOffset=0x0)" P-INDEP +; V90 tmp82 
[V90,T24] ( 2, 1.74) int -> rcx "field V43._length (fldOffset=0x8)" P-INDEP +; V91 tmp83 [V91,T02] ( 6,107.15) byref -> rax single-def "field V46._reference (fldOffset=0x0)" P-INDEP +; V92 tmp84 [V92,T25] ( 2, 1.74) int -> rcx "field V46._length (fldOffset=0x8)" P-INDEP +; V93 tmp85 [V93,T04] ( 4, 67.49) int -> r10 "field V47._value (fldOffset=0x0)" P-INDEP +;* V94 tmp86 [V94 ] ( 0, 0 ) byref -> zero-ref "field V50._reference (fldOffset=0x0)" P-INDEP +;* V95 tmp87 [V95 ] ( 0, 0 ) int -> zero-ref "field V50._length (fldOffset=0x8)" P-INDEP +;* V96 tmp88 [V96 ] ( 0, 0 ) byref -> zero-ref "field V51._reference (fldOffset=0x0)" P-INDEP +;* V97 tmp89 [V97 ] ( 0, 0 ) int -> zero-ref "field V51._length (fldOffset=0x8)" P-INDEP +;* V98 tmp90 [V98 ] ( 0, 0 ) byref -> zero-ref "field V52._reference (fldOffset=0x0)" P-INDEP +;* V99 tmp91 [V99 ] ( 0, 0 ) int -> zero-ref "field V52._length (fldOffset=0x8)" P-INDEP +;* V100 tmp92 [V100 ] ( 0, 0 ) byref -> zero-ref "field V53._reference (fldOffset=0x0)" P-INDEP +;* V101 tmp93 [V101 ] ( 0, 0 ) int -> zero-ref "field V53._length (fldOffset=0x8)" P-INDEP +; V102 tmp94 [V102,T05] ( 2, 65.92) int -> r9 "field V56._value (fldOffset=0x0)" P-INDEP +;* V103 tmp95 [V103 ] ( 0, 0 ) byref -> zero-ref "field V59._reference (fldOffset=0x0)" P-INDEP +;* V104 tmp96 [V104 ] ( 0, 0 ) int -> zero-ref "field V59._length (fldOffset=0x8)" P-INDEP +; V105 tmp97 [V105,T58] ( 2, 0 ) byref -> rcx single-def "field V60._reference (fldOffset=0x0)" P-INDEP +; V106 tmp98 [V106,T59] ( 2, 0 ) int -> rax "field V60._length (fldOffset=0x8)" P-INDEP +;* V107 tmp99 [V107,T26] ( 0, 0 ) byref -> zero-ref "field V62._reference (fldOffset=0x0)" P-INDEP +; V108 tmp100 [V108,T22] ( 2, 1.97) int -> rcx "field V62._length (fldOffset=0x8)" P-INDEP +; V109 tmp101 [V109,T17] ( 2, 3.97) byref -> rcx "field V65._reference (fldOffset=0x0)" P-INDEP +; V110 tmp102 [V110,T18] ( 2, 3.97) int -> rbp "field V65._length (fldOffset=0x8)" P-INDEP +;* V111 tmp103 [V111 ] ( 0, 0 ) struct 
(16) zero-ref "Promoted implicit byref" +; V112 tmp104 [V112 ] ( 9, 17.82) struct (16) [rsp+0x20] do-not-enreg[XSF] must-init addr-exposed "by-value struct argument" +; V113 cse0 [V113,T37] ( 4, 0.27) int -> rcx "CSE - conservative" +; V114 cse1 [V114,T38] ( 4, 0.25) int -> r10 "CSE - conservative" +; V115 cse2 [V115,T10] ( 3, 14.52) int -> rcx "CSE - moderate" +; V116 cse3 [V116,T09] ( 3, 30.70) int -> r8 "CSE - aggressive" +; V117 cse4 [V117,T20] ( 3, 3.95) long -> rcx "CSE - moderate" ; ; Lcl frame size = 48 @@ -810,7 +817,7 @@ G_M61030_IG02: G_M61030_IG03: cmp ebp, 16 jle SHORT G_M61030_IG07 - ;; size=5 bbWeight=1.95 PerfScore 2.44 + ;; size=5 bbWeight=1.96 PerfScore 2.45 G_M61030_IG04: test ebx, ebx je G_M61030_IG31 @@ -830,7 +837,7 @@ G_M61030_IG04: mov eax, edi cmp rdx, rax ja G_M61030_IG32 - ;; size=65 bbWeight=0.97 PerfScore 11.16 + ;; size=65 bbWeight=0.98 PerfScore 11.32 G_M61030_IG05: lea rcx, bword ptr [rsi+4*rcx] mov bword ptr [rsp+0x20], rcx @@ -841,7 +848,7 @@ G_M61030_IG05: mov ebp, r14d cmp ebp, 1 jg SHORT G_M61030_IG03 - ;; size=34 bbWeight=1.97 PerfScore 15.27 + ;; size=34 bbWeight=1.98 PerfScore 15.38 G_M61030_IG06: jmp G_M61030_IG20 align [0 bytes for IG11] @@ -859,47 +866,47 @@ G_M61030_IG09: ja G_M61030_IG32 mov rax, rsi mov ecx, ebp - xor r8d, r8d + xor edx, edx dec ecx test ecx, ecx jle SHORT G_M61030_IG20 - ;; size=22 bbWeight=0.87 PerfScore 3.05 + ;; size=21 bbWeight=0.87 PerfScore 3.04 G_M61030_IG10: - lea edx, [r8+0x01] - movsxd r10, edx + lea r8d, [rdx+0x01] + movsxd r10, r8d mov r10d, dword ptr [rax+4*r10] jmp SHORT G_M61030_IG12 - ;; size=13 bbWeight=9.11 PerfScore 43.28 + ;; size=13 bbWeight=8.96 PerfScore 42.57 G_M61030_IG11: - lea r9d, [r8+0x01] + lea r9d, [rdx+0x01] movsxd r9, r9d lea r9, bword ptr [rax+4*r9] - movsxd r11, r8d + movsxd r11, edx mov r11d, dword ptr [rax+4*r11] mov dword ptr [r9], r11d - dec r8d - ;; size=24 bbWeight=26.00 PerfScore 123.51 + dec edx + ;; size=23 bbWeight=25.79 PerfScore 122.50 G_M61030_IG12: - 
test r8d, r8d + test edx, edx jl SHORT G_M61030_IG15 - ;; size=5 bbWeight=35.11 PerfScore 43.89 + ;; size=4 bbWeight=34.75 PerfScore 43.44 G_M61030_IG13: - movsxd r9, r8d + movsxd r9, edx mov r9d, dword ptr [rax+4*r9] cmp r10d, r9d jl SHORT G_M61030_IG11 - ;; size=12 bbWeight=33.21 PerfScore 116.25 + ;; size=12 bbWeight=32.96 PerfScore 115.36 G_M61030_IG14: cmp r10d, r9d - ;; size=3 bbWeight=12.78 PerfScore 3.20 + ;; size=3 bbWeight=12.79 PerfScore 3.20 G_M61030_IG15: - inc r8d - movsxd r8, r8d - mov dword ptr [rax+4*r8], r10d - mov r8d, edx - cmp r8d, ecx + inc edx + movsxd rdx, edx + mov dword ptr [rax+4*rdx], r10d + mov edx, r8d + cmp edx, ecx jl SHORT G_M61030_IG10 - ;; size=18 bbWeight=12.78 PerfScore 38.33 + ;; size=16 bbWeight=12.78 PerfScore 38.34 G_M61030_IG16: jmp SHORT G_M61030_IG20 ;; size=2 bbWeight=0.79 PerfScore 1.59 @@ -907,21 +914,19 @@ G_M61030_IG17: lea rcx, bword ptr [rsi+0x04] cmp byte ptr [rsi], sil mov eax, dword ptr [rcx] - mov rdx, rsi - mov edx, dword ptr [rdx] - cmp edx, eax + cmp dword ptr [rsi], eax jl SHORT G_M61030_IG20 - ;; size=18 bbWeight=0.02 PerfScore 0.20 + ;; size=13 bbWeight=0.02 PerfScore 0.21 G_M61030_IG18: - cmp edx, eax + cmp dword ptr [rsi], eax jle SHORT G_M61030_IG20 - ;; size=4 bbWeight=0.01 PerfScore 0.01 + ;; size=4 bbWeight=0.01 PerfScore 0.03 G_M61030_IG19: - mov eax, edx + mov eax, dword ptr [rsi] mov edx, dword ptr [rcx] mov dword ptr [rsi], edx mov dword ptr [rcx], eax - ;; size=8 bbWeight=0.01 PerfScore 0.06 + ;; size=8 bbWeight=0.02 PerfScore 0.09 G_M61030_IG20: add rsp, 48 pop rbx @@ -935,61 +940,61 @@ G_M61030_IG21: cmp edi, 2 jbe G_M61030_IG33 lea rax, bword ptr [rsi+0x08] - lea r8, bword ptr [rsi+0x04] - mov r10, rsi - cmp byte ptr [r10], r10b - mov ecx, dword ptr [r8] - mov rdx, r10 - mov edx, dword ptr [rdx] - cmp edx, ecx + lea rdx, bword ptr [rsi+0x04] + mov r8, rsi + cmp byte ptr [r8], r8b + mov ecx, dword ptr [rdx] + mov r10, r8 + cmp dword ptr [r10], ecx jl SHORT G_M61030_IG23 - ;; size=35 
bbWeight=0.09 PerfScore 0.96 + ;; size=33 bbWeight=0.09 PerfScore 1.02 G_M61030_IG22: - cmp edx, ecx + cmp dword ptr [r8], ecx jg SHORT G_M61030_IG26 - ;; size=4 bbWeight=0.03 PerfScore 0.04 + ;; size=5 bbWeight=0.03 PerfScore 0.14 G_M61030_IG23: mov ecx, dword ptr [rax] - mov rdx, r10 - mov edx, dword ptr [rdx] - cmp edx, ecx + mov r10, r8 + mov r10d, dword ptr [r10] + cmp r10d, ecx jl SHORT G_M61030_IG28 - ;; size=11 bbWeight=0.09 PerfScore 0.48 + ;; size=13 bbWeight=0.09 PerfScore 0.48 G_M61030_IG24: - cmp edx, ecx + cmp r10d, ecx jle SHORT G_M61030_IG28 - ;; size=4 bbWeight=0.03 PerfScore 0.04 + ;; size=5 bbWeight=0.03 PerfScore 0.04 G_M61030_IG25: jmp SHORT G_M61030_IG27 ;; size=2 bbWeight=0.03 PerfScore 0.07 G_M61030_IG26: mov ecx, dword ptr [r8] - mov dword ptr [r10], ecx - mov dword ptr [r8], edx + mov r10d, dword ptr [rdx] + mov dword ptr [r8], r10d + mov dword ptr [rdx], ecx jmp SHORT G_M61030_IG23 - ;; size=11 bbWeight=0.04 PerfScore 0.24 + ;; size=13 bbWeight=0.04 PerfScore 0.32 G_M61030_IG27: mov ecx, dword ptr [rax] - mov dword ptr [r10], ecx - mov dword ptr [rax], edx - ;; size=7 bbWeight=0.04 PerfScore 0.16 + mov dword ptr [r8], ecx + mov dword ptr [rax], r10d + ;; size=8 bbWeight=0.04 PerfScore 0.16 G_M61030_IG28: - mov edx, dword ptr [rax] - mov r10, r8 - mov ecx, dword ptr [r10] - cmp ecx, edx + mov r10d, dword ptr [rax] + mov r8, rdx + mov ecx, dword ptr [r8] + cmp ecx, r10d jl SHORT G_M61030_IG20 - ;; size=12 bbWeight=0.09 PerfScore 0.48 + ;; size=14 bbWeight=0.09 PerfScore 0.48 G_M61030_IG29: - cmp ecx, edx + cmp ecx, r10d jle SHORT G_M61030_IG20 - ;; size=4 bbWeight=0.03 PerfScore 0.04 + ;; size=5 bbWeight=0.03 PerfScore 0.04 G_M61030_IG30: - mov edx, dword ptr [rax] - mov dword ptr [r8], edx + mov r8d, dword ptr [rax] + mov dword ptr [rdx], r8d mov dword ptr [rax], ecx jmp SHORT G_M61030_IG20 - ;; size=9 bbWeight=0.06 PerfScore 0.36 + ;; size=10 bbWeight=0.06 PerfScore 0.36 G_M61030_IG31: cmp ebp, edi ja SHORT G_M61030_IG32 @@ -1010,6 
+1015,6 @@ G_M61030_IG33: int3 ;; size=6 bbWeight=0 PerfScore 0.00 -; Total bytes of code 445, prolog size 19, PerfScore 469.04, instruction count 154, allocated bytes for code 445 (MethodHash=a0751199) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:IntroSort(System.Span`1[System.Collections.IntStruct],int) (Tier1) +; Total bytes of code 444, prolog size 19, PerfScore 466.44, instruction count 152, allocated bytes for code 444 (MethodHash=a0751199) for method System.Collections.Generic.GenericArraySortHelper`1[System.Collections.IntStruct]:IntroSort(System.Span`1[System.Collections.IntStruct],int) (Tier1) ; ============================================================ ```
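The per-method summary lines in the diff above can be compared mechanically. The following is a minimal sketch and not part of the tooling used in this issue: `parse_summary` and `delta` are hypothetical helper names, the regular expression assumes the `Total bytes of code ..., PerfScore ..., instruction count ...` layout seen above, and the two input strings are abbreviated copies of the `IntroSort` summary lines.

```python
import re

# Matches the JIT summary line; assumes the field order shown in the diff above.
SUMMARY_RE = re.compile(
    r"Total bytes of code (?P<bytes>\d+), prolog size \d+, "
    r"PerfScore (?P<perf>[\d.]+), instruction count (?P<instrs>\d+)"
)


def parse_summary(line):
    """Extract code size, PerfScore and instruction count from one summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a JIT summary line")
    return {
        "bytes": int(match["bytes"]),
        "perf_score": float(match["perf"]),
        "instrs": int(match["instrs"]),
    }


def delta(base, compare):
    """Return compare minus base for every metric, rounded to two decimals."""
    return {key: round(compare[key] - base[key], 2) for key in base}


# Abbreviated copies of the IntroSort summary lines (baseline "-" and compare "+").
BASE = "; Total bytes of code 445, prolog size 19, PerfScore 469.04, instruction count 154"
COMPARE = "; Total bytes of code 444, prolog size 19, PerfScore 466.44, instruction count 152"

if __name__ == "__main__":
    print(delta(parse_summary(BASE), parse_summary(COMPARE)))
```

For these two lines every delta is small and negative (one byte, two instructions, about 2.6 PerfScore), so the JIT's static estimate moved slightly in the favorable direction even though the benchmark itself got slower.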
EgorBo commented 1 year ago

@jakobbotsch thanks for the investigation! It does look like my PR introduced slightly more conservative side-effect extraction than before. I'll check whether I can relax that or handle it downstream, but it looks like only one machine regressed according to the ADX query:

[image: ADX query result]

So I am going to move this to 9.0 as not important.

EgorBo commented 1 year ago

I didn't blame my PR initially because I thought it was Tier0-only, but you correctly pointed out that it's not.

EgorBo commented 5 months ago

Looks like the regression was either fixed or compensated for by https://github.com/dotnet/runtime/pull/98324