[Perf] Linux/x64: 49 Regressions on 3/7/2024 2:55:42 PM

performanceautofiler[bot] commented 4 months ago

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Tests.Perf_String

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[Substring_IntInt - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.Substring_IntInt(s%3a%20%22dzsdzsDDZSDZSDZSddsz%22%2c%20i1%3a%2010%2c%20i2%3a%201).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	102.40 ns	120.22 ns	1.17	0.25	False
[Substring_IntInt - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.Substring_IntInt(s%3a%20%22dzsdzsDDZSDZSDZSddsz%22%2c%20i1%3a%200%2c%20i2%3a%208).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	105.91 ns	125.72 ns	1.19	0.25	False
[Remove_Int - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.Remove_Int(s%3a%20%22dzsdzsDDZSDZSDZSddsz%22%2c%20i%3a%2010).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	136.31 ns	154.74 ns	1.14	0.22	False
[TrimEnd - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.TrimEnd(s%3a%20%22Test%20%22).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	154.36 ns	193.55 ns	1.25	0.19	False
[Trim_CharArr - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.Trim_CharArr(s%3a%20%22%20Test%22%2c%20c%3a%20%5b%27%20%27%2c%20%27%e2%80%85%27%5d).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	167.54 ns	188.93 ns	1.13	0.23	False
[TrimEnd_CharArr - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_String.TrimEnd_CharArr(s%3a%20%22Test%20%22%2c%20c%3a%20%5b%27%20%27%2c%20%27%e2%80%85%27%5d).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	169.54 ns	194.51 ns	1.15	0.24	False
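
The Test/Base column appears to be the compare ("Test") duration divided by the baseline duration, rounded to two decimals. A minimal Python sketch, using two rows copied from the table above, recomputes it. The helper name `ratio_test_over_base` is hypothetical and is not part of the dotnet/performance tooling.

```python
def ratio_test_over_base(baseline_ns: float, test_ns: float) -> float:
    """Return test / baseline rounded to two decimals (assumed Test/Base definition)."""
    return round(test_ns / baseline_ns, 2)


# (baseline ns, test ns) copied from the Perf_String table above.
rows = {
    "Substring_IntInt(i1: 10, i2: 1)": (102.40, 120.22),
    "TrimEnd": (154.36, 193.55),
}

for name, (baseline_ns, test_ns) in rows.items():
    ratio = ratio_test_over_base(baseline_ns, test_ns)
    slowdown_pct = (ratio - 1.0) * 100.0  # percent increase over baseline
    print(f"{name}: ratio={ratio:.2f}, slowdown={slowdown_pct:.0f}%")
```

A ratio above 1.00 means the compare build took longer per invocation than the baseline build.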

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_String*'

Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives<Double>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[Sigmoid - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sigmoid(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	2.06 ms	2.38 ms	1.15	0.27	True
[Exp - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Exp(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	73.55 μs	85.90 μs	1.17	0.32	True
[Pow_Vectors - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_Vectors(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	219.68 μs	268.27 μs	1.22	0.44	True
[Ieee754Remainder_ScalarDivisor - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Ieee754Remainder_ScalarDivisor(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	339.31 μs	362.29 μs	1.07	0.06	True
[Distance - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Distance(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	136.79 μs	151.91 μs	1.11	0.31	False
[Pow_ScalarBase - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_ScalarBase(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	207.34 μs	263.50 μs	1.27	0.50	False
[Sigmoid - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sigmoid(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	83.87 μs	103.78 μs	1.24	0.24	True
[Sin - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sin(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.28 ms	1.56 ms	1.22	0.24	False
[Distance - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Distance(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	6.12 μs	6.96 μs	1.14	0.16	True
[Pow_Vectors - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_Vectors(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	4.94 ms	6.78 ms	1.37	0.47	True
[Sin - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sin(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	54.34 μs	65.55 μs	1.21	0.29	False
[Sinh - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sinh(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	88.07 μs	104.81 μs	1.19	0.42	False
[Ieee754Remainder_ScalarDividend - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Ieee754Remainder_ScalarDividend(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	310.93 μs	345.99 μs	1.11	0.19	True
[Sinh - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Sinh(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	2.09 ms	2.58 ms	1.23	0.35	True
[Log - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Log(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	2.30 ms	2.74 ms	1.19	0.49	False
[Pow_ScalarExponent - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_ScalarExponent(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	5.53 ms	6.44 ms	1.17	0.41	True
[Pow_ScalarBase - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_ScalarBase(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	4.92 ms	7.09 ms	1.44	0.39	True
[Pow_ScalarExponent - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Pow_ScalarExponent(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	217.74 μs	276.85 μs	1.27	0.52	True
[Ieee754Remainder_ScalarDividend - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Ieee754Remainder_ScalarDividend(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	12.71 μs	14.45 μs	1.14	0.21	False
[Log - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Log(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	98.42 μs	115.08 μs	1.17	0.36	False
[Exp - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Double).Exp(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.87 ms	2.19 ms	1.17	0.29	True

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives<Double>*'

performanceautofiler[bot] commented 4 months ago

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives<Double>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[Negate - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).Negate(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.83 μs	2.25 μs	1.23	0.24	False
[SumOfMagnitudes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).SumOfMagnitudes(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	2.80 μs	3.49 μs	1.25	0.13	True
[Add_Scalar - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).Add_Scalar(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	32.00 μs	36.39 μs	1.14	0.26	False
[SumOfSquares - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).SumOfSquares(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	3.14 μs	3.54 μs	1.13	0.24	False
[Divide_Scalar - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).Divide_Scalar(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.66 μs	1.77 μs	1.07	0.21	False
[AddMultiply_ScalarAddend - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).AddMultiply_ScalarAddend(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	3.14 μs	3.62 μs	1.15	0.06	True
[Add_Scalar - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).Add_Scalar(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.64 μs	1.72 μs	1.05	0.24	False
[AddMultiply_Vectors - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives(Double).AddMultiply_Vectors(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	3.57 μs	3.99 μs	1.12	0.18	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tensors.Tests.Perf_NumberTensorPrimitives<Double>*'

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives<Single>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[Exp - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Exp(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	37.33 μs	41.93 μs	1.12	0.44	False
[Ieee754Remainder_ScalarDivisor - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Ieee754Remainder_ScalarDivisor(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	13.53 μs	14.58 μs	1.08	0.08	False
[Sinh - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Sinh(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	944.38 μs	1.16 ms	1.23	0.32	False
[Sinh - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Sinh(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	37.94 μs	48.29 μs	1.27	0.42	False
[Pow_ScalarBase - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Pow_ScalarBase(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	62.23 μs	70.05 μs	1.13	0.47	False
[Sigmoid - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Sigmoid(BufferLength%3a%20128).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	37.58 μs	50.85 μs	1.35	0.42	False
[Sigmoid - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Sigmoid(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	942.43 μs	1.11 ms	1.18	0.48	False
[Pow_ScalarExponent - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives(Single).Pow_ScalarExponent(BufferLength%3a%203079).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	1.52 ms	1.65 ms	1.09	0.37	False
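
Some rows in this table report the baseline and test durations in different units (for example μs against ms), so the values need a common unit before they can be compared. The sketch below is one way to do that in Python. The unit map and the helper names `to_ns` and `ratio_from_text` are assumptions made for illustration, not part of the reporting tool.

```python
# Assumed unit spellings, taken from the duration columns in this report.
UNIT_TO_NS = {
    "ns": 1.0,
    "\u03bcs": 1e3,  # Greek small letter mu
    "\u00b5s": 1e3,  # micro sign, in case the text uses that code point
    "ms": 1e6,
    "secs": 1e9,
}


def to_ns(duration: str) -> float:
    """Convert a string such as '944.38 μs' to nanoseconds."""
    value, unit = duration.split()
    return float(value) * UNIT_TO_NS[unit]


def ratio_from_text(baseline: str, test: str) -> float:
    """Recompute test / baseline from the displayed duration strings."""
    return round(to_ns(test) / to_ns(baseline), 2)


# Sinh and Sigmoid (BufferLength: 3079) rows from the table above.
print(ratio_from_text("944.38 \u03bcs", "1.16 ms"))
print(ratio_from_text("942.43 \u03bcs", "1.11 ms"))
```

Because the displayed durations are already rounded, a ratio recomputed this way can differ from the reported Test/Base value in the last digit.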

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tensors.Tests.Perf_FloatingPointTensorPrimitives<Single>*'

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in Struct.GSeq

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[FilterSkipMapSum - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/Struct.GSeq.FilterSkipMapSum.html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	684.52 μs	787.38 μs	1.15	0.18	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'Struct.GSeq*'

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Memory.Span<Char>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[SequenceEqual - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Memory.Span(Char).SequenceEqual(Size%3a%20512).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	297.37 ns	321.47 ns	1.08	0.10	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Memory.Span<Char>*'

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	8330db998659c4e6410aba370b37e4304a517a2b
Compare	c806bf697035ee47589e246ea6f6453811d6cd40
Diff	Diff
Configs	CompilationMode:wasm, RunKind:micro

Regressions in System.Collections.CtorGivenSize<String>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[List - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Collections.CtorGivenSize(String).List(Size%3a%20512).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	215.89 ns	241.29 ns	1.12	0.12	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorGivenSize<String>*'

Regressions in System.Collections.CtorFromCollectionNonGeneric<Int32>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[ArrayList - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Collections.CtorFromCollectionNonGeneric(Int32).ArrayList(Size%3a%20512).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	154.21 μs	178.11 μs	1.16	0.07	False

graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorFromCollectionNonGeneric<Int32>*'

### Payloads

[Baseline]() [Compare]()

### System.Collections.CtorFromCollectionNonGeneric<Int32>.ArrayList(Size: 512)

#### ETL Files

#### Histogram

#### JIT Disasms

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in SciMark2.kernel

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[benchSOR - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/SciMark2.kernel.benchSOR.html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	2.02 secs	2.17 secs	1.08	0.01	False

graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'SciMark2.kernel*'

### Payloads

[Baseline]() [Compare]()

### SciMark2.kernel.benchSOR

#### ETL Files

#### Histogram

#### JIT Disasms

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Text.Json.Tests.Perf_Guids

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[WriteGuids - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_CompilationMode=wasm_RunKind=micro/System.Text.Json.Tests.Perf_Guids.WriteGuids(Formatted%3a%20False%2c%20SkipValidation%3a%20True).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	25.62 ms	28.98 ms	1.13	0.15	False

graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Json.Tests.Perf_Guids*'
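
The repro commands in this report all follow one pattern: the benchmark class name plus a trailing `*` passed to `--filter`. Below is a minimal Python sketch that builds those command lines, assuming the clone lives in `./performance` and using only the `-f` and `--filter` flags that appear in the report. The helper name `repro_command` and the list `REPORTED_CLASSES` are hypothetical, and a wasm run may need further arguments that the report does not list.

```python
import html
import shlex

# Benchmark class names from this report. Generic arguments may arrive
# HTML-escaped (&lt; and &gt;) when copied from a rendered page.
REPORTED_CLASSES = [
    "System.Collections.CtorGivenSize&lt;String&gt;",
    "System.Collections.CtorFromCollectionNonGeneric&lt;Int32&gt;",
    "SciMark2.kernel",
    "System.Text.Json.Tests.Perf_Guids",
]


def repro_command(class_name: str, framework: str = "net8.0") -> str:
    """Return a shell-quoted benchmarks_ci.py invocation for one class."""
    name = html.unescape(class_name)  # "&lt;String&gt;" becomes "<String>"
    argv = [
        "python3",
        "./performance/scripts/benchmarks_ci.py",  # assumed checkout location
        "-f", framework,
        "--filter", f"{name}*",
    ]
    # shlex.join quotes arguments containing '<', '>' or '*' for POSIX shells.
    return shlex.join(argv)


if __name__ == "__main__":
    for cls in REPORTED_CLASSES:
        print(repro_command(cls))
```

Quoting the filter matters on Linux because an unquoted `<String>` would be read by the shell as redirection.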

### Payloads

[Baseline]() [Compare]()

### System.Text.Json.Tests.Perf_Guids.WriteGuids(Formatted: False, SkipValidation: True)

#### ETL Files

#### Histogram

#### JIT Disasms

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

kg commented 4 months ago

Likely https://github.com/dotnet/runtime/pull/99273. The size of the regression makes me think it's probably because the old (broken) heuristic was not inserting traces at the "right" places, and the traces it's inserting for these scenarios aren't profitable. May investigate the worst ones.

EDIT: Most of these regressions seem to be in the tensor code, which looks to be extremely generic code that wraps vector operators. We also already know that:

- we don't implement most of the vector ops on wasm interp yet, and may never implement them all
- perf for scalar fallback on vector ops regressed recently (this may improve soon)

kg commented 4 months ago

The Trim and Substring ones look like they were probably impacted by the changes that introduced more safepoints into jiterpreter traces, so the recent fix to remove safepoints may make those go away. Looking at a quick profile, they seem to spend half their execution time in traces, though a good chunk of that is dominated by time spent allocating strings.

Looking at the opcodes for the traces in question, there are quite a few imm safepoint branch opcodes in there, which were introduced recently by an interpreter optimization. The jiterp should now handle those opcodes in a more efficient way.

radekdoulik commented 4 months ago

The vector performance did indeed regress recently; the impact on Firefox is a bit higher than on Chrome. https://radekdoulik.github.io/WasmPerformanceMeasurements/?startDate=2024-02-27T23%3A01%3A28.000Z&endDate=2024-03-12T22%3A53%3A01.000Z&tasks=%2CVector&flavors=2%2C3%2C14%2C15

kg commented 4 months ago

I'm hoping https://github.com/dotnet/runtime/pull/99706 will claw back a lot of the vector perf if it lands, though it's impossible for me to measure locally (too much noise)
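
When local timings are noisy, one generic way to judge whether a change is measurable is to compare medians and require the difference to exceed the observed spread. The sketch below is only an illustration of that idea, not the method used by the perf lab or by anyone in this thread. The function names, the `noise_factor` threshold and the sample timings are all hypothetical.

```python
from statistics import median


def relative_spread(samples):
    """Median absolute deviation divided by the median: a rough noise level."""
    mid = median(samples)
    mad = median(abs(x - mid) for x in samples)
    return mad / mid


def compare_runs(baseline, candidate, noise_factor=3.0):
    """Return (ratio, noise, significant) for two lists of timings."""
    ratio = median(candidate) / median(baseline)
    noise = max(relative_spread(baseline), relative_spread(candidate))
    # Only call the change significant if it clearly exceeds the noise level.
    significant = abs(ratio - 1.0) > noise_factor * noise
    return ratio, noise, significant


# Hypothetical timings in ns: a quiet pair of runs and a noisy pair of runs.
quiet = compare_runs([100, 102, 98, 101, 99], [115, 118, 112, 116, 114])
noisy = compare_runs([100, 130, 80, 120, 90], [105, 135, 85, 125, 95])
print(quiet)
print(noisy)
```

With the quiet samples the 15% difference stands well clear of the spread, while with the noisy samples a 5% difference is not distinguishable from noise.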
