[Perf] Linux/x64: 4 Regressions on 9/21/2023 8:08:31 PM

performanceautofiler[bot] commented 1 year ago

Run Information

Name	Value
Architecture	x64
OS	ubuntu 22.04
Queue	TigerUbuntu
Baseline	05730a37fff1fea8ac01cecc60b55f28da0b1db4
Compare	f671fde591fc6ebd2ec8cfc946c01c663f236bc6
Diff	Diff
Configs	CompilationMode:tiered, LLVM:false, MonoAOT:true, MonoInterpreter:false, RunKind:micro_mono

Regressions in System.Collections.Tests.Perf_BitArray

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[BitArrayCopyToIntArray - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_LLVM=false_MonoAOT=true_MonoInterpreter=false_RunKind=micro_mono/System.Collections.Tests.Perf_BitArray.BitArrayCopyToIntArray(Size%3a%204).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	41.65 ns	55.19 ns	1.33	0.22	True
[BitArrayNot - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_LLVM=false_MonoAOT=true_MonoInterpreter=false_RunKind=micro_mono/System.Collections.Tests.Perf_BitArray.BitArrayNot(Size%3a%204).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	4.09 ns	5.64 ns	1.38	0.24	False
[BitArrayCopyToIntArray - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_LLVM=false_MonoAOT=true_MonoInterpreter=false_RunKind=micro_mono/System.Collections.Tests.Perf_BitArray.BitArrayCopyToIntArray(Size%3a%20512).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	56.24 ns	68.83 ns	1.22	0.28	True
[BitArrayCopyToByteArray - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 22.04_LLVM=false_MonoAOT=true_MonoInterpreter=false_RunKind=micro_mono/System.Collections.Tests.Perf_BitArray.BitArrayCopyToByteArray(Size%3a%204).html>) 📝 - Benchmark Source 📈 - ADX Test Multi Config Graph	32.40 ns	42.63 ns	1.32	0.39	False


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Repro Steps

#### Prerequisites (files either built locally with build.(sh/cmd) or downloaded from the payload above if same system setup, in this order)

- Libraries build extracted to `runtime/artifacts` or build instructions: [Libraries README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/libraries/README.md) args: `-subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0`
- CoreCLR product build extracted to `runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release`, build instructions: [CoreCLR README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/coreclr/README.md) args: `-subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0`
- AOT MONO build extracted to `runtime/artifacts/bin/mono/$RunOS.$RunArch.Release`, build instructions: [MONO README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/mono/README.md) args: `-arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release /p:CrossBuild=false /p:MonoLLVMUseCxx11Abi=false`
- Dotnet SDK installed for dotnet commands
- Running commands from the runtime folder

Linux

```cmd
# Set $RunDir to the runtime directory
RunDir=`pwd`

# Set the OS, arch, and OSId
RunOS='linux'
RunOSId='linux'
RunArch='x64'

# Create aot directory
mkdir -p $RunDir/artifacts/bin/aot/sgen
mkdir -p $RunDir/artifacts/bin/aot/pack
cp -r $RunDir/artifacts/obj/mono/$RunOS.$RunArch.Release/mono/* $RunDir/artifacts/bin/aot/sgen
cp -r $RunDir/artifacts/bin/microsoft.netcore.app.runtime.$RunOS-$RunArch/Release/* $RunDir/artifacts/bin/aot/pack

# Create Core Root
$RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance

# One line run:
python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Collections.Tests.Perf_BitArray*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore
dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run (the filter is quoted so the shell does not expand the wildcard)
dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Collections.Tests.Perf_BitArray*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir/artifacts/bin/aot/sgen/mini/mono-sgen --customruntimepack $RunDir/artifacts/bin/aot/pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages --buildTimeout 1200
```

Windows

```cmd
# Set $RunDir to the runtime directory
$RunDir="FullPathHere"

# Set the OS, arch, and OSId
$RunOS='windows'
$RunOSId='win'
$RunArch='x64'

# Create aot directory
mkdir $RunDir\artifacts\bin\aot\sgen
mkdir $RunDir\artifacts\bin\aot\pack
xcopy $RunDir\artifacts\obj\mono\$RunOS.$RunArch.Release\mono $RunDir\artifacts\bin\aot\sgen\ /e /y
xcopy $RunDir\artifacts\bin\microsoft.netcore.app.runtime.$RunOSId-$RunArch\Release $RunDir\artifacts\bin\aot\pack\ /e /y

# Create Core Root
$RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release

# Clone performance
git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance

# One line run:
python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'System.Collections.Tests.Perf_BitArray*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog"

# Individual Commands:
# Restore
dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Build
dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1

# Run
dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'System.Collections.Tests.Perf_BitArray*' --anyCategories Libraries Runtime --category-exclusion-filter NoAOT NoWASM --runtimes monoaotllvm --aotcompilerpath $RunDir\artifacts\bin\aot\sgen\mini\mono-sgen.exe --customruntimepack $RunDir\artifacts\bin\aot\pack --aotcompilermode llvm --logBuildOutput --generateBinLog --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages --buildTimeout 1200
```

### Payloads

[Baseline]() [Compare]()

### System.Collections.Tests.Perf_BitArray.BitArrayCopyToIntArray(Size: 4)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 55.18958625532268 > 44.534234663831185.
IsChangePoint: Marked as a change because one of 9/21/2023 3:30:33 PM, 9/25/2023 10:28:34 PM falls between 9/17/2023 8:50:51 AM and 9/25/2023 10:28:34 PM.
IsRegressionStdDev: Marked as regression because -34.05752289530848 (T) = (0 -54.56601283160715) / Math.Sqrt((1.6761103038543508 / (29)) + (1.3916944330063579 / (20))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (29) + (20) - 2, .025) and -0.2866098459625133 = (42.41069116860854 - 54.56601283160715) / 42.41069116860854 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### JIT Disasms

### System.Collections.Tests.Perf_BitArray.BitArrayNot(Size: 4)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 5.635930036727441 > 4.336463912238777.
IsChangePoint: Marked as a change because one of 9/21/2023 3:30:33 PM, 9/25/2023 10:28:34 PM falls between 9/17/2023 8:50:51 AM and 9/25/2023 10:28:34 PM.
IsRegressionStdDev: Marked as regression because -7.119130858816127 (T) = (0 -4.750859910024883) / Math.Sqrt((0.034107977773314715 / (29)) + (0.09927776149170037 / (19))) is less than -2.0128955989180297 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (29) + (19) - 2, .025) and -0.1362234934001443 = (4.1812723796160505 - 4.750859910024883) / 4.1812723796160505 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### JIT Disasms

### System.Collections.Tests.Perf_BitArray.BitArrayCopyToIntArray(Size: 512)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 68.83133437358926 > 61.13329144098798.
IsChangePoint: Marked as a change because one of 9/21/2023 3:30:33 PM, 9/25/2023 10:28:34 PM falls between 9/17/2023 8:50:51 AM and 9/25/2023 10:28:34 PM.
IsRegressionStdDev: Marked as regression because -11.120592725073191 (T) = (0 -71.23524133026514) / Math.Sqrt((2.334788335972653 / (29)) + (29.17140783491009 / (22))) is less than -2.0095752371279447 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (29) + (22) - 2, .025) and -0.2272046885895151 = (58.04674802223841 - 71.23524133026514) / 58.04674802223841 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### JIT Disasms

### System.Collections.Tests.Perf_BitArray.BitArrayCopyToByteArray(Size: 4)

#### ETL Files

#### Histogram

#### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 42.633050702363505 > 33.528971751591314.
IsChangePoint: Marked as a change because one of 9/21/2023 3:30:33 PM, 9/25/2023 10:28:34 PM falls between 9/17/2023 8:50:51 AM and 9/25/2023 10:28:34 PM.
IsRegressionStdDev: Marked as regression because -7.069488145817146 (T) = (0 -46.198072913585236) / Math.Sqrt((32.027970797203764 / (29)) + (44.23029614596613 / (20))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (29) + (20) - 2, .025) and -0.386298945804742 = (33.32475513552916 - 46.198072913585236) / 33.32475513552916 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### JIT Disasms

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)

[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
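
The `IsRegressionStdDev` check in the detection logic above can be re-derived by hand. The following is a minimal Python sketch, not the autofiler's actual code: it assumes the two values divided by the sample counts (29 and 20) are sample variances, and that the numerator of the statistic is baseline mean minus compare mean. The log prints the numerator as `(0 -54.566...)`, but the logged T value is only reproduced when the baseline mean `42.41069116860854` from the same line is used in place of `0`. The critical value is copied from the log rather than recomputed.

```python
import math

# Values copied from the IsRegressionStdDev line for BitArrayCopyToIntArray(Size: 4).
baseline_mean = 42.41069116860854
compare_mean = 54.56601283160715
baseline_var, baseline_n = 1.6761103038543508, 29  # assumed to be a variance
compare_var, compare_n = 1.3916944330063579, 20    # assumed to be a variance
critical_t = -2.011740513728388  # StudentT.InvCDF(0, 1, 29 + 20 - 2, .025), as logged
logged_t = -34.05752289530848


def t_statistic(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Two-sample t-statistic with an unpooled standard error."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


t = t_statistic(baseline_mean, baseline_var, baseline_n,
                compare_mean, compare_var, compare_n)

# Relative change of the mean; negative means the compare build is slower.
relative = (baseline_mean - compare_mean) / baseline_mean

# Both conditions from the log line must hold for the check to flag a regression.
is_regression = t < critical_t and relative < -0.05

print(f"T = {t:.4f}, relative change = {relative:.4f}, regression = {is_regression}")
```

Under these assumptions the recomputed statistic agrees with the logged value, which suggests the `0` in the printed formula is a display issue in the report rather than part of the calculation.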

LeVladIonescu commented 1 year ago

This may be caused by https://github.com/dotnet/runtime/commit/3470c4caa2f216e8fde619128759eda3ba9c6206 because Mono doesn't support Vector512 and AVX intrinsics. @tannergooding @fanyang-mono

tannergooding commented 1 year ago

@LeVladIonescu why would that regress mono?

The new code paths are behind IsHardwareAccelerated checks, which should be false if mono doesn't support them

So there should be zero change on the mono side

matouskozak commented 1 year ago

@tannergooding I verified locally (Ubuntu x64) using https://github.com/dotnet/performance/blob/main/scripts/benchmarks_local.py and these are the results:

Microbenchmark	51498964d35068ded252ec98baf4fe6ebf6a4612	3470c4caa2f216e8fde619128759eda3ba9c6206
BitArrayCopyToIntArray (size 4)	37.71	45.76
BitArrayCopyToByteArray (size 4)	47.02	48.39
BitArrayCopyToIntArray (size 512)	49.56	60.24

There is a measurable difference between these commits. I agree that the changes are behind IsHardwareAccelerated checks, so Mono's missing intrinsics shouldn't be the cause. However, the newly introduced Vector128<T>.Count calls could bring the performance down a little bit. I will try to get the codegen before and after the change and will get back to you.
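
To put the local numbers in relative terms, here is a small Python sketch that computes the slowdown between the two commits for each row of the table above. The values are copied from the table as-is; their unit is not stated there, and the helper name `relative_change` is only illustrative.

```python
# Local measurements copied from the table above: (older commit, newer commit).
measurements = {
    "BitArrayCopyToIntArray (size 4)": (37.71, 45.76),
    "BitArrayCopyToByteArray (size 4)": (47.02, 48.39),
    "BitArrayCopyToIntArray (size 512)": (49.56, 60.24),
}


def relative_change(before: float, after: float) -> float:
    """Fractional slowdown of `after` relative to `before` (0.10 means 10% slower)."""
    return (after - before) / before


changes = {name: relative_change(before, after)
           for name, (before, after) in measurements.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
```

On these numbers both BitArrayCopyToIntArray cases slow down by a little over 21%, while BitArrayCopyToByteArray (size 4) moves by about 3%, below the 5% threshold the autofiler uses.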

matouskozak commented 11 months ago

Findings so far:

It seems to impact only scenarios where mini is used (AOT-llvm should be unaffected). The benchmarks in this issue ran using AOT-mini, as identified in https://github.com/dotnet/runtime/pull/92644.
From local measurements, the most affected benchmark seems to be BitArrayCopyToIntArray, which is odd since the code for CopyTo with integers is unchanged by https://github.com/dotnet/runtime/commit/3470c4caa2f216e8fde619128759eda3ba9c6206. This suggests that the root cause of the regression might be a different commit.
The regressions appeared on JIT-mini as well: https://github.com/dotnet/perf-autofiling-issues/issues/22574. However, the commit range there is significantly longer (https://github.com/dotnet/runtime/compare/736dabeca728ccf8b911d96d1b4c575b4d0db7d2...169e22c8f9f00719d87f0674954fee688b556b4a), so identifying the exact commit will not be easy.

matouskozak commented 2 months ago

No longer regressed in .NET 9
