performanceautofiler[bot] opened this issue 3 years ago
@AndyAyersMS
Benchmark-interval-only analysis for NoArguments (from ETL) suggests the regression is in the core runtime itself.
Baseline
Samples for corerun: 7193 events for Benchmark Intervals
Jitting : 00.24% 9E+06 samples 1573 methods
JitInterface : 00.08% 3E+06 samples
Jit-generated code: 59.63% 2.27E+09 samples
Jitted code : 59.63% 2.27E+09 samples
MinOpts code : 00.00% 0 samples
FullOpts code : 17.42% 6.62E+08 samples
Tier-0 code : 00.00% 0 samples
Tier-1 code : 42.21% 1.6E+09 samples
R2R code : 00.00% 0 samples
02.92% 1.11E+08 ? Unknown
37.13% 1.411E+09 native coreclr.dll
16.55% 6.29E+08 FullOpt [Microsoft.Extensions.Logging]Logger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
12.00% 4.56E+08 Tier-1 [Microsoft.Extensions.Logging.Abstractions]LoggerExtensions.Log(class Microsoft.Extensions.Logging.ILogger,value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class System.Exception,class System.String,class System.Object[])
11.08% 4.21E+08 Tier-1 [Microsoft.Extensions.Logging]Logger.<Log>g__LoggerLog|12_0(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class Microsoft.Extensions.Logging.ILogger,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>,class System.Collections.Generic.List`1<class System.Exception>&,!!0&)
07.53% 2.86E+08 Tier-1 [Microsoft.Extensions.Logging.Abstractions]FormattedLogValues..ctor(class System.String,class System.Object[])
06.05% 2.3E+08 Tier-1 [MicroBenchmarks]LoggingOverhead.NoArguments()
04.68% 1.78E+08 Tier-1 [Microsoft.Extensions.Logging]MessageLogger.IsEnabled(value class Microsoft.Extensions.Logging.LogLevel)
00.87% 3.3E+07 FullOpt [c5630227-85b0-454f-88ac-8126d9409a39]Runnable_0.WorkloadActionUnroll(int64)
00.84% 3.2E+07 Tier-1 [MicroBenchmarks]LoggingBenchmarkBase+NoopLogger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
00.16% 6E+06 native clrjit.dll
00.08% 3E+06 native ntoskrnl.exe
00.05% 2E+06 native intelppm.sys
Change
Samples for corerun: 7797 events for Benchmark Intervals
Jitting : 00.26% 1E+07 samples 1573 methods
JitInterface : 00.03% 1E+06 samples
Jit-generated code: 54.03% 2.08E+09 samples
Jitted code : 54.03% 2.08E+09 samples
MinOpts code : 00.00% 0 samples
FullOpts code : 16.07% 6.18E+08 samples
Tier-0 code : 00.00% 0 samples
Tier-1 code : 37.96% 1.46E+09 samples
R2R code : 00.00% 0 samples
01.38% 5.3E+07 ? Unknown
44.25% 1.702E+09 native coreclr.dll
15.21% 5.85E+08 FullOpt [Microsoft.Extensions.Logging]Logger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
10.95% 4.21E+08 Tier-1 [Microsoft.Extensions.Logging]Logger.<Log>g__LoggerLog|12_0(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class Microsoft.Extensions.Logging.ILogger,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>,class System.Collections.Generic.List`1<class System.Exception>&,!!0&)
10.69% 4.11E+08 Tier-1 [Microsoft.Extensions.Logging.Abstractions]LoggerExtensions.Log(class Microsoft.Extensions.Logging.ILogger,value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class System.Exception,class System.String,class System.Object[])
06.73% 2.59E+08 Tier-1 [Microsoft.Extensions.Logging.Abstractions]FormattedLogValues..ctor(class System.String,class System.Object[])
04.50% 1.73E+08 Tier-1 [Microsoft.Extensions.Logging]MessageLogger.IsEnabled(value class Microsoft.Extensions.Logging.LogLevel)
04.39% 1.69E+08 Tier-1 [MicroBenchmarks]LoggingOverhead.NoArguments()
00.86% 3.3E+07 FullOpt [86029241-0098-4ac1-a544-cffed1af7a77]Runnable_0.WorkloadActionUnroll(int64)
00.70% 2.7E+07 Tier-1 [MicroBenchmarks]LoggingBenchmarkBase+NoopLogger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
00.23% 9E+06 native clrjit.dll
00.08% 3E+06 native ntoskrnl.exe
Benchmark: found 15 intervals; mean interval 255.670ms
PerfView can't find the coreclr symbols for those two traces, so we can't drill in further.
With symbols, the diff seems to be entirely in JIT_VirtualFunctionPointer. The source for this helper doesn't appear to have changed during this time span.
It could be that this helper is just called more often; the callers are logging methods. Source for those callers hasn't changed either.
This suggests we need to look for recent JIT codegen diffs in the key managed methods.
I don't see anything in the commit history that jumps out as a likely culprit.
Am building before/after runtimes so I can drill in further.
I don't see any jit codegen changes, but still need to make sure I'm looking at the right version of the logging dll.
I can repro this locally. Still investigating.
| Method | Job | Toolchain | Mean |
|---|---|---|---|
| NoArguments | Job-AILDUX | \runtime1\artifacts\tests\coreclr\Windows_NT.x64.Release\Tests\Core_Root\corerun.exe | 25.08 ns |
| NoArguments | Job-MJIGDO | \runtime2\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe | 28.48 ns |
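For scale, a quick check of the regression implied by the table above (assuming the two means are directly comparable):

```python
# Mean times from the benchmark table above (ns).
baseline_ns = 25.08   # runtime1 (before)
change_ns = 28.48     # runtime2 (after)

# Relative regression: how much slower the change is vs. the baseline.
regression_pct = (change_ns - baseline_ns) / baseline_ns * 100
print(f"regression: {regression_pct:.1f}%")  # about 13.6% slower
```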
Perhaps we're just taking longer to look things up in the generic handle cache? Need to check how much of the lookup code gets inlined into JIT_VirtualFunctionPointer.
@VSadov can you think of any recent changes (Oct 29-Nov 4) that might have caused something like this? I don't see anything in jit or runtime.
Nothing really stands out from the diffs.
There was a change to the JIT's sorting, switching from qsort to a stable sort. But that should affect either nothing or a lot of scenarios; I can't see how it would impact just this one.
JIT_VirtualFunctionPointer is just a hash lookup, and I don't see anything there that changed in an obvious way. A regression there would probably affect more cases.
New hypothesis: perhaps the windows_nt -> windows change broke how PGO is applied to the runtime, and these tests are particularly sensitive to native PGO?
Trying to verify now.
Seems plausible that it was. From the build logs I have on hand:
;; hash 0199252802537aa4d8b393c3923b4d4cfe371872 (Oct 29)
3561 of 4076 (87.4%) original invalid call sites were matched.
4655 new call sites were added.
2324 of 33506 ( 6.94%) profiled functions will be compiled for speed, and the rest of the functions will be compiled for size
150853 of 336462 inline instances were from dead/cold paths
33226 of 33887 functions (98.0%) were optimized using profile data, and the rest of the functions were optimized without using profile data
49189162072 of 51438814196 instructions (95.6%) were optimized using profile data, and the rest of the instructions were optimized without using profile data
;; hash 03701ba6d2656c1fb52848744980c7b799aabccf (Nov 6)
3593 of 4111 (87.4%) original invalid call sites were matched.
4678 new call sites were added.
194 of 33461 ( 0.58%) profiled functions will be compiled for speed, and the rest of the functions will be compiled for size
140438 of 315786 inline instances were from dead/cold paths
33173 of 33887 functions (97.9%) were optimized using profile data, and the rest of the functions were optimized without using profile data
49184524871 of 51438814196 instructions (95.6%) were optimized using profile data, and the rest of the instructions were optimized without using profile data
Note how the number of methods compiled for speed dropped dramatically.
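The drop can be quantified from the two log excerpts above:

```python
# "compiled for speed" counts from the PGO build logs above.
before_speed, before_total = 2324, 33506   # Oct 29 build
after_speed, after_total = 194, 33461      # Nov 6 build

before_pct = before_speed / before_total * 100
after_pct = after_speed / after_total * 100
print(f"before: {before_pct:.2f}%, after: {after_pct:.2f}%")
print(f"roughly {before_pct / after_pct:.0f}x fewer methods compiled for speed")
```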
Am going to try building at dotnet/runtime@e691753245eb6f2d459c7f02aaaa511077c1e833 (before) and dotnet/runtime@b37f10a504f20faef2558b6d79180411ad0603c3 (after) to verify.
Both those hashes show the PGO data is not applied effectively, so the impactful change was sometime before. Am going to keep looking.
Next to scrutinize:
Looks like it was dotnet/runtime@8c6a049cdb4bcd958cc3487775174bdd32e5fb64 (dotnet/runtime#41897).
Suspect some of the build flags got changed and this disqualified some methods from leveraging existing PGO data; am going to try and verify by extracting the flags for jithelpers.obj.
The only real flag diffs I see are for PCH; that might be enough, since the code we inline into JIT_VirtualFunctionPointer likely comes from the PCH...
So it seems likely that if we regenerate the native PGO data, we'll get this perf back.
@brianrob do you know when we next expect to see a PGO update for the runtime repo? @jkoritzinsky FYI
Ownership of PGO in the runtime repo is currently under discussion, so I don't currently have a timeline. Once this gets resolved, I'd expect updates to start coming.
Run Information
Regressions in Microsoft.Extensions.Logging.LoggingOverhead
Related Issue on x64 ubuntu
[Perf 21%] Microsoft.Extensions.Logging.LoggingOverhead (2)
Historical Data in Reporting System
Repro