Closed performanceautofiler[bot] closed 2 months ago
Tagging subscribers to this area: @directhex, @matouskozak See info in area-owners.md if you want to be subscribed.
Tagging subscribers to 'os-ios': @vitek-karas, @kotlarmilos, @ivanpovazan, @steveisok, @akoeplinger See info in area-owners.md if you want to be subscribed.
Tagging subscribers to 'size-reduction': @eerhardt, @SamMonoRT, @marek-safar See info in area-owners.md if you want to be subscribed.
The range of regression is https://github.com/dotnet/runtime/compare/105bf4dd4dedfe9dfb067427cbca34f76ddfaa44...68511fd27fe4055ce5203742998ba12019dfcbd4. ~@kg do you think that https://github.com/dotnet/runtime/pull/106080 could have affected this? The range is quite short and this looks to be the only Mono-related commit.~
I did manual verification and the causing commit is https://github.com/dotnet/runtime/pull/106014 @noahfalk
The same range is causing ~100kB size regression on Android sample app as well
Tagging subscribers to 'arch-android': @vitek-karas, @simonrozsival, @steveisok, @akoeplinger See info in area-owners.md if you want to be subscribed.
@noahfalk @tarekgh the https://github.com/dotnet/runtime/pull/106415 didn't have impact on Android/iOS size as expected
fyi: @vitek-karas
What diagnostics are available to determine the APIs that are no longer being trimmed and the rooting path preventing their removal? I am unfamiliar with how these size issues should be diagnosed.
If you know which API was left in the binary and shouldn't have been then there are several ways to tell why it was kept:
To figure out which API is there and shouldn't be is typically a much more difficult problem and requires domain knowledge. For NativeAOT we can use https://github.com/MichalStrehovsky/sizoscope to diff two binaries, but trimmer doesn't support that yet.
I can make guesses that the APIs causing the size increase are either EventSource APIs or Meter APIs. I don't know how to verify those guesses. I'm not sure what app this size measuring automation is building and how I would reproduce it locally.
I'm trying to get the IL of the trimmed app - before/after - then we'll have something to work with.
So the comparison is pretty clear: Before the app only has S.P.CoreLib and S.Console. After the app also has S.Diagnostics.DiagnosticSource and S.Collections.Concurrent. which add ~100KB of IL.
I'll look into it some more this happens and where it came from.
So I found first bug - the substitution doesn't define default value, so if the feature switch is not defined at all (which is the default), no substitution happens. We'll have to add featuredefaultvalue to the XML to fix this. Unfortunately even if I work around this I still can't get the app to smaller size. I can see the substitution happening correctly, but it's as if something else pulls in the code as well - still looking (first time doing some real work on a mac ... and it hurts :-))
but it's as if something else pulls in the code as well
As far as I know the only link from S.P.CoreLib -> DiagnosticSource is the Type.GetType() call here. That call is supposed to be guarded by both the IsMeterSupported
and EventSource.IsSupported
checks, but of course maybe it isn't working as intended. If you want to you could comment out the call to PreregisterEventProviders
at 3843 + comment out all of GetMetricsEventSource() and rebuild to confirm that is the code that is causes the dependency.
OK - I think I understand this fully. The behavior is the same on Windows as well BTW.
If I compare P6 and RC1 - win-x64 NativeAOT (so not just normal trimming)
P6 - 1274368 bytes
RC1 - 1349120 bytes
So we regressed the size of console hello world by 75KB, so by almost 6%.
The reason is the same, we now include parts of System.Diagnostics.DiagnosticSource.dll
and System.Collections.Concurrent.dll
, before the change these assemblies were completely trimmed away.
Currently both System.Diagnostics.Tracing.EventSource.IsSupported
and System.Diagnostics.Metrics.Meter.IsSupported
default to on for Windows (and for iOS as well).
As such this regression is currently "by design".
If I turn off System.Diagnostics.Metrics.Meter.IsSupported
(but leave EventSource on), the trimming doesn't really work. With the recent feature switch fix, we correctly trim away the callsite, but the code behind it is still kept. This is because EventSource
has DAM.All
annotation on it and if the EventSource.IsSupported
is enabled, there's a code which makes use of it EnsureDescriptorsInitialized
- so the trimmer must keep everything on EventSource
- this brings in the GetMetricsEventSource
method which in turn brings in the additional assemblies. So even though at runtime this code is unreachable, trimmer is instructed to keep it anyway.
To fix this, the cleanest solution would be to create a separate type from EventSource
, and move the code which pulls in the metrics stuff there. That will avoid the DAM.All
pulling in that code even if there's no callsite.
Side note: The fact that EventSource
has DAM.All
on it means, that if EventSource.IsSupported
is on (which is the default basically everywhere), everything on that class is preserved - not sure if there are other trim opportunities we've missed because of this.
The second interesting question is - Do we need/want to enable metrics by default - on Windows console, on iOS, on other targets. I don't know what functionality it provides and if it's something every app should have enabled by default. @noahfalk maybe you could point us to some reading on this, so that we can decide if it makes sense to enable it for mobile targets. I'll leave the Windows/Linux/macOS targets up to you :-)
AFAIK we're not aware of NativeAOT size regression due to this change - right @MichalStrehovsky . That raises the question which infra should have caught it and why it didn't.
We'll have to live with this for RC1, but we should try to answer all these questions for RC2.
Good catch with the EnsureDescriptorsInitialized
being considered a target of reflection (and therefore not removed at all) because EventSource
reflects on its methods and methods of the descendants.
I think moving that method out into a separate type sounds like a good fix.
I'm not seeing the size regression for native AOT hello world on Windows. I see 1,104,384 bytes with P6 on a dotnet new console --aot
and 1,099,264 bytes with the latest main.
We default EventSourceSupport to false
on native AOT. Some SDKs like ASP.NET override the default to true
.
I do see regression on a console hello world with EventSourceSupport
overriden to true
(from 2,842,624 bytes to 3,149,824 bytes). There seems to be tons of new metrics stuff. The regression was smaller for ASP.NET that was already calling into a lot of the metrics (the regression was about 30 kB because the only new "expensive" API getting called was Type.GetType. Our instrumentation is not very lightweight.
It would be nice if someone who understands metrics to confirm the 300 kB regression is expected. Download and unpack this zip: beforeafter.zip
Then a Windows machine, run dotnet tool install sizoscope --global
. Then run sizoscope
and drag and drop before/testhello.mstat into the UI. Then while holding Alt key, drag and drop after/testhello.mstat. You'll see a UI that shows new things in a tree on the right side. You can double click things to see why the new thing was included. For example:
Notice the nodes in a different color only exists in "compare", not in "baseline". So for example in this case a new Dictionary<__Canon,bool>
can be attributed to InitializeDefaultEventSources
.
Thanks for investigating!
To fix this, the cleanest solution would be to create a separate type from EventSource, and move the code which pulls in the metrics stuff there. That will avoid the DAM.All pulling in that code even if there's no callsite.
Can the separate type be an inner class or will DAM.All pull that in too? I'm fine to do it as a top-level class if necessary, but inner class seemed a little nicer if the only point of the class is to get the trimmer behavior we want.
@noahfalk maybe you could point us to some reading on this, so that we can decide if it makes sense to enable it for mobile targets. I'll leave the Windows/Linux/macOS targets up to you :-)
These docs should give a good idea what scenarios you lose out on if you disable Meter:
Metrics are most prevalent in web-services, but they can also be useful for ad-hoc performance investigations in all types of apps. In terms of defaults, my initial thought is that turning on/off EventSource seems like a better lever to use than Meter. If you are planning to turn Meter off by default and leave EventSource on I'd be curious to better understand the rationale.
It would be nice if someone who understands metrics to confirm the 300 kB regression is expected.
According to sizoscope it showed me the regression as 250.4 rather than 300, but aside from that yeah, the new dependencies from metrics seemed right.
Our instrumentation is not very lightweight.
Yeah, size optimization for instrumentation has never been a goal we've prioritized. The feedback that I see has been mostly about adding new capabilities, integration, standardization and performance at runtime. In many cases I suspect the perf at runtime has direct tradeoffs against code size as the fastest/lowest alloc implementations tend to be heavier on custom structs and generic instantiations of code to process them. I'm happy to chat more about it any time.
According to sizoscope it showed me the regression as 250.4 rather than 300, but aside from that yeah, the new dependencies from metrics seemed right.
Sizoscope doesn't have exact accounting of things - it pretty much always underreports because some bytes of the output are too difficult to attribute but those bytes are generally related to things that are visible in the UI.
Thank you for checking!
Can the separate type be an inner class or will DAM.All pull that in too?
This would need to be a separate class - .All
includes nested types for better or worse (in this case for better, because event source reflects at nested types too).
Haha, I didn't even realize inner class reflection was in EventSource's bag of tricks. I'll have to tuck that tid-bit away somewhere. Thanks for the info! I'll put together another change that moves the code into a separate top-level internal class soon, hopefully tomorrow.
Run Information
Regressions in SOD - iOS HelloWorld Mono .app Size llvm nosymbols
Test Report
Repro
General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md
Repro Steps
#### Prerequisites (Build files either built locally or downloaded from payload above) - Libraries build extracted to `runtime/artifacts` or build instructions: [Libraries README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/libraries/README.md) args: `-subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0` - CoreCLR product build extracted to `runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release`, build instructions: [CoreCLR README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/coreclr/README.md) args: `-subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0` - Mono Runtime build extracted to `runtime/artifacts/bin/mono/$RunOS.$RunArch.Release`, build instructions: [MONO README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/mono/README.md) args: `-arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release ` - Dotnet SDK installed for dotnet commands - Running commands from the runtime folder Linux ```cmd # Set $RunDir to the runtime directory RunDir=`pwd` # Set the OS, arch, and OSId RunOS='linux' RunOSId='linux' RunArch='x64' # Create mono dotnet mkdir -p $RunDir/artifacts/dotnet-mono $RunDir/build.sh -subset libs.pretest -configuration release -ci -arch $RunArch -testscope innerloop /p:RuntimeArtifactsPath=$RunDir/artifacts/bin/mono/$RunOS.$RunArch.Release /p:RuntimeFlavor=mono cp $RunDir/artifacts/bin/runtime/net8.0-$RunOS-Release-$RunArch/* $RunDir/artifacts/bin/testhost/net8.0-$RunOS-Release-$RunArch/shared/Microsoft.NETCore.App/8.0.0 -rf cp $RunDir/artifacts/bin/testhost/net8.0-$RunOS-Release-$RunArch/* $RunDir/artifacts/dotnet-mono -r cp $RunDir/artifacts/bin/coreclr/$RunOS.$RunArch.Release/corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun # Create Core Root $RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release # Clone performance git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance # One line run: python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'SOD - iOS HelloWorld Mono .app Size llvm nosymbols*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun" # Individual Commands: # Restore dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Build dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Run dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'SOD - iOS HelloWorld Mono .app Size llvm nosymbols*' --anyCategories Libraries Runtime " --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages ``` Windows ```cmd # Set $RunDir to the runtime directory $RunDir="FullPathHere" # Set the OS, arch, and OSId RunOS='windows' RunOSId='win' RunArch='x64' # Create mono dotnet mkdir -p $RunDir/artifacts/dotnet-mono $RunDir/build.sh -subset libs.pretest -configuration release -ci -arch $RunArch -testscope innerloop /p:RuntimeArtifactsPath=$RunDir\artifacts\bin\mono\$RunOS.$RunArch.Release /p:RuntimeFlavor=mono xcopy $RunDir\artifacts\bin\runtime\net8.0-$RunOS-Release-$RunArch\ $RunDir\artifacts\bin\testhost\net8.0-$RunOS-Release-$RunArch\shared\Microsoft.NETCore.App\8.0.0\ /e /y xcopy $RunDir\artifacts\bin\testhost\net8.0-$RunOS-Release-$RunArch\ $RunDir\artifacts\dotnet-mono\ /e /y xcopy $RunDir\artifacts\bin\coreclr\$RunOS.$RunArch.Release\corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun /y # Create Core Root $RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release # Clone performance git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance # One line run: python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'SOD - iOS HelloWorld Mono .app Size llvm nosymbols*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun.exe" # Individual Commands: # Restore dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Build dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Run dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'SOD - iOS HelloWorld Mono .app Size llvm nosymbols*' --anyCategories Libraries Runtime " --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun.exe --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages ```Run Information
Regressions in SOD - iOS HelloWorld Mono Zip Size llvm nosymbols
Test Report
Repro
General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md
Repro Steps
#### Prerequisites (Build files either built locally or downloaded from payload above) - Libraries build extracted to `runtime/artifacts` or build instructions: [Libraries README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/libraries/README.md) args: `-subset libs+libs.tests -rc release -configuration Release -arch $RunArch -framework net8.0` - CoreCLR product build extracted to `runtime/artifacts/bin/coreclr/$RunOS.$RunArch.Release`, build instructions: [CoreCLR README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/coreclr/README.md) args: `-subset clr+libs -rc release -configuration Release -arch $RunArch -framework net8.0` - Mono Runtime build extracted to `runtime/artifacts/bin/mono/$RunOS.$RunArch.Release`, build instructions: [MONO README](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/mono/README.md) args: `-arch $RunArch -os $RunOS -s mono+libs+host+packs -c Release ` - Dotnet SDK installed for dotnet commands - Running commands from the runtime folder Linux ```cmd # Set $RunDir to the runtime directory RunDir=`pwd` # Set the OS, arch, and OSId RunOS='linux' RunOSId='linux' RunArch='x64' # Create mono dotnet mkdir -p $RunDir/artifacts/dotnet-mono $RunDir/build.sh -subset libs.pretest -configuration release -ci -arch $RunArch -testscope innerloop /p:RuntimeArtifactsPath=$RunDir/artifacts/bin/mono/$RunOS.$RunArch.Release /p:RuntimeFlavor=mono cp $RunDir/artifacts/bin/runtime/net8.0-$RunOS-Release-$RunArch/* $RunDir/artifacts/bin/testhost/net8.0-$RunOS-Release-$RunArch/shared/Microsoft.NETCore.App/8.0.0 -rf cp $RunDir/artifacts/bin/testhost/net8.0-$RunOS-Release-$RunArch/* $RunDir/artifacts/dotnet-mono -r cp $RunDir/artifacts/bin/coreclr/$RunOS.$RunArch.Release/corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun # Create Core Root $RunDir/src/tests/build.sh release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release # Clone performance git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir/performance # One line run: python3 $RunDir/performance/scripts/benchmarks_ci.py --csproj $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'SOD - iOS HelloWorld Mono Zip Size llvm nosymbols*' --bdn-artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun" # Individual Commands: # Restore dotnet restore $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --packages $RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Build dotnet build $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir/performance/artifacts/packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Run dotnet run --project $RunDir/performance/src/benchmarks/micro/MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'SOD - iOS HelloWorld Mono Zip Size llvm nosymbols*' --anyCategories Libraries Runtime " --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir/artifacts/dotnet-mono/shared/Microsoft.NETCore.App/8.0.0/corerun --artifacts $RunDir/artifacts/BenchmarkDotNet.Artifacts --packages $RunDir/performance/artifacts/packages ``` Windows ```cmd # Set $RunDir to the runtime directory $RunDir="FullPathHere" # Set the OS, arch, and OSId RunOS='windows' RunOSId='win' RunArch='x64' # Create mono dotnet mkdir -p $RunDir/artifacts/dotnet-mono $RunDir/build.sh -subset libs.pretest -configuration release -ci -arch $RunArch -testscope innerloop /p:RuntimeArtifactsPath=$RunDir\artifacts\bin\mono\$RunOS.$RunArch.Release /p:RuntimeFlavor=mono xcopy $RunDir\artifacts\bin\runtime\net8.0-$RunOS-Release-$RunArch\ $RunDir\artifacts\bin\testhost\net8.0-$RunOS-Release-$RunArch\shared\Microsoft.NETCore.App\8.0.0\ /e /y xcopy $RunDir\artifacts\bin\testhost\net8.0-$RunOS-Release-$RunArch\ $RunDir\artifacts\dotnet-mono\ /e /y xcopy $RunDir\artifacts\bin\coreclr\$RunOS.$RunArch.Release\corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun /y # Create Core Root $RunDir\src\tests\build.cmd release $RunArch generatelayoutonly /p:LibrariesConfiguration=Release # Clone performance git clone --branch main --depth 1 --quiet https://github.com/dotnet/performance.git $RunDir\performance # One line run: python3 $RunDir\performance\scripts\benchmarks_ci.py --csproj $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture $RunArch -f net8.0 --filter 'SOD - iOS HelloWorld Mono Zip Size llvm nosymbols*' --bdn-artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --bdn-arguments="--anyCategories Libraries Runtime --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun.exe" # Individual Commands: # Restore dotnet restore $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --packages $RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Build dotnet build $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore /p:NuGetPackageRoot=$RunDir\performance\artifacts\packages /p:UseSharedCompilation=false /p:BuildInParallel=false /m:1 # Run dotnet run --project $RunDir\performance\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net8.0 --no-restore --no-build -- --filter 'SOD - iOS HelloWorld Mono Zip Size llvm nosymbols*' --anyCategories Libraries Runtime " --category-exclusion-filter NoInterpreter NoMono --logBuildOutput --generateBinLog --corerun $RunDir\artifacts\dotnet-mono\shared\Microsoft.NETCore.App\8.0.0\corerun.exe --artifacts $RunDir\artifacts\BenchmarkDotNet.Artifacts --packages $RunDir\performance\artifacts\packages ```