[Perf] NAOT iOS HelloWorld size: 4 Regressions on 5/15/2024 8:03:38 AM

performanceautofiler[bot] commented 5 months ago

Run Information

Name	Value
Architecture	x64
OS	Mac OS X 10.18
Queue	IPhone
Baseline	3de068c2be232960f4cb3b233dd2b4014e2ce7c3
Compare	714a4420805ed53c311b05381c83c88894100fa9
Diff	Diff
Configs	CompilationMode:tiered, HybridGlobalization:true, iOSStripSymbols:true, RunKind:ios_scenarios, RuntimeType:nativeaot

Regressions in SOD - iOS HelloWorld NativeAOT .app Size nollvm nosymbols

Benchmark	Baseline	Test	Test/Base	Edge Detector
[app Size nollvm nosymbols - Aggregate - No Extension](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Mac OS X 10.18_HybridGlobalization=true_iOSStripSymbols=true_RunKind=ios_scenarios_RuntimeType=nativeaot/SOD%20-%20iOS%20HelloWorld%20NativeAOT%20.app%20Size%20nollvm%20nosymbols.html>) ADX - Test Multi Config Graph	2.76 MB	2.89 MB	1.05	True
[app Size nollvm nosymbols - SOD - iOS HelloWorld NativeAOT .app Size nollvm nosymbols](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Mac OS X 10.18_HybridGlobalization=true_iOSStripSymbols=true_RunKind=ios_scenarios_RuntimeType=nativeaot/SOD%20-%20iOS%20HelloWorld%20NativeAOT%20.app%20Size%20nollvm%20nosymbols.html>) ADX - Test Multi Config Graph	2.76 MB	2.89 MB	1.05	True
[app Size nollvm nosymbols - pub](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Mac OS X 10.18_HybridGlobalization=true_iOSStripSymbols=true_RunKind=ios_scenarios_RuntimeType=nativeaot/SOD%20-%20iOS%20HelloWorld%20NativeAOT%20.app%20Size%20nollvm%20nosymbols.html>) ADX - Test Multi Config Graph	2.76 MB	2.89 MB	1.05	True
[app Size nollvm nosymbols - pub/iosHelloWorld/HelloiOS.app/HelloiOS](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Mac OS X 10.18_HybridGlobalization=true_iOSStripSymbols=true_RunKind=ios_scenarios_RuntimeType=nativeaot/SOD%20-%20iOS%20HelloWorld%20NativeAOT%20.app%20Size%20nollvm%20nosymbols.html>) ADX - Test Multi Config Graph	2.76 MB	2.89 MB	1.05	True

graph graph graph graph graph graph graph graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'SOD - iOS HelloWorld NativeAOT .app Size nollvm nosymbols*'

### SOD - iOS HelloWorld NativeAOT .app Size nollvm nosymbols #### ETL Files #### Histogram #### JIT Disasms ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

matouskozak commented 5 months ago

The range for regression is https://github.com/dotnet/runtime/compare/778cc8436260736c1ca5393abb27414ea42ad8a7...a71a85a65f6527ca2754ccf336f1fc77f02c409b. Could it be https://github.com/dotnet/runtime/pull/102183 @MichalStrehovsky? ... but seems unlikely that adding assembly name would cause 5% regression in size...

MichalStrehovsky commented 5 months ago

The range for regression is dotnet/runtime@778cc84...a71a85a. Could it be dotnet/runtime#102183 @MichalStrehovsky? ... but seems unlikely that adding assembly name would cause 5% regression in size...

If this is measuring size with symbols, it would be from that. Symbols should be stripped before measuring size though.

matouskozak commented 5 months ago

The range for regression is dotnet/runtime@778cc84...a71a85a. Could it be dotnet/runtime#102183 @MichalStrehovsky? ... but seems unlikely that adding assembly name would cause 5% regression in size...

If this is measuring size with symbols, it would be from that. Symbols should be stripped before measuring size though.

This one should be without symbols (nosymbols in the issue header). With symbols we also saw regression of 6% (slightly bigger than this one) https://github.com/dotnet/perf-autofiling-issues/issues/34369.

MichalStrehovsky commented 5 months ago

I've kicked off a size validation job in https://github.com/MichalStrehovsky/rt-sz/issues/22 (results will be there in ~20 minutes) just to be sure, but that change should only affect symbols (and if it's from that, we should have seen an improvement in January when the regression my PR is fixing was introduced).

matouskozak commented 5 months ago

I've kicked off a size validation job in MichalStrehovsky/rt-sz#22 (results will be there in ~20 minutes) just to be sure, but that change should only affect symbols (and if it's from that, we should have seen an improvement in January when the regression my PR is fixing was introduced).

Looks like the size validation didn't show any regression. I wonder if it could be something mobile + NAOT specific as it doesn't show in you validation job yet the range doesn't contain anything else related to NAOT.

MichalStrehovsky commented 5 months ago

I've kicked off a size validation job in MichalStrehovsky/rt-sz#22 (results will be there in ~20 minutes) just to be sure, but that change should only affect symbols (and if it's from that, we should have seen an improvement in January when the regression my PR is fixing was introduced).

Looks like the size validation didn't show any regression. I wonder if it could be something mobile + NAOT specific as it doesn't show in you validation job yet the range doesn't contain anything else related to NAOT.

This should really only affect mangled names and mangled names are only used for symbols. Is it possible to get the binary that this is measuring? Just the after binary should be enough to rule out bad symbol stripping configuration.

matouskozak commented 5 months ago

@LoopedBard3 do you think it is possible to get the binaries? I found the Perf build https://dev.azure.com/dnceng/internal/_build/results?buildId=2451389&view=logs&j=4f320d25-c1ea-5d1a-2cd7-ab8121a51286 but the Artifacts only contains binlogs.

dotnet-policy-service[bot] commented 4 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.

matouskozak commented 4 months ago

fyi: @ivanpovazan

MichalStrehovsky commented 4 months ago

@LoopedBard3 do you think it is possible to get the binaries? I found the Perf build https://dev.azure.com/dnceng/internal/_build/results?buildId=2451389&view=logs&j=4f320d25-c1ea-5d1a-2cd7-ab8121a51286 but the Artifacts only contains binlogs.

I think the binaries are right there:

Click the "Download iOS Sample App" step.

Downloading artifact iOSSampleAppNoSymbolsHybridGlobalizationTrue from: https://dev.azure.com/dnceng//_apis/resources/Containers/17496355?itemPath=iOSSampleAppNoSymbolsHybridGlobalizationTrue&isShallow=true&api-version=4.1-preview.4

Click that, follow the JSON to https://dev.azure.com/dnceng/_apis/resources/Containers/17496355?itemPath=iOSSampleAppNoSymbolsHybridGlobalizationTrue%2FiOSSampleAppNoSymbolsHybridGlobalizationTrue.zip.

And sure enough, the binary is not stripped properly. There should not be any S_P_CoreLib strings in a stripped binary, but there's lots of them. This is from https://github.com/dotnet/runtime/pull/102183, but it's a benchmark setup issue; stripping doesn't do the right thing.

ivanpovazan commented 4 months ago

And sure enough, the binary is not stripped properly. There should not be any S_P_CoreLib strings in a stripped binary, but there's lots of them. This is from https://github.com/dotnet/runtime/pull/102183, but it's a benchmark setup issue; stripping doesn't do the right thing.

Thanks Michal for pointing to the root cause! We are applying the same stripping technique for both NativeAOT and MonoAOT to have "fair" comparison stripping only the local symbols (via: strip -x https://github.com/dotnet/runtime/blob/2d2d59c16193fd1422572455b44fc5669f139ffe/src/tasks/AppleAppBuilder/Xcode.cs#L679 ) - this is a limitation with Mono (we have a tracking issue to improve this: https://github.com/dotnet/runtime/issues/81530 so we can do full stripping).

That being said, in cases like this, we should check if the MAUI app also regressed (MAUI apps use full stripping):

if it did, then we most probably have an issue which we should investigate further
if it didn't, we should consider the regression with a grain of salt - limitation of how benchmark is setup

matouskozak commented 4 months ago

I took a look at our MAUI measurements and the regression for the range (https://github.com/dotnet/runtime/compare/84b33395057737db3ea342a5151feb6b90c1b6f6...de6897b3b71c1628329ab586ea3e4ecc6a5a2ab3) containing this change is minimal (0.014 MB) + the MAUI version also changed https://github.com/dotnet/maui/compare/562713c8092cd1c0384964f3f32278ecd4a8aba5...13942bdd52f95ad60cf98442c031d6d2da3ced14 so it is likely not even related.

ivanpovazan commented 4 months ago

@matouskozak in that case I would suggest then closing this issue and link the problem we experienced here to https://github.com/dotnet/runtime/issues/81530

dotnet / runtime