Open keegan-caruso opened 2 days ago
A few questions:
The size of the Linq Namespace measured by sizoscope and when compiled with Native Aot: N Size .NET 9.0 Size with Feature Flag Enabled 10 1016.6 kb 77.7 kb 25 5.2 mb 190.4 kb 50 19.7 mb 378.3 kb 75 43.4 mb 566.2 kb 100 76.3 mb 754.1 kb 1000 Failed to compile 7475.2 kb
@MichalStrehovsky could ILCompiler optimize this pattern for value type a bit broadly (or even conservatively for .Select()
verbatim) without introducing the feature switch?
A few questions:
- Who do we expect to set this to false?
- This is specific to Select because it's the only one of the virtuals on the base Iterator type that's generic? There have separately been concerns about the size impact of all of these specializations, GVM or not.
- We already have the OptimizeForSize build constant: https://github.com/dotnet/runtime/pull/109978/files#diff-bcb77d7db7721bb5508d93dc432d9a40d920e6b248c98c9ad14a3640bbe6fa2bR11. I'm not a huge fan of having yet another flavor. Could we get rid of that existing one, so that there's just one build of LINQ, and then use a switch like this to control things at publish / execution time?
1: It is an opt out of the feature if the performance is unacceptable and the developer is unwilling to rewrite their code to avoid Linq Select. If they are unwilling to make the size vs perf tradeoff for this scenario.
Yes, only virtual that is also a generic method. I updated the description to add an example of the size difference we see with this change; it can be significant.
I guess it is a question if we can conditionally set the value of the feature flag from TargetPlatformIdentifier and if it should be controllable at publish time.
I guess it is a question if we can conditionally set the value of the feature flag from TargetPlatformIdentifier and if it should be controllable at publish time.
I'm suggesting we wouldn't gate it on TPM, and instead guard everything relevant with the feature switch, which when set would cause lots of the specializations to be trimmed away.
I guess it is a question if we can conditionally set the value of the feature flag from TargetPlatformIdentifier and if it should be controllable at publish time.
I'm suggesting we wouldn't gate it on TPM, and instead guard everything relevant with the feature switch, which when set would cause lots of the specializations to be trimmed away.
I only very briefly spot checked what code is in the SpeedOpt files but if it's stuff like this:
It would introduce untrimmable methods into compilation - these are all virtual/interface methods. Even if we stub them out, this feels like it's going to be a size regression since we need the method bodies, even if they pretty much do nothing.
@MichalStrehovsky could ILCompiler optimize this pattern for value type a bit broadly (or even conservatively for
.Select()
verbatim) without introducing the feature switch?
Do you have a more concrete idea? The change in this PR modifies behaviors, the resulting behavior is not identical; we take different codepaths and call different methods. It would require some very advanced analysis to do the equivalent change in the compiler.
Size statistics from rt-sz:
Nice savings for Avalonia. The rest probably already learned the lesson to just steer clear of LINQ if perf is of any concern.
Pull request dotnet/runtime#109978
Project | Size before | Size after | Difference |
---|---|---|---|
avalonia.app-linux | 22175368 | 20849944 | -1325424 |
avalonia.app-windows | 19117568 | 18252800 | -864768 |
hello-linux | 1352352 | 1352352 | 0 |
hello-minimal-linux | 1081896 | 1081896 | 0 |
hello-minimal-windows | 858112 | 858112 | 0 |
hello-windows | 1103360 | 1103360 | 0 |
kestrel-minimal-linux | 5474240 | 5474240 | 0 |
kestrel-minimal-windows | 4908544 | 4909056 | 512 |
reflection-linux | 2063440 | 2063440 | 0 |
reflection-windows | 1750016 | 1750016 | 0 |
webapiaot-linux | 10120480 | 10120480 | 0 |
webapiaot-windows | 9157632 | 9158144 | 512 |
winrt-component-full-windows | 5602304 | 5600768 | -1536 |
winrt-component-minimal-windows | 1747456 | 1747456 | 0 |
this feels like it's going to be a size regression since we need the method bodies, even if they pretty much do nothing.
It'd mean smaller code for nativeaot / trimmed coreclr apps in exchange for possibly slightly larger for mobile. Plus not having to maintain two biuild flavors of the library, and now another variation on top of it. I'm not excited at having both the build constant and the trimming constant, both to optimize for size, both with similar goals, but both doing it differently.
This enables it if PublishTrimmed is true, I wonder if it should set defaults based on PublishAot instead.
This is on by default if trimming is enabled?
I don't see it mentioned anywhere, but these changes still trade offf throughput and allocation for that size benefit. Some of the ones covered by this PR have been there since the earliest days of LINQ.
Also, this can have observable behavioral differences, which I thought we tried to avoid as part of trimming by default.
Also, this can have observable behavioral differences, which I thought we tried to avoid as part of trimming by default.
Is this behavior or just perf difference? I'm not thrilled about introducing perf differences either, but the generic expansion caused by this generic virtual method can lead to actual failure to compile (#102131 mentioned in the top post) because it becomes physically impossible to compile that much code.
Brainstorming alternatives, we could also add a perf analyzer that simply flags all uses of LINQ in code that sets IsAotCompatible
as a perf problem so that people know to stay away from it. It's basically how we solved these issues in the past, but without the analyzer, just deleting LINQ use in e.g. ASP.NET.
Looking at the test results, looks like this change is not correct in this shape, the tests are hitting a stack overflow.
Brainstorming alternatives, we could also add a perf analyzer that simply flags all uses of LINQ in code that sets
IsAotCompatible
as a perf problem so that people know to stay away from it. It's basically how we solved these issues in the past, but without the analyzer, just deleting LINQ use in e.g. ASP.NET.
Maybe a perf analyzer wouldn't be the worst idea in general. LINQ expressions is another thing that performs very differently under AOT and we could use something that would steer people away from it better than a line in native AOT docs.
Do you have a more concrete idea? The change in this PR modifies behaviors, the resulting behavior is not identical; we take different codepaths and call different methods. It would require some very advanced analysis to do the equivalent change in the compiler.
Maybe a perf analyzer wouldn't be the worst idea in general. LINQ expressions is another thing that performs very differently under AOT and we could use something that would steer people away from it better than a line in native AOT docs.
IlcFoldIdenticalMethodBodies=true
appears to remove about 260 KB from LINQ namespace for N = 10.
I wanted to ask if something could be done to fold codegen-identical type instantiations besides removing optimized iterator implementations. This is not the first report here that concerns a problematic interaction of LINQ + enums with NAOT.
Is this behavior or just perf difference?
Behavior. It's generally minor, but for example if you have:
IList<T> list = ...;
foreach (var item in list.Skip(3).Take(4).Select(...) { ... }
that Select will end up producing an enumerable that will use the IList<T>
's indexer, but with this PR, it would end up enumerating it via GetEnumerator/MoveNext/Current.
IlcFoldIdenticalMethodBodies=true
appears to remove about 260 KB from LINQ namespace for N = 10.
Behavior. It's generally minor, but for example if you have:
IList<T> list = ...; foreach (var item in list.Skip(3).Take(4).Select(...) { ... }
I see that there's already an ifdef that controls some of the behavior here. It looks like that ifdef may already be enabled for some of the mobile platforms and WASM. Is this a known behavioral difference?
Also, just to provide some clarity for @keegan-caruso, I think we should definitely have a runtime feature-flag for this behavior. Whether or not it's on by default when running AOT can be a separate decision, but there seems to be a wealth of evidence that this is a potentially blocking implementation for some people and we should provide a way to workaround the problem for large apps.
Adds a feature flag to allow Linq Select to not use a GVM implementation.
The compiled size in Native AOT with a value type GVM scales at an order of n**2 relative to types used in the call. This avoids that growth in size at the cost of a slower implementation of chained Linq calls.
A real-world example of where this caused an inability to compile was in this issue: https://github.com/dotnet/runtime/issues/102131
Adapting this to something a bit contrived, but easy to measure:
The size of the Linq Namespace measured by sizoscope and when compiled with Native Aot: