dotnet / linker


Out of memory on GitHub Actions when publishing an application for Android with trimming enabled #3126

Closed pekspro closed 7 months ago

pekspro commented 1 year ago

When I try to publish a .NET 7 MAUI application targeting Android, with trimming enabled, on GitHub Actions, the build runs out of memory:

D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.Graph\bin\Release\net7.0\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.Graph.dll
  Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider -> D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider\bin\Release\net7.0\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.dll
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
ILLink : error IL1012: IL Trimmer has encountered an unexpected error. Please report the issue at https://github.com/dotnet/linker/issues [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Fatal error in IL Linker
  Out of memory.
Error: C:\Users\runneradmin\AppData\Local\Microsoft\dotnet\sdk\7.0.100\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(86,5): error NETSDK1144: Optimizing assemblies for size failed. Optimization can be disabled by setting the PublishTrimmed property to false. [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
  Fatal error in IL Linker
ILLink : error IL1012: IL Trimmer has encountered an unexpected error. Please report the issue at https://github.com/dotnet/linker/issues [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Out of memory.
Error: C:\Users\runneradmin\AppData\Local\Microsoft\dotnet\sdk\7.0.100\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(86,5): error NETSDK1144: Optimizing assemblies for size failed. Optimization can be disabled by setting the PublishTrimmed property to false. [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
Error: Process completed with exit code 1.

I’m using the default Windows runner that has 7 GB of memory: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

With trimming disabled, it works fine. I can also publish with trimming enabled on my local computer, which has a lot more memory.

The source code for the application is available here: https://github.com/pekspro/RadioStorm

I also have an older version of the application targeting .NET 6. I have been able to trim this in GitHub without running out of memory.

It’s not a trivial app, but also not a very large one, I think. I just wanted to report this. I do not expect any solution for this :-)

marek-safar commented 1 year ago

Looks similar to #3119, though 7 GB is a lot.

/cc @vitek-karas

akoeplinger commented 1 year ago

Might be worth trying to pass /maxcpucount:1 to dotnet build as a workaround to make sure it only tries to trim one app at a time.

pekspro commented 1 year ago

Thanks @akoeplinger, that is an interesting idea. I'm trying that right now.

pekspro commented 1 year ago

/maxcpucount:1 didn't help. It ran for about 90 minutes and then ran out of memory.

vitek-karas commented 1 year ago

If it ran for 90 minutes "in the linker" (or anywhere in msbuild) then this is definitely a bug. I'll look into this tomorrow (sorry, too late today).

pekspro commented 1 year ago

I tried /maxcpucount:1 with .NET 6 as well. It took about 20 minutes in GitHub Actions. Without this parameter, I think it takes about 15 minutes.

vitek-karas commented 1 year ago

Tried it locally - it is really bad in 7.0:

Publishing the Android project spins up 4 linkers each of which takes 25 minutes to run on my machine (which is reasonably fast, so this is really bad) and they each consume 7-10 GB of memory (it oscillates in this range, so my guess is that it needs around 7GB, the rest is GC doing its thing).

When I reran only one of the illink invocations it is a bit faster - around 17 minutes, but still really slow. And the memory consumption was as high as 11 GB (but there was less memory pressure on the system, so GC didn't need to work as hard).

Using the latest linker from main (8.0) it is much better - I reran one of the illink invocations and it took only 1 minute and used a little over 4 GB of memory. So it might still not work in an environment limited to 7 GB if 4 run at once, but it's definitely a LOT better.

This should be fixed by https://github.com/dotnet/linker/pull/3094 once it's merged into 7.0 and shipped. I tried locally with a build from https://github.com/dotnet/linker/pull/3094 and it took less than 1 minute and consumed max 4.5 GB of memory.
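The memory figures above can be put side by side as rough arithmetic. This is only a sketch: the per-linker numbers are the ones quoted in this thread, while `fits` and the 1 GB `reserved_gb` headroom for the OS and MSBuild are assumptions made up for illustration, not measurements.

```python
RUNNER_MEMORY_GB = 7.0  # default Windows runner, as stated in the report


def fits(per_linker_gb, concurrent, budget_gb=RUNNER_MEMORY_GB, reserved_gb=1.0):
    """Return True if `concurrent` linker processes fit in the budget.

    reserved_gb is an assumed headroom for the OS and MSBuild (hypothetical value).
    """
    return per_linker_gb * concurrent <= budget_gb - reserved_gb


# (label, GB per linker process, number of concurrent linker processes)
SCENARIOS = [
    ("7.0 linker, 4 at once", 7.0, 4),
    ("7.0 linker, 1 at a time", 7.0, 1),
    ("main (8.0) linker, 4 at once", 4.0, 4),
    ("build from PR 3094, 1 at a time", 4.5, 1),
]

if __name__ == "__main__":
    for label, per_linker_gb, concurrent in SCENARIOS:
        total = per_linker_gb * concurrent
        verdict = "fits" if fits(per_linker_gb, concurrent) else "does not fit"
        print(f"{label}: {total:.1f} GB needed, {verdict}")
```

Under these assumptions only a single invocation of the fixed linker stays within the runner's memory, which matches the observation that running 4 at once may still be too much.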

marek-safar commented 1 year ago

> Using latest linker from main (8.0) it is much better - I reran one of the illink invocations and it took only 1 minute and used a little bit over 4 GB of memory.

That's still quite a lot, any pointers into what is taking that much?

vitek-karas commented 1 year ago

I haven't looked at the memory consumption details. I'll leave that to @jtschuster (I'll send you the simple repro offline).

jtschuster commented 1 year ago

When running the repro on my machine, the process only ever got up to about 3 GB, but I was using a different version of the Android SDK.

It looks like Cecil Instructions, the MarkStep._methods queue, and ParameterDefinitions take up the most space in our heap.

Type                                                                           Size (MB)
Instructions                                                                   300
_methods (Queue<ValueTuple<MethodDefinition, DependencyInfo, MessageOrigin>>)  110
ParameterDefinition                                                            100
MethodDefinition                                                               80
GenericInstanceType                                                            60
MethodReturnType                                                               50
ParameterDefinitionCollection                                                  44
MethodReference                                                                42

marek-safar commented 1 year ago

> It looks like Cecil Instructions, the MarkStep._methods queue, and ParameterDefinitions take up the most space in our heap.

It could also be useful to check whether we could "free" some memory earlier.

vitek-karas commented 1 year ago

Another idea is to process the methods in a different order. If you imagine how typical programs work:

Each level is bigger in size (number of different methods) than the one above it, so it looks like a tree. There will be direct calls to low-level methods spread around there as well, but they're not the ones which hurt. Since we use a simple queue, this will effectively do a breadth-first walk of the tree - where it basically keeps the entire lower level of the tree in the queue before it gets to it (probably more than just one level, actually).

Another way to look at it:

Our current algorithm effectively prioritizes processing high-level methods over the low-level ones (see the tree description above). This makes the queue long.

I prototyped this quickly by changing the Queue to a Stack (we could probably do better than that still), and for hello world the max sizes of the collection were:

I would expect the effect to get bigger for larger apps.
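To make the queue versus stack argument concrete, here is a minimal sketch on a synthetic call tree. It is not the linker's MarkStep code: `build_call_tree` and `max_worklist_size` are hypothetical helpers, and the tree shape (every method calls 3 new methods, 6 levels deep) is an assumption chosen only to show the effect.

```python
from collections import deque


def build_call_tree(branching, depth):
    """Build a complete tree as {method_id: [callee_ids]}; method 0 is the root."""
    callees = {}
    next_id = 1
    frontier = [0]
    for _ in range(depth):
        new_frontier = []
        for method in frontier:
            kids = list(range(next_id, next_id + branching))
            next_id += branching
            callees[method] = kids
            new_frontier.extend(kids)
        frontier = new_frontier
    for method in frontier:
        callees[method] = []  # leaf methods call nothing
    return callees


def max_worklist_size(callees, root, use_stack):
    """Mark everything reachable from root; return (peak worklist size, marked count)."""
    worklist = deque([root])
    marked = {root}
    peak = 1
    while worklist:
        # A stack takes the newest entry (depth-first), a queue the oldest (breadth-first).
        method = worklist.pop() if use_stack else worklist.popleft()
        for callee in callees[method]:
            if callee not in marked:
                marked.add(callee)
                worklist.append(callee)
        peak = max(peak, len(worklist))
    return peak, len(marked)


if __name__ == "__main__":
    tree = build_call_tree(branching=3, depth=6)
    print("queue:", max_worklist_size(tree, 0, use_stack=False))
    print("stack:", max_worklist_size(tree, 0, use_stack=True))
```

On this synthetic tree both orders mark the same 1093 methods, but the queue peaks at 729 pending entries (the whole bottom level) while the stack peaks at 13. Real call graphs share callees and contain cycles, so the gap in an actual app may be much smaller.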

vitek-karas commented 1 year ago

Another thing to check would be if we keep MarkStep around after it's done. In theory the driver should free steps which are processed, but it's a complex system, so hard to tell.

jtschuster commented 1 year ago

> I would expect the effect to get bigger for larger apps.

Unfortunately, with #3139, _methods is about the same size whether it is a queue or a stack in the repro for this issue (~14 MB).

jbe2277 commented 8 months ago

I had seen this issue with my app NewsReader when I was using MAUI 7 (.NET 7).

After upgrading to MAUI 8 (.NET 8) I rechecked this issue - the GitHub Actions builds for Android in Release mode are now working again.

Note: It is still much slower than the build for iOS or Windows (WinUI 3).

pekspro commented 7 months ago

I can confirm this is no longer an issue in .NET 8. Closing this.