dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.26k stars 4.73k forks source link

Proposal: Move ILLink custom steps out of the trimmer process #107211

Open jtschuster opened 2 months ago

jtschuster commented 2 months ago

Goals

Non-Goals

Current Custom Steps

Searching for the _TrimmerCustomSteps msbuild property in github search, I found the following repositories that use custom steps:

A short summary of each step is below (please update or correct if inaccurate or missing info):

xamarin-macios

All steps inherit from ConfigurationAwareStep, ConfigurationAwareSubStep, or ConfigurationAwareMarkHandler, which provide a config, DerivedLinkContext object, and Application object.

Before MarkStep

IMarkHandlers which run during Mark

MarkSubStepDispatcher steps: Only need to run on marked assemblies

Pre-sweep custom steps

Post-sweep custom steps

Pre-output custom steps

Post-output steps

android

Before MarkStep

During MarkStep

After MarkStep

After CleanStep

emclient/MailClient.Linker

Summary of Custom Step requirements and potential replacements

Edit Metadata

This can be done with separate tools that run before or after ILLink.

Unconditionally mark metadata

This could be replaced with a generated ILLink.Descriptors.xml.

Conditionally mark / create dynamic dependencies

Basic dependencies could be implemented with DynamicDependencyAttributes in a generated ILLink.LinkAttributes.xml. Dependencies that are conditional on more than one metadatum would not work with this system. This also will require that the depending item can have a CustomAttribute applied to it. This may not be an issue for the current set of custom steps.

Optimizations to avoid processing unmarked metadata

The primary use of IsMarked and MarkHandler steps is primarily to reduce the cost of processing unnecessary metadata. While adding DynamicDependency's could be used to achieve the same effect, this would likely require more processing time.

Save data for msbuild, and pass data between steps

Some steps must save information from before trimming to be used after trimming. Some info is required for platform linker or other subsequent build steps.

Some steps save references to metadata objects before MarkStep and then retrieve those objects after. This would not be possible if they are run in separate processes. In particular, CollectUnmarkedMembersSubstep in xamarin-macios would cause issues.

Proposal

Custom steps could be mostly left as they are but run either entirely before or entirely after ILLink. A new custom step host and msbuild task would be created to load and run the custom steps before and after ILLink. IMarkHandler steps would run after the trimmer with the assumption that all metadata in the assembly after ILLink is marked.

A custom step that runs before the trimmer:

A custom step that runs after ILLink:

A custom step host project would exist in each of the repos where they are used to allow custom behavior for each platform and decouple custom steps from the trimmer.

Considerations

The primary incompatibility with this approach is dependencies that cannot be expressed with DynamicDependencyAttributes. For example, if a dependency is required if and only if two or more types are marked. It's not clear how prevalent these kinds of dependencies are in the custom steps.

DynamicDependency is also treated as a reflection dependency in ILCompiler, which may lead to complications and unintended affects (for example, warnings if the depended-on member is annotated with DAM).

This would require loading and writing the assemblies two more times than currently. A good proxy for this is the time spent in the Mono.Cecil dll. Testing on my laptop showed illink spends about 12.05% of its time in Mono.Cecil when linking a sample Maui android app, and 31% of its time in Mono.Cecil when trimming a hello world app (I haven't been able to trim an IOS or Mac app without a Mac).

Creating unconditional dependencies and dynamic dependencies is a model very similar to the DependencyAnalysis in ILCompiler. If the custom steps are able to declare their dependencies in this way and the trimmer is transitioned to the DependencyAnalysisFramework, the trimmer could offer a different method for declaring these DynamicDependencies that integrates well with the DependencyAnalysisFramework.

The custom steps may also be a great place to test transitioning from Mono.Cecil to System.Reflection.Metadata. Since the custom steps themselves and the tool that runs them would likely be smaller than the trimmer, we could try transitioning the custom steps first to find the pain points before transitioning the trimmer. This would also alleviate the performance impact of loading the assemblies multiple times.

dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr, @marek-safar See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.

vitek-karas commented 2 months ago

Thanks a lot for the great writeup @jtschuster !

I'll think on this some more, just writing down stuff I'm thinking right now:

One major negative is that this doesn't really solve the problem for NativeAOT - for example right now, we have to run the trimmer for NativeAOT iOS targets - where we run the trimmer (with some modifications) to basically get the custom steps and their output and then feed the result to the NativeAOT compiler - which then runs another trimming. The above proposal would not fix this problem. but the only solution to that problem is probably to rewrite the interops such that we don't need the custom steps. So following that logic, an alternative approach would be to invest the resources into changing the interop generation for mobile targets to get rid of custom steps completely (or turn them into purely pre-processors, which would be compatible with NativeAOT). Ideally anything we do moves us closer to a future place where trimming and NativeAOT use the same solution (and we don't have to run trimmer and NativeAOT in series).

@ivanpovazan, @simonrozsival .

rolfbjarne commented 1 month ago

For xamarin-macios we already have an issue to remove the custom linker steps, so it's on our todo list: https://github.com/xamarin/xamarin-macios/issues/17693

Fixing it completely is a major task though, so while I'm fairly certain we could get started on the list of steps to move out of the ILLink, getting them all done (by ourselves at least) may take a while (it would be dicey in the .NET 10 timeline).

So following that logic, an alternative approach would be to invest the resources into changing the interop generation for mobile targets to get rid of custom steps completely (or turn them into purely pre-processors, which would be compatible with NativeAOT).

One significant problem with NativeAOT is that we need to inspect the linked output (to create a list of Objective-C classes and other native entry points the linked code needs), and NativeAOT's output isn't very inspectable. There are additional problems as well (we currently create a table to map managed classes to Objective-C classes, but if we pre-generate that table, we'll end up preserving every managed class, and we can't post-generate it because it's a managed table - although @simonrozsival has some initial ideas on that here.

Ideally anything we do moves us closer to a future place where trimming and NativeAOT use the same solution (and we don't have to run trimmer and NativeAOT in series).

I totally agree with this.

jkotas commented 1 month ago

we don't have to run trimmer and NativeAOT in series

We may want to consider running the trimmer in analyze-only mode that just produces information about the surviving user and 3rd party library types. The result of this analysis would feed into the post-build interop generator(s). It is how this problem was solved in .NET Native for UWP. It is a common problem that interop solutions need to solve to minimize binary sizes and work done during the application startup.

jtschuster commented 1 month ago

For now, my primary focus is to move the custom steps to pre- and post-processors before trying to find a preprocessor-only solution that works with both nativeaot and trimmer (though I think this effort would help us get there). I am happy to help work on this.

Regarding performance, while there may be some drawbacks to running the steps outside of the trimmer, we do have the potential for parallelizing the work and making it incremental for the steps that lend themselves to it.

vitek-karas commented 1 month ago

I think we should start with the steps which can be moved before the trimmer - the main reason is that this applies to both trimmer and NativeAOT (same steps can be run before NativeAOT with hopefully the same effects). Additionally these are likely to remain close to their behavior even longer term.

The steps which would run after trimmer are the short-term solution, as longer term we need to figure out some solution which will work with NativeAOT as well.

I also think that ideally we would focus this work on the Debug build scenarios - especially on Android the debug build doesn't trim anything (we pass a special parameter to trimmer to effectively root everything) and it only runs trimmer for the custom steps. In this case we should be able to avoid running trimmer altogether - and hopefully get some perf wins.

More medium term - if we do create a pre-trimmer tool which can modify IL and is based on System.Reflection.Metadata (and not Cecil) it would be very useful also for ILStrip - ILStrip is a tool which runs in some AOT enabled configurations in Mono and it removes method bodies (and potentially other things) for AOTed methods. Currently it's based on a very old source copy of Cecil and will cause more and more problems going forward. It can't be done on top of public Cecil as it needs to preserve token values.

rolfbjarne commented 1 month ago

A few random thoughts:

Some steps save references to metadata objects before MarkStep and then retrieve those objects after. This would not be possible if they are run in separate processes. In particular, CollectUnmarkedMembersSubstep in xamarin-macios would cause issues.

One idea here would be for the post-trimmer code to take both the untrimmed assemblies and the trimmed assemblies as input, and be able to compare them / inspect both.

The main problem with this approach is that NativeAOT doesn't produce a trimmed assembly.

emclient/MailClient.Linker

  • Removes ObjCRuntime.ThrowHelper.ThrowArgumentNullException(string) from beginning of ReleaseBlockWhenDelegateIsCollected

Looks like that was a workaround for https://github.com/xamarin/xamarin-macios/issues/20920, so it shouldn't be necessary anymore.

  • BackingFieldDelayHandler L / Unknown
    • Could this be a feature added to the linker itself? Or could a write-only field have side effects even if it's never read?

For xamarin-macios, this might be a good place to start, because if it could be implemented in the trimmer(s) themselves, it would be possible to just remove the code in xamarin-macios.

I think you got the idea slightly wrong though, a general implementation would be to:

The point is to completely remove code like this:

MyDisposableType? obj;
void Dispose ()
{
    obj?.Dispose ();
    obj = null;
}

The main size savings we're looking for come from being able to trim away the MyDisposableType after this optimization, when it's not used anywhere else (which is quite common).