dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License
2.7k stars 1.06k forks source link

Support extending which assemblies are trimmable #12035

Closed sbomer closed 4 years ago

sbomer commented 4 years ago

The current linker defaults make it difficult to modify the linker behavior to achieve "aggressive" trimming. We should introduce an MSBuild property to turn on aggressive trimming - that is, to pass all linker inputs with the action "link" instead of "copy" or "copyused". We need to consider how this interacts with existing mechanisms to modify linker behavior on individual assemblies.

It should be possible to enable different levels of trimming on a per-assembly basis as well, especially by different SDK components. The primary use case we would like to support immediately is blazor.

sbomer commented 4 years ago

Background

The SDK publish targets run ILLink, which has subtargets that process ResolvedFileToPublish. This list gets filtered down to the managed assemblies with PostProcessAssemblies == true, which are passed to the linker. Those with IsTrimmable != true by default are rooted and linked with the copy action, and the rest have the action determined by _TrimmerDefaultAction, which defaults to copyused.

It is worth reiterating that there are three conditions that influence the behavior:

  1. PostProcessAssemblies controls whether the linker will see the assembly at all
  2. IsTrimmable controls whether the linker will tree-shake the assembly (if not, it gets rooted, and gets action copy)
  3. the action (per-assembly Action metadata, or default) controls the level of tree-shaking

Blazor has their own linker targets that run during build, which filter assemblies by filename to determine the linker behavior. For .NET5, they are planning to move to the built-in linker targets that run during publish. They need a place to hook into the pipeline where they can continue filtering assemblies by name, and also a way to set the linker default behavior to aggressive trimming instead of assembly-level trimming (which is the .NET SDK default). In addition, they generate custom "type-granularity" roots for some assemblies, which will be done in the same place.

Proposal

TrimMode

To enable aggressive trimming instead of assembly-level trimming, we provide a public property TrimMode. Setting this tolink will change the default behavior from copyused to link (aggressive trimming) for assemblies that don't have per-assembly TrimMode. TrimMode can also be set as Item metadata to override the global property per-assembly.

PrepareForILLink

We will provide a new public target, PrepareForILLink that runs before ILLink target, and provides a convenient place to hook into the pipeline to modify metadata for trimming. SDK components can use this as an extension point via BeforeTargets and AfterTargets.

The global TrimMode may be set any time before PrepareForILLink runs, which will set it to a default value if not set previously.

ManagedAssemblyToLink

This target will have a dependency that creates the ItemGroup ManagedAssemblyToLink, which represents the set of assemblies that will be passed to the linker. Custom targets may modify IsTrimmable and TrimMode metadata on these assemblies before PrepareForILLink, which will set the assembly action based on this metadata, or they may modify the metadata after PrepareForILLink has run.

It will be illegal to change the items in ManagedAssemblyToLink, since this represents the set that needs to be filtered and replaced in the publish output. To change which assemblies are passed to the linker, a different extension point should be used to set PostProcessAssemblies metadata.

Examples

This shows how a developer could turn on aggressive trimming for framework assemblies (which are defined to be IsTrimmable by the SDK):

<PropertyGroup>
  <TrimMode>link</TrimMode>
</PropertyGroup>

This shows how Blazor (or a developer) could hook into the build to opt assemblies into different levels of trimming based on the filename:

<Target Name="PrepareForBlazorILLink"
        BeforeTargets="PrepareForILLink">
  <PropertyGroup>
    <!-- Set the default TrimMode for IsTrimmable assemblies -->
    <TrimMode>link</TrimMode>
  </PropertyGroup>
  <ItemGroup>
    <ManagedAssemblyToLink Condition="'$([System.String]::Copy('%(ManagedAssemblyToLink.Filename)').StartsWith('Microsoft.AspNetCore.'))">
      <!-- Trim these assemblies using the global TrimMode -->
      <IsTrimmable>true</IsTrimmable>
    </ManagedAssemblyToLink>
    <ManagedAssemblyToLink Condition="'$([System.String]::Copy('%(ManagedAssemblyToLink.Filename)').StartsWith('ThirdPartyAssembly.'))">
      <!-- Trim these assemblies with assembly-level trimming. Implies IsTrimmable. -->
      <TrimMode>copyused</TrimMode>
    </ManagedAssemblyToLink>
  </ItemGroup>
</Target>

Notes

IsTrimmable vs TrimMode

IsTrimmable exists in addition to TrimMode so that there can be a global default for assemblies without a per-assembly TrimMode. This lets the global property be used to set the mode for all IsTrimmable assemblies, and it lets individual assemblies be opted into trimming using the default mode set by the SDK for the target form factor.

Naming of TrimMode values

We have considered a few naming conventions for the TrimMode values:

For now, the proposal is to stay with the copyused/link terminology that is used by the tool itself. IsTrimmable allows opting into or out of trimming without referencing this terminology. If we add higher-level options to the linker in the future, we could expose those as new TrimMode values, or aliases for existing values.

Relation to existing options

TrimMode (global and per-assembly) will replace _TrimmerDefaultAction and the per-assembly Action metadata. PrepareForILLink and ManagedAssemblyToLink will replace the _SetILLinkDefaults target and the _ManagedAssembliesToLink ItemGroup.

Build vs Publish

The public properties and targets exposed in this design do not require modifying ResolvedFileToPublish or other MSBuild entities that are related to publish, leaving some room for us to potentially reuse targets if we ever need to run the linker during build instead of publish.

@eerhardt @pranavkm @marek-safar

marek-safar commented 4 years ago

/cc @akoeplinger

eerhardt commented 4 years ago

I think this proposal looks really good. Thanks for putting it together. I have a couple comments/questions:

  1. Custom targets may modify IsTrimmable and Action metadata on these assemblies

    • What's the difference between IsTrimmable and Action? Do we need both?
  2. The usage of the terms link and trim confuses me (this isn't just based on this proposal, we sort of have issues everywhere). The target is named ILLink, which respects a property named IsTrimmable. And then we have a property named TrimmerDefaultAction, which I can set to link.
pranavkm commented 4 years ago

Minor nit, The SDK typically names items in the singular e.g. ResolvedFileToPublish, ContentWithTargetPath etc. Consider naming the item group ManagedAssemblyToLink

sbomer commented 4 years ago

What's the difference between IsTrimmable and Action? Do we need both?

IsTrimmable was originally intended as a user-facing opt-in that deals less with the implementation details. Its current semantics aren't captured by Action alone since non-IsTrimmable assemblies are also rooted. (Action just represents the linker action).

The usage of the terms link and trim confuses me

Agreed. :( To explain how we got here: ILLink is used because it's the name of the tool (originally monolinker). PMs recommended to use "Trimmer" to describe the tool to the public and for the public-facing MSBuild options in 3.0. Now more of the extension points are reflecting naming conventions used by the tool - and we are even considering using the tool in an analysis mode, where the input assemblies, tentatively named ManagedAssemblyToLink in the proposal, would be analyzed without being trimmed.

If I were starting fresh, I might consider naming the project something like "IL Analyzer" or "IL Optimizer", and use "ManagedAssemblyToAnalyze", "AnalyzerAction: Trim", etc.

Consider naming the item group ManagedAssemblyToLink

Edited above, thanks!

sbomer commented 4 years ago

a property named TrimmerDefaultAction, which I can set to link

An alternative to TrimmerDefaultAction would be to define a different property like TrimmerGranularity (member/assembly) that maps to _TrimmerDefaultAction = link or copyused. Another idea is TrimmerBehavior (TrimMembers/TrimAssemblies), which we could extend to Analyze for a future analyze-but-don't-trim mode. Thoughts?

@vitek-karas @MichalStrehovsky

marek-safar commented 4 years ago

What about instead of TrimmerDefaultAction to have TrimmingMode with link, copy etc options. It could also be handy to allow the same attribute to be used on the project/assembly/packages references (can be done later).

vitek-karas commented 4 years ago

I don't like the term "link" - I think it has too much history which is about tools not related to our "linker" functionality (there is some overlap, but relatively small and definitely hard to explain). Not counting that when we introduce true AOT, there might actually be a true platform linker in the toolchain as well - which would create lot of confusion.

So in that sense I prefer anything else - "trim"/"trimmer" is at least unique enough. So going with that logic, I think we should name all public (and as much of internal ones as possible) properties/items with that term "trim".

The above proposal looks good with the exception of ManagedAssemblyToLink - it just doesn't fir with the rest. It's not ideal either, but ManagedAssemblyToTrim would at least fit with the rest well. Even if we later on tell the "Trimmer" to skip the assembly for example.

As for the future analyze-only scenarios... if we have to I would be OK with using "trim" there as well (after all the trim part is by far the most complex and likely heavily used). Having an action "analyze" which would apply as the default or per-assembly would make that work reasonably well.

If we could improve on that, then I would probably go with a completely separate set of properties/items - the example ManagedAssemblyToAnalyze sounds good. The tricky bit would be how to unify the two sets (trim and analyze) when both are used and we would use both to tell linker what to do.

One other thing to consider - rename the actions - specifically "link" is just bad (see the above reasons). Not only it uses terminology which is typically associated with a different action, but even in the linker it makes little sense - it actually means "trim" or "sweep" or... but not "link" (whatever that means). So ideally the values used by the msbuild properties/items would be from a different set which align the terms with the above discussion. Implementation wise we could either do the translation in the msbuild or ILLink task, or we could teach linker to accept multiple values for the same thing.

While we're at it, I think it's a good idea to design ahead and incorporate the "trim granularity" into it - so basically rename "copy" to "TrimAssembly" or similar which would then probably mean renaming "link" to "TrimMembers" - not bad. We don't have to support or even define all of those yet if we don't need to.

Nit: For actions and related settings I would not use the name of the tool, but rather the name of the action, so: TrimmerGranularity -> TrimGranularity. TrimmerDefaultAction is different as that talks about the tool - so that could stay like that.

vitek-karas commented 4 years ago

Adding @samsp-msft as the PM representative.

sbomer commented 4 years ago

Thanks for the feedback! How does the following sound?

My only concern is that Keep might be read by some as "root", which isn't what it means. Can anyone suggest a better name to replace copy?

I think it would make sense to do the translation in the task as long as there is a 1-to-1 mapping of these settings to AssemblyAction.

eerhardt commented 4 years ago

note that this set will include IsTrimmable != true assemblies

I still don't understand why we need IsTrimmable at all. Why can't we just have TrimMode, and if TrimMode isn't set, it defaults to Keep/copy?

sbomer commented 4 years ago

Why can't we just have TrimMode, and if TrimMode isn't set, it defaults to Keep/copy?

We also want non-IsTrimmable assemblies to be rooted, but I don't think an empty TrimMode should have those semantics. Rooting means something different from the assembly action - which is more like a decision about what to output.

There are two ways to set the TrimMode/action at the level of illink.dll: a global setting, or per-assembly settings that take precedence over the global setting. The addition of IsTrimmable also lets you use opt an assembly into trimming without explicitly specifying the action.

That's how I think about it at least - but maybe there is a cleaner way to do it that I'm not seeing.

marek-safar commented 4 years ago

While we're at it, I think it's a good idea to design ahead and incorporate the "trim granularity" into it - so basically rename "copy" to "TrimAssembly" or similar which would then probably mean renaming "link" to "TrimMembers" - not bad. We don't have to support or even define all of those yet if we don't need to.

I don't like this approach. I think we are just switching from one incorrect naming into another similarly incorrect naming. For illustration when someone sets TrimMode=TrimMembers you are actually instructing linker to remove resources, remove type-forwarders, drop interfaces, etc. Similarly for TrimMode=TrimAssembly, which is even more confusing.

It seems to me we should make the option more user friendly and hide the implementation details and configuration options. We could do that by defining different modes/levels. For example, we could have TrimMode with the following options (each of them with defined ILLinker behaviour)

and we could tweak their underlying setting as we make more stuff reliable.

vitek-karas commented 4 years ago

Had an offline discussion with @marek-safar: There should be an easy to use setting for app developers and it should not use any complex terminology.

This basically means that we should have one property which is the main knob for users to "play with". The TrimMode seems like a good candidate. Its values should be easy to understand - so the proposed None, Conservative, Aggressive, ... looks like a good start. SDK would then map that to the specific settings on the linker tool (enable/disable various options, set default action, maybe even change per-assembly actions, ...). Ideally the target specific SDK (Blazor in this case) would also have some relatively simple way to influence the mapping.

So I think the most important change is to introduce (or rename) the TrimMode property and define at least the 3 modes:

The next one is to define the per-assembly settings. I think the above proposal are OK - but I ended up agreeing with @marek-safar on not using the "TrimMembers"/"TrimAssembly" terminology. We could go with either the current names linker has for actions (copy, link, ...) or rename those. For simplicity we can probably keep the linker names - we can always rename them in the future (basically add aliases).

I would try to stick to the terminology separation between the capability => "trim" and the tool => "ILLinker".

sbomer commented 4 years ago

make the option more user friendly and hide the implementation details and configuration options we could tweak their underlying setting as we make more stuff reliable

I like the idea that these options should be user-friendly - though I would suggest that the configuration options we define should themselves be user-friendly so that we don't have to hide them. If we are going to be changing the meanings of the optimization levels over time, maybe they belong in the linker itself rather than in the MSBuild glue?

That said, the proposed TrimMode seem like they would map pretty directly to the existing settings, so maybe that's not necessary right now.

It sounds like the suggestion is to have TrimMode:

For both Conservative and Aggressive, !IsTrimmable would still mean root and copy.

Does that match your intent @marek-safar @vitek-karas ?

About None - I would avoid defining a new option that means the same thing as the well-established PublishTrimmed=false, which matches PublishReadyToRun and PublishSingleFile.

sbomer commented 4 years ago

I would try to stick to the terminology separation between the capability => "trim" and the tool => "ILLinker".

To me that suggests using ManagedAssemblyToLink since this represents the set that is processed by the tool - whether they are rooted, copied, analyzed, trimmed, etc. Otherwise we end up with ManagedAssemblyToTrim including !IsTrimmable assemblies.

marek-safar commented 4 years ago

maybe they belong in the linker itself rather than in the MSBuild glue

I think they should be defined in ILLink.Task

About None - I would avoid defining a new option that means the same thing as the well-established PublishTrimmed=false

None would mean that linker is still run but it marks everything which is not same as PublishTrimmed=false

samsp-msft commented 4 years ago

make the option more user friendly and hide the implementation details and configuration options

I don't like the Conservative vs Aggressive as it doesn't help the developer make an informed choice - our options should be clear about what level of shaking is involved. Its then much easier to reason about what the breakages and impact is.

My only concern is that Keep might be read by some as "root", which isn't what it means. Can anyone suggest a better name to replace copy?

DoNotTrim, DoNotModify, KeepAsIs

sbomer commented 4 years ago

None would mean that linker is still run but it marks everything

This is what !IsTrimmable means today.

@eerhardt was suggesting this as well - to fold IsTrimmable and the action into a single TrimMode - but we hit the following issue: How would we handle assemblies without TrimMode metadata?

We need a separate IsTrimmable that can be used to opt assemblies into trimming without specifying the TrimMode per-assembly (and makes it possible to use a global TrimMode to influence all IsTrimmable assemblies at the same time).

I've updated the proposal to use TrimMode (using the existing action names), added notes and examples to clarify the IsTrimmable behavior, and summarized the feedback about naming the options.

sbomer commented 4 years ago

Just to summarize the current state of the proposal:

eerhardt commented 4 years ago

TrimMode global property or per-assembly metadata: copyused/link

Were we going to use different names for these modes? Or are we sticking with the linker .exe names?

sbomer commented 4 years ago

I am suggesting we stick with the linker names - see the notes in "Naming of TrimMode values" above for a summary of the feedback I've received.

eerhardt commented 4 years ago

This proposal sounds good to me. I think it will address the short term needs for Blazor, and makes a good foundation for future improvements.

marek-safar commented 4 years ago

I agree. Let's get this implemented in time for the last preview to test it works for the scenario

eerhardt commented 4 years ago

Resolved with https://github.com/dotnet/sdk/pull/12116. Closing.

jzabroski commented 3 years ago

This is one of the most confusing specifications I have ever read from .NET :(

Why isn't there a TrimMode None? I am getting bug reports from users about System.Private.CoreLib System.Object type not found and I have no way to know if it's being caused by this trimming magic.

How do I set the behavior to be pre .NET Core App 3.0 days when nothing magical whatsoever was taking place? I don't want any brains or fancy algos. I just want everything to be linked and nothing to be removed.

What is the default - does it vary by SDK like BackGroundServices, etc. It sounds like Blazor SDK has a different default, which is a bit odd if you ask me that depending on where a library is executed, the behavior of reflected code is completely different and potentially nonsensical. @vitek-karas You hit the NAIL on the head with your remarks that developers need simple flags to understand this. I read two blog posts about this and I kept thinking, OK, you added a new feature, how do I turn it off or easily prove it's turned off and not affecting my binaries?

jzabroski commented 3 years ago

Besides this GitHub issue, These are the top Google Search results when I try to learn more about this stuff:

https://devblogs.microsoft.com/dotnet/app-trimming-in-net-5/ https://devblogs.microsoft.com/dotnet/customizing-trimming-in-net-core-5/

Just food for thought on where there may be some miscommunication. When an (optimization) feature gets added, it's a good idea to think through how to disable it as easily as possible. Probably a good rule of thumb for other types of features as well, but doubly because pre-mature optimization is the root of all evil. The other lesson learned here might be to think through how feature flags apply across SDKs, and create basic conformance tests for what an official SDK should and shouldn't do. Specialty SDKs like Blazor may even be best off in a unique SDK namespace to indicate they don't follow the same approach as others.

Just brainstorming.

sbomer commented 3 years ago

@jzabroski sorry this is causing you trouble. :( You may want to take a look at the user-facing documentation at https://aka.ms/dotnet-illink/. This design doc was meant for us working on the feature and for SDK authors, so it probably isn't the best place to learn about trimming.

To answer your questions:

MichalStrehovsky commented 3 years ago

@jzabroski I would suggest filing an issue describing your problem so that it can be visible to the people in charge of setting the trimming defaults for Blazor. Trimming is an explicit opt-in optimization in most .NET workloads because of its potential to break apps. Trimmer also generates warnings whenever it's not sure whether a particular line of code is going to work after trimming (reflection that is hard to reason about statically, typically).

The defaults are:

SDK owners can change these defaults. Blazor (and Xamarin) turns on trimming by default and turns warnings off. People on this pull request are not in charge of setting these defaults.