Summary

Currently, for Blazor WASM we link only the BCL assemblies and keep all other assemblies as they are. This gives a pretty good size reduction, but it's clear that we can go further. This issue tracks doing more linking to reduce the size further, while balancing that against the impact on the user experience.

Challenges

There are two aspects of linker-friendliness that we talk about in general:

Correct by construction (app runs after linking)
Trimmability (unused code can be removed)

For the initial Blazor WASM release we are focused on correctness. We have some hard things to balance between trying to trim and trying to keep the user experience that people are successful with today.

Specifically in the area of correctness we've identified the following Blazor-specific patterns that cause challenges.

Routable components will be removed totally
Components will have their constructors/properties trimmed
JS Interop methods (from JS -> .NET) will be trimmed
JSON-serialized types (event args, JS Interop types) will have their constructors/properties trimmed
DI services will have their constructors trimmed

We also have to define precisely how we configure the linker, how we classify assemblies and what we apply to each group of assemblies. The linker also offers us two main choices for what to do per-assembly: we can "save" - which will copy the assembly if it's used, or we can "link" - which will do trimming.

There's an additional knob here, which is to configure the preserve attribute for all of our types. This means that when trimming is used, it is done at type granularity: if type SomeBuiltInComponent is rooted, then all of the members of SomeBuiltInComponent are preserved.

<?xml version="1.0" encoding="utf-8" ?>
<linker>
  <assembly fullname="BLAZORASSEMBLY">
    <type fullname="*" preserve="all" required="false" />
  </assembly>
</linker>
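
For contrast with the wildcard, a descriptor can also root individual members instead of whole types. The following is a minimal sketch, assuming a hypothetical assembly `MyLibrary` with hypothetical type and member names (none of them come from Blazor). It shows how a method called only from JS, a constructor used only by DI, and a JSON-serialized type might each be kept without preserving everything else.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<linker>
  <!-- "MyLibrary" and every type/member name below are hypothetical. -->
  <assembly fullname="MyLibrary">
    <!-- Keep one method that is only ever called from JavaScript. -->
    <type fullname="MyLibrary.InteropHelpers">
      <method name="OnBrowserEvent" />
    </type>
    <!-- Keep the parameterless constructor of a service that DI activates via reflection. -->
    <type fullname="MyLibrary.WeatherService">
      <method signature="System.Void .ctor()" />
    </type>
    <!-- Keep every member of a type that is JSON-serialized. -->
    <type fullname="MyLibrary.WeatherForecast" preserve="all" />
  </assembly>
</linker>
```

Member-level entries like these trim more, but each one has to be kept in sync with the code by hand, which is part of why type granularity is attractive as a starting point.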

I'd propose that we think about this in terms of these groupings.

BCL assemblies (provided by Mono/CoreFx)
Blazor assemblies (what we ship)
Arbitrary assemblies (anything else, includes the user's app and other libraries)

Here's some brief justification about why these groupings are useful...

We have high expectations for the correctness of the BCL assemblies when the linker is used. In fact, we expect that the Mono team has already done the work to make this function well. Therefore we can always link them aggressively, and we have done this up until now.

Next is the set of Blazor assemblies that we ship. Since these assemblies have components, DI services, and types used with JS Interop, we have to be able to mitigate all of the problems of linkability in order to turn it on. We can do this using an xml manifest like the one shown above (as long as we don't ship any routable components, and we currently don't). We have the flexibility to either list specific types with preserve="all" or to use a wildcard.
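
As an illustration of the "list specific types" option, here is a sketch that reuses the BLAZORASSEMBLY placeholder from the manifest above. The type names are hypothetical stand-ins, not real Blazor types. The `required="false"` attribute is carried over from that manifest; our understanding is that it applies the preserve setting only when the type is otherwise kept, which matches the type-granularity behavior described earlier.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<linker>
  <!-- BLAZORASSEMBLY is a placeholder; the type names are hypothetical. -->
  <assembly fullname="BLAZORASSEMBLY">
    <!-- A component: keep its constructors and properties if it is used at all. -->
    <type fullname="SomeNamespace.SomeBuiltInComponent" preserve="all" required="false" />
    <!-- A type passed through JS Interop / JSON serialization. -->
    <type fullname="SomeNamespace.SomeJsInteropPayload" preserve="all" required="false" />
  </assembly>
</linker>
```

The explicit list has to be updated whenever a new component, service, or interop type is added, whereas the wildcard picks up new types automatically.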

For arbitrary assemblies we can't really make any assumptions about what they contain. For instance, they may contain components, and thus need XML manifests or special handling from the linker. We should not turn on linking for any assemblies that we don't know about, because it will very likely break them. We might want to see if there's a way to strip out crossgen data and embedded PDBs (if it's not done by default) for arbitrary assemblies. These are significant contributors to size and aren't useful in production Blazor WASM apps. We don't expect too many assemblies in the wild to turn these features on, but it could happen for System.* assemblies that come from packages.

Benefits

The benefit of this is quite large. Here's some data from putting together a quick prototype with the template. There are large portions of these assemblies that we're not using.

The numbers that matter most are the "Transferred" values, which are post-compression.

Adding a few mitigations to get the app running again via the XML approach bumps the size transferred back up to 1.92 MB. We should continue to monitor this number through a performance test as we iterate on the framework.

We've tested the difference between preserving all of our component/DI/JSInterop types with a hardcoded list, and using the preserve="all" strategy. The difference is significant when considering a basic app. Having a component like the EditForm, an auth component, or a validation component included when you don't need it has a big impact due to the dependency chain.

Proposal

Here's a concrete proposal for an MVP:

Add XML descriptors for all of the Blazor assemblies (both core and extensions)
Turn on aggressive linking for all of the Blazor assemblies (via a hardcoded list)
Don't try to link assemblies we don't understand/ship

Improvements

These are improvements over the MVP that make things easier to maintain, scale better, or give a better size outcome. Most of these investments are contingent on mitigating the user-experience effects of more linking.

Idea 01: Add granularity to the linker

~~Removed all of this content - it turns out the linker already has this feature via preserve="all".~~

Idea 01.1: Document guidance for libraries

We'd like for libraries to also be able to configure their level of linking by doing what we're doing with preserve="all". This won't really work today because we're telling the linker to use the save action for those assemblies.

Idea 02: Add linker extensibility

In general it's possible to inject functionality into the linker. We could use this support to implement our own pattern recognition for JS interop methods, DI services, and components (and have prototyped some of this).

There are a few drawbacks to this. Think of all of these as reasons why we'd prefer to make the linker more configurable rather than extend it:

Linker extensibility is complex, and hard to test
There's no built-in recipe for injecting this functionality (linker task doesn't support it)
This adds extra cost/passes to the linker which is already slow

If we have really hard problems to solve, or need to solve them widely (like DI), it might be a good option. However, we should set the bar for justifying this pretty high.

rynowak commented 4 years ago

@SteveSandersonMS @javiercn @pranavkm

Added some info about the results of my investigation

SteveSandersonMS commented 4 years ago

Done in https://github.com/dotnet/aspnetcore/pull/18165

Blazor: link more assemblies #17022