Open baronfel opened 3 months ago
There are two overlapping scenarios for RID-specific builds:
I'm guessing that the former is much more common. We should ensure that both are validated.
I'm assuming that ReadyToRun is just an example. Native AOT should be similar. In fact, just generating a single file app where the final specialization is much cheaper should show a larger % win, I'm guessing.
For your second category (RID or TFM-specific IFDEFs) we don't in the SDK make any RID-specific defines, which is one of the reasons why this might work! Of course for more complex projects there might be customization where the list of Compile inputs to the compiler might change per-RID - for that I don't think we can deliver much of a win here given how the build mechanisms are designed today. That might be worth investigating further - imagine a world where the platform-agnostic app contained all permutations of Compile inputs and it was the job of crossgen/linker/etc to eliminate entire files of code when specializing the agnostic compiled dll for the specific RID.
I wasn't proposing to focus the perf improvement on the second category, just that we need to make sure that it doesn't get disturbed since it is obviously an important and supported scenario. If there are perf wins to be had there, that could be good, but seems like a second phase of the project.
Ah gotcha - thanks for the clarification, totally agree that we would still need to support that kind of usage.
While I'm broadly supportive of the effort, I'm a bit worried about
I propose that we have two logical properties for intermediate outputs:
- IntermediatePlatformAgnosticOutputPath
- IntermediatePlatformSpecificOutputPath
Unfortunately, our intermediate output directories feed into a lot of different things. That makes the above changes pretty impactful in terms of visibility. I'm worried that this would greatly increase the MSBuild build logic complexity.
It's possible that the changes aren't that complex and maybe I have to just see a PR.
I think we should focus this on manifest based OCI image publishing (AKA multi-arch images). I think that will be a popular once Chet and co enable it. It is also an opportunity to provide an advantage over (and that is impossible for) Docker BuildKit. That said, BuildKit has other distinct advantages (like downloading base images in parallel at the start (not end) of the build. Lots to consider there.
The patch for the changes across the entire SDK layout looks something like this - note that this is aiming for maximal compatibility and changing only what is necessary to make my example repo share Csc compilations with no other changes:
You can apply this patch on top of a normal SDK layout (so from the directory containing dotnet
binaries, with the sdk
subfoldder) and try it for yourself. I've made a branch of my sample repro that has identical outputs when these changes are applied.
One big open question that I'm noodling over is how to handle PlatformTarget
. This is inferred from RID and passed to the compilers and so might defeat this entire idea, so I want to understand more how Roslyn uses/interprets this value.
how Roslyn uses/interprets this value
Talking to @jaredpar this is used to stamp a platform bit in the PE header and so is kinda necessary - though we might be able to add some smarts to the Csc task so that if only the PlatformTarget is changed it could fast-mode emitting a dll with a different PE header.
Talking to @jaredpar this is used to stamp a platform bit in the PE header and so is kinda necessary - though we might be able to add some smarts to the Csc task so that if only the PlatformTarget is changed it could fast-mode emitting a dll with a different PE header.
The task itself is stateless and it's hard to see how to add state to it. Particularly cause it could be executing across a number of different nodes and hence have different instances. I'm more wondering if we shouldn't just essentially send all three compilations as a single action. Basically think of calling Csc and saying "but do it for these three platform targtes" and then let the compiler decide the fast way to handle that.
That ends up putting more smarts in the compiler but maybe that isn't so bad.
Another idea might be to always build platform-agnostic and see if the publish pipeline could change the PE header bit when re-targeting the platform-agnostic dll to the destination RID.
Changing at publish time creates friction if you're strong naming.
PlatformTarget. This is inferred from RID and passed to the compilers
PlatformTarget
is left-over from .NET Framework. (PlatformTarget
can only represent architectures that .NET Framework targets. It does not represent full range of platforms that .NET Core can target.)
If you are targeting current .NET, building everything as AnyCPU (ie hardcoding PlatformTarget to AnyCPU) all the time should work just fine. It is what we are doing in dotnet/runtime repo. We hardcode PlatformTarget to AnyCPU, even for platform-specific binaries.
Thank you @jkotas! That's hugely clarifying.
What is the right path forward then?
The tests we've seen show that only platform target is different and that is meaningless in this context. Is that the full set of items that can be different? Basically do we feel confident that for multiple platforms we only need a single compilation event or is it just for the limited set where only platform target is different?
@jonathanpeppers FYI
It would be good to also test any changes like this with android and MacCat/iOS platforms. I believe android may do some special work to deal with multiple RIDs in their build targets...
@Redth
If you have some scenarios in mind could you do the following?
-bl
to the publish
command)We can diff the builds there and see if it matches what we're seeing so far.
Is your feature request related to a problem? Please describe.
Today a project with multiple
RuntimeIdentifiers
that is published for each of those RuntimeIdentifiers does a significant amount of re-work - that is, work that must be re-computed because some of the inputs of the overall compilation changed. After some research, I think I've identified the causes of most of the sources of rework and have a proposal for what we should do to eliminate a large amount of re-compilation and re-generation done when publishing for multiple platforms.As a broad sense of the potential gains, a single console project that Publishes ReadyToRun for
currently takes ~5s on my machine (2.25s of which is the Csc task across 2 invocations). With the changes proposed below this goes to 3.5s total (~1.25s in the Csc task across a single invocation).
Describe the solution you'd like
In general, for publishing using PublishReadyToRun there are two phases:
Each of these steps has a set of input and output data that, when out of date, triggers build Targets that generate the outputs from the inputs. After analysis of Roslyn compilations for different RIDs, it turns out the primary inputs and outputs share quite a few parameters across platforms. The differences that cause recompilation across the platforms seem to be:
If each of these can be rationalized to be more stable/fixed even in the face of different RIDs, then compiler invocations can be skipped and the outputs reused. In the most extreme case, if these features are disabled or pinned to a stable value, a single compilation can be reused for all platforms. You can see this in action in my sample repo.
A graph of the current state might look like this:
A graph of a proposed state might look like this:
Managing generated files
The generated files above don't seem to have any content that changes between a platform-independent and the various platform-specific builds - the reason they trigger rebuilds is because they implicitly rely on the
IntermediateOutputPath
property. For a RID-specific build this property contains the RID and so any file written to this location is assumed to be RID specific as well.I propose that we have two logical properties for intermediate outputs:
And the above inputs and outputs should be changed to be derived from
IntermediatePlatformAgnosticOutputPath
.Managing expected compilation outputs
Similar to the above, the compilers use the
IntermediateAssembly
andIntermediateRefAssembly
MSBuild Items, which are defined in theMicrosoft.Common.CurrentVersion.targets
as directly usingIntermediateOutputPath
- we should consider making them use one of the two above properties as well.Managing compiler inputs
The only varying property is
Platform
- and as of my investigations right now the default platform is AnyCPU. If this is satisfactory for the majority of managed .NET Applications then we could change RID-specific compilations to not pass Platform and remove this delta.Wrapping up
If we can eliminate these sources of variation in a principled way, and potentially expand the use of the two proposed properties throughout the rest of the SDK, we could potentially eliminate quite a lot of re-work for use cases we care about like multi-RID publishing.