dotnet / sdk-container-builds

Libraries and build tooling to create container images from .NET projects using MSBuild
https://learn.microsoft.com/en-us/dotnet/core/docker/publish-as-container
MIT License

Design multi-manifest (aka multi-architecture or multi-RID) publishing #87

Open baronfel opened 2 years ago

baronfel commented 2 years ago

It's possible for containers to be specified in a 'manifest list' - a set of container image manifests that represent the same application on different underlying OS/hardware configurations.

Fundamentally this would be something like a multitargeted build. For some selection of OS/OSVersions and Architectures we'd need to orchestrate

then, once all of those were done, we'd need to

There are a couple hurdles we'd need to cover:

Other requirements:

Proposal

The gesture we want users to perform for multi-arch manifest publishing is

dotnet publish -t:PublishContainer

i.e. the same gesture they use today. To do this, we should change the implementation of the current PublishContainer Target from its current behavior of 'publish a single image for a single RID' to more of a decision-making target.

PublishContainer should

  • check if the project is currently in a multi-TFM state. If so, error out; we require specifying a single specific TFM for now.
  • check if the project is in a 'multi RID' state - meaning the project does not have a RuntimeIdentifier specified and does have either ContainerRuntimeIdentifiers or RuntimeIdentifiers specified. If so, invoke a new "_BuildMultiImageManifest" target
  • otherwise, the project is in a single-RID, single-TFM state. In this state, invoke a new "_BuildSingleContainer" Target whose behavior is exactly the same as the single-image version of PublishContainer today.
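The three checks above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped implementation: the `_IsMultiTfm` and `_IsMultiRid` helper properties and the error text are hypothetical, and the two inner targets are stubbed with messages under the names proposed above.

```xml
<Project>
  <!-- Sketch only. In the proposal this logic would live in the SDK's own
       PublishContainer target; _IsMultiTfm and _IsMultiRid are hypothetical names. -->
  <Target Name="PublishContainer">
    <PropertyGroup>
      <!-- Multi-TFM: TargetFrameworks is set but no single TargetFramework was chosen. -->
      <_IsMultiTfm Condition="'$(TargetFramework)' == '' and '$(TargetFrameworks)' != ''">true</_IsMultiTfm>
      <!-- Multi-RID: no single RuntimeIdentifier, but a list of RIDs is available. -->
      <_IsMultiRid Condition="'$(RuntimeIdentifier)' == '' and ('$(ContainerRuntimeIdentifiers)' != '' or '$(RuntimeIdentifiers)' != '')">true</_IsMultiRid>
    </PropertyGroup>

    <!-- 1. Multi-TFM is not supported for now; Error stops the build here. -->
    <Error Condition="'$(_IsMultiTfm)' == 'true'"
           Text="Publishing a container requires a single TargetFramework (hypothetical message)." />

    <!-- 2. Multi-RID: build one image per RID, then the manifest list. -->
    <CallTarget Condition="'$(_IsMultiRid)' == 'true'" Targets="_BuildMultiImageManifest" />

    <!-- 3. Otherwise: today's single-image behavior. -->
    <CallTarget Condition="'$(_IsMultiRid)' != 'true'" Targets="_BuildSingleContainer" />
  </Target>

  <!-- Stubs standing in for the real work. -->
  <Target Name="_BuildMultiImageManifest">
    <Message Importance="high" Text="Would publish one image per RID and then create a manifest list." />
  </Target>
  <Target Name="_BuildSingleContainer">
    <Message Importance="high" Text="Would publish a single image for '$(RuntimeIdentifier)'." />
  </Target>
</Project>
```

`CallTarget` keeps the sketch short, but properties created inside the calling target are not reliably visible to the called targets, so a real implementation might prefer `DependsOnTargets` with conditions or an `MSBuild` task call instead.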

An example of this per-scenario break-out is here.

Anticipated hurdles

Defaulted RIDs

The SDK does not have a concept of 'multi-RID' publish, and so today there are several places where it has assumed that the publish gesture implicates the desire for a single RID. The main way this negatively impacts us is the UseCurrentRuntimeIdentifier property, which is inferred as true here and ends up erroneously pinning us to a single RID. Setting it explicitly to false in the project files works around this.
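A minimal sketch of that workaround, assuming a project that lists its target RIDs in `RuntimeIdentifiers` (the two RIDs are placeholders borrowed from the flowchart in this issue):

```xml
<PropertyGroup>
  <!-- Placeholder RIDs; substitute the platforms you actually publish for. -->
  <RuntimeIdentifiers>linux-x64;linux-arm64</RuntimeIdentifiers>
  <!-- Opt out of the inferred default so publish is not pinned to a single RID. -->
  <UseCurrentRuntimeIdentifier>false</UseCurrentRuntimeIdentifier>
</PropertyGroup>
```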

PublishSingleFile

If PublishSingleFile is set and UseCurrentRuntimeIdentifier is not (as mentioned above), there is a mismatch in expectations. For now, for scenarios like our initial set, users may have to condition properties to only light up when the RID-specific build(s) are being done (for example, by adding a Condition="'$(RuntimeIdentifier)' != ''" to several properties).

This is a symptom of the overall Publishing mechanisms of the .NET SDK not being designed for multi-RID publish today. In general, I think many SDK checks could be deferred to the 'inner RID' builds with no loss of intent, but we may have to push for this functionality in phases.
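The conditioning described above might look like the following sketch. Which properties need the guard depends on the project; `PublishSingleFile` is shown only because it is the one named here.

```xml
<!-- Applies only inside a RID-specific (inner) build, where RuntimeIdentifier is set.
     The RID-less outer publish never sees PublishSingleFile=true. -->
<PropertyGroup Condition="'$(RuntimeIdentifier)' != ''">
  <PublishSingleFile>true</PublishSingleFile>
</PropertyGroup>
```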

BuildMultiImageManifest

This target broadly should do two things

Ideally, it would also unify any shared work that may happen during the multiple single-RID publishes into one unit of work that is shared. A specific example of this is

Characteristics of the Manifest List

Visual Aids

flowchart TD
    A[Start Build] --> B[dotnet publish -t:PublishContainer]
    B --> D[Publish for linux-x64]
    D --> E[Package a linux-x64 container]
    B --> F[Publish for linux-arm64]
    F --> G[Package a linux-arm64 container]
    G --> I[Package both containers into an image index]
    E --> I
    I --> H[Push containers and image index to registry]

Work Stages/Milestones

We should have two phases of the work - initial MVP and then productizing.

Initial MVP

In this stage we implement the multi-RID aware publishing feature with external registries as the primary destination - so no pushing to local daemons or exporting to tarballs. This is the most well-known area of development. Once this is implemented, we can hand a preview nupkg over to the internal partner teams that want to test the feature so they can begin validation.

Productizing

In this stage we would implement tarball export and local-Daemon export of manifest lists, as well as full testing and error handling scenarios.

baronfel commented 1 year ago

I discussed this at MVP Summit this year, and feedback from folks was strong - we should do this so that we have parity with other ecosystems.

baronfel commented 1 year ago

There are two separate requirements here:

The former is relatively straightforward today. We essentially want to 'multi-target' like you would with a TFM, but with RIDs:

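A Directory.Build.targets file with a Containerize target that enables multi-RID container generation:

```xml
<Project>
  <PropertyGroup>
    <_RequiredContainerPublishTargets>Publish;PublishContainer</_RequiredContainerPublishTargets>
  </PropertyGroup>

  <Target Name="Containerize">
    <!-- The ItemGroup conditions and the final MSBuild call are assumptions made
         to turn the listed items into a complete, runnable target. -->

    <!-- Multi-TFM projects: one publish per target framework. -->
    <ItemGroup Condition="'$(TargetFrameworks)' != ''">
      <_TFMItems Include="$(TargetFrameworks)" />
      <_SingleContainerPublish Include="$(MSBuildProjectFullPath)" AdditionalProperties="TargetFramework=%(_TFMItems.Identity); VersionSuffix=$([MSBuild]::GetTargetFrameworkVersion('%(_TFMItems.Identity)', 2))" />
    </ItemGroup>

    <!-- Single-TFM projects: publish the project as-is. -->
    <ItemGroup Condition="'$(TargetFrameworks)' == ''">
      <_SingleContainerPublish Include="$(MSBuildProjectFullPath)" />
    </ItemGroup>

    <!-- One publish per RID, each tagged with its RID as the version suffix. -->
    <ItemGroup Condition="'$(RuntimeIdentifiers)' != ''">
      <_RIDItems Include="$(RuntimeIdentifiers)" />
      <_SingleContainerPublish Include="$(MSBuildProjectFullPath)" AdditionalProperties="ContainerRuntimeIdentifier=%(_RIDItems.Identity); RuntimeIdentifier=%(_RIDItems.Identity); VersionSuffix=%(_RIDItems.Identity);" />
    </ItemGroup>

    <!-- Run the publish targets once per collected project/property combination. -->
    <MSBuild Projects="@(_SingleContainerPublish)" Targets="$(_RequiredContainerPublishTargets)" BuildInParallel="true" />
  </Target>
</Project>
```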

Adding this target to a project lets you run dotnet build /t:Containerize and generate architecture- and platform-specific images. We should look at including something like this in the official build targets for the 7.0.400 time frame if at all possible. Adding such a target also enables a related use case: containerizing every project in a solution that can be containerized. This enables workflows like dotnet build /t:Containerize myapp.sln && docker-compose up, where there is a compose.yaml that specifies the relationships between the services in the usual way, just using image: instead of build: stanzas for the project.

A worked example of this can be seen with this diff of the eshoponcontainers project. The docker compose YAML specifically is a useful example.

mu88 commented 11 months ago

I would love seeing this feature, as it's literally the last missing piece from throwing away my Dockerfiles and replacing them with the SDK Container Building Tools.

baronfel commented 5 months ago

Making multi-architecture images is pretty straightforward, as shown above. The next step is creating image manifests using those images. There's an example of this in my sdk-container-demo repository here that builds upon the snippet above by:

Our tooling doesn't yet speak these manifests, but it could learn to.

mu88 commented 5 months ago

That looks very promising @baronfel !

This of course sparks hope 😉 What's missing from adding it to the Container Building Tools?

mu88 commented 5 months ago

So I gave @baronfel's prototype a try yesterday and it works nicely. The icing on the cake (besides being integrated into the SDK Container Building Tools) would be not having to push the arch-specific images to a container registry - I prefer my build process not to rely on external things like a foreign container registry. Instead, it would be cool to build the multi-arch image completely within one's local environment.

Varorbc commented 3 months ago

@baronfel any updates?

baronfel commented 3 months ago

No, not as of yet. This probably won't make it for 8.0.400, but it is our highest-rated request so we do want to get to it!

Varorbc commented 3 months ago

Is there a plan for when to start the work?

Varorbc commented 3 months ago

@baronfel any updates?

baronfel commented 2 months ago

We are looking at taking this work on in the near term.

richlander commented 2 months ago

check if the project is in a 'multi RID' state - meaning the project does not have a RuntimeIdentifier specified and does have either ContainerRuntimeIdentifiers or RuntimeIdentifiers specified.

I'm not a fan of RuntimeIdentifiers since it (at least in theory) affects build. We should be focusing new functionality on publish time properties. This is similar to our old friend SelfContained, but I think we created PublishSelfContained for that.

I think many SDK checks could be deferred to the 'inner RID' builds with no loss of intent,

I like the idea of an inner-RID build, where one RID is set as a simplifying approach.

I know that MAUI had this same desire at one point, but perhaps it was satisfied via the TFMs that were created for them.

mu88 commented 2 months ago

To avoid copy-pasting the same draft MSBuild logic (kudos again to @baronfel 🥳) into several of my .NET apps, I added it to my NuGet package mu88.Shared (see here for the sources).
I've added it to several of my .NET apps targeting both x64 and arm64 and it works nicely 🤓

baronfel commented 2 months ago

I'm not a fan of RuntimeIdentifiers since it (at least in theory) affects build. We should be focusing new functionality on publish time properties. This is similar to our old friend SelfContained, but I think we created PublishSelfContained for that.

Generally agree - that's why for this iteration the first-checked property would be ContainerRuntimeIdentifiers. We could easily drop consideration of RuntimeIdentifiers, but I'd like to encourage people to at least have that property set since that's the property that Restore actually keys off of, and as much as possible I'd like to avoid breaking some of the implicit assumptions that dotnet build && dotnet publish --no-restore --no-build -r <something> promises.

I like the idea of an inner-RID build, where one RID is set as a simplifying approach.

I know that MAUI had this same desire at one point, but perhaps it was satisfied via the TFMs that were created for them.

cc @jonathanpeppers for comment, but from my digging I think MAUI still broadly uses RIDs in its publishing workflows. Examples here for calculating which then turns into a set of MSBuild Projects which are then reused in at least AOT publishing but possibly elsewhere as well.

In addition @jonathanpeppers has requested better SDK-level support for managing RIDs in https://github.com/dotnet/sdk/issues/37830.

richlander commented 2 months ago

I forgot about 'restore'. That said, I think it is still problematic. Let's talk this one through. I think we last discussed this one at length about 5 years ago.

I would also like to think this through to a broader set of scenarios. The key one I have in mind is native AOT, which requires an additional toolset and is easiest with build containers. The buildx behavior enables it quite well.

I don't think we need to build the perfect solution from the get-go, but we should ensure we know where we are heading.

jonathanpeppers commented 2 months ago

Android apps unfortunately have four RIDs (arm, arm64, x86, x64), and Mac apps have two (x64, arm64). iOS debug builds could have two if you build for simulator and device.

What we do currently is use $(RuntimeIdentifiers) with an s and do an "inner build" in a similar fashion as $(TargetFrameworks) with an s and gather the outputs into the app package. This runs the trimmer per architecture, and the AOT compiler per architecture. Right now, we have this logic in each platform's workload, as there wasn't anything built into the .NET SDK for this. I think this could be improved, but what we have has been working ok for customers.

baronfel commented 2 months ago

Thanks @jonathanpeppers - that matches with the plan here. I do agree that we need some better concept built into the SDK (and maybe even NuGet for per-TFM-per-RID targets?!). How did you all deal with some of the hurdles for the publish properties that assume a default RID when publishing at a RID-less level? From the issue description I mean things like:

Defaulted RIDs

The SDK does not have a concept of 'multi-RID' publish, and so today there are several places where it has assumed that the publish gesture implicates the desire for a single RID. The main way this negatively impacts us is the UseCurrentRuntimeIdentifier property, which is inferred as true here and ends up erroneously pinning us to a single RID. Setting it explicitly to false in the project files works around this.

PublishSingleFile

If PublishSingleFile is set and UseCurrentRuntimeIdentifier is not (as mentioned above), there is a mismatch in expectations. For now, for scenarios like our initial set, users may have to condition properties to only light up when the RID-specific build(s) are being done (for example, by adding a Condition="'$(RuntimeIdentifier)' != ''" to several properties).

This is a symptom of the overall Publishing mechanisms of the .NET SDK not being designed for multi-RID publish today. In general, I think many SDK checks could be deferred to the 'inner RID' builds with no loss of intent, but we may have to push for this functionality in phases.

when doing dotnet publish ... with no specific RID requested.

jonathanpeppers commented 2 months ago

Some of the behavior mentioned above, we had to turn off. Android opts out of $(UseCurrentRuntimeIdentifier), for example:

The approach we took for Android was to set $(RuntimeIdentifiers) to all four RIDs by default when $(RuntimeIdentifier) is omitted. A customer might never set a RID or have to know about them. To make Debug builds reasonable, we detect the RID based on the attached device (or the selected device in VS/C# Dev Kit). This way, Debug mode can build just one RID instead of four.
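As a rough illustration of that defaulting pattern (a sketch only, not the actual Android workload source; the extra check on `$(RuntimeIdentifiers)` is an assumption so that explicit user settings win):

```xml
<!-- Sketch: fall back to all four Android RIDs when the project sets neither
     RuntimeIdentifier nor RuntimeIdentifiers. -->
<PropertyGroup Condition="'$(RuntimeIdentifier)' == '' and '$(RuntimeIdentifiers)' == ''">
  <RuntimeIdentifiers>android-arm;android-arm64;android-x86;android-x64</RuntimeIdentifiers>
</PropertyGroup>
```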

Since Mac is the only other platform with multiple RIDs (two) and they are not cross-compiling, $(UseCurrentRuntimeIdentifier) works for that case. They also made Release builds default to both architectures.