dotnet / source-build

A repository to track efforts to produce a source tarball of the .NET Core SDK and all its components
MIT License
266 stars 132 forks source link

Design and implement Project SDK resolution that does not involve source-edits or SDK mutation. #3781

Open mmitche opened 11 months ago

mmitche commented 11 months ago

Right now, the source-build infra uses a special msbuild sdk resolver https://github.com/dotnet/dotnet/tree/main/eng/tools/tasks/SourceBuild.MSBuildSdkResolver to enable using the correct msbuild SDK version when building a repo. We need behavior that achieves this for unified build. Is this the best option? I dunno.

For instance, when you build arcade initially, you want to use the arcade version that the VMR has, not what Arcade's global.json specifies. When you build any repo after arcade, you want to use the newly built arcade SDK, not what that repo mentions in its global.json either.

The custom SDK resolver feels a little bit hacky? Maybe maybe not.

T-Shirt Size: Large

steveisok commented 11 months ago

The custom SDK resolver feels a little bit hacky? Maybe maybe not.

Are there things we can infer from the VMR build that would enable us to achieve the same result?

mmitche commented 11 months ago

I don't think so? I think the root of the issue is that SDKs are listed in global.json with their versions. The version of the SDK that we need to use in the build will be unknown until we build it.

ViktorHofer commented 10 months ago

The custom SDK resolver feels a little bit hacky? Maybe maybe not.

I agree that it is hacky. It copies binaries into the immutable .NET SDK directory. That means that developer won't be able to build with their globally installed SDK (when versions match). That would be very unfortunate as we worked really hard on supporting this in the individual repositories.

https://github.com/dotnet/dotnet/blob/2017edaa9105ca502bc09f6f9cd0346ae2fecdff/eng/tools/tasks/SourceBuild.MSBuildSdkResolver/SourceBuild.MSBuildSdkResolver.csproj#L9C59-L9C92 and https://github.com/dotnet/dotnet/blob/6934a8c588bf8627c9e60aec98c156c3e00bf203/Directory.Build.props#L54C30-L54C42

mmitche commented 10 months ago

Totally agree. IMO overriding of the versions of msbuild SDKs should be supported as first class concept.

ViktorHofer commented 10 months ago

Would we still need this if we would cloak the global.json file in the individual repos away and unify them into the root global.json one? Or alternatively, if we would update the global.json SDK entries?

mmitche commented 10 months ago

We would. Because once you build arcade, you want to use that arcade SDK downstream. And ideally you do so without updating the root global.json. Same goes with the Windows desktop SDK

mmitche commented 10 months ago

@marcpopMSFT @dsplaisted @rainersigwald Any opinions here? Some ideas:

ViktorHofer commented 10 months ago

Yes the second option definitely sounds better if it would mean that we could express the override versions inside msbuild before the SDK is fetched and used. That would allow us to not depend on env vars.

dsplaisted commented 10 months ago

SDK resolvers don't have access to MSBuild properties, so I think it would still need to be environment variable based. It would be great to have the support to override SDK resolution built in. That will require it to be more complicated to ensure it's performant and appropriately cached. I think @ladipro has been working on performance improvements for the MSBuild SDK resolvers.

ladipro commented 10 months ago

The default .NET SDK resolver already has override support using DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR and DOTNET_MSBUILD_SDK_RESOLVER_SDKS_VER environment variables. They work only in MSBuild.exe, though, because this code is intentionally skipped in dotnet build. I am currently planning to make changes to:

If these two environment variables are sufficient to support your scenario I think it would make sense to add them to the dotnet build code path as part of the work.

ViktorHofer commented 10 months ago

@ladipro does those env vars also work for other msbuild sdks, not just Microsoft.NET.Sdk? In the example of the Microsoft.DotNet.Arcade.Sdk msbuild sdk (which is usually resolved via the nuget sdk resolver) on how this would work? How we would be able to override its version and the directory that it is looked up for?

ladipro commented 10 months ago

@ViktorHofer this would work for everything you put in the directory pointed to by DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR. Basically the first thing MSBuild would do when resolving an SDK is check if this variable is defined. If it is and the directory contains a subdirectory matching the name of the SDK we're resolving, it would be used. DOTNET_MSBUILD_SDK_RESOLVER_SDKS_VER is used to indicate the version the resolved SDK is pretending to be.

So for Microsoft.DotNet.Arcade.Sdk specifically, it would override the NuGet because all this would happen before the NuGet resolver is called.

MichaelSimons commented 10 months ago

@mmitche - It sounds like this ENV would address the issue described in https://github.com/dotnet/source-build/issues/1532.

ViktorHofer commented 10 months ago

That sounds great and would solve this issue.

If these two environment variables are sufficient to support your scenario I think it would make sense to add them to the dotnet build code path as part of the work.

Agreed. Do you have a rough ETA on when this work would be ready?

ladipro commented 10 months ago

Agreed. Do you have a rough ETA on when this work would be ready?

This should be done by mid January. I still have a few days before the holidays but unless this is P1 for you, I will spend it finishing other work.

dsplaisted commented 10 months ago

It sounds like for the DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR environment variable, you might need to copy the rest of the SDKs that are in the dotnet/sdk/<version>/Sdks folder into the same folder as the SDK you're trying to resolve to, otherwise SDKs such as Microsoft.NET.Sdk wouldn't resolve. Also, you wouldn't be able to specify a separate version for each SDK. Would that work?

ViktorHofer commented 10 months ago

Would it be possible to specify multiple directories in that env var? I.e. DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR=<dotnet-sdk>;C:\git\dotnet\artifacts\live-msbuildsdks

ladipro commented 10 months ago

This would be new functionality (in dotnet build) so we can make it behave the way it fits your case. At least some parity with MSBuild.exe would be nice but I don't think is strictly required.

Have you considered using a local NuGet package source to restore the package from a local directory? That would let you use a custom SDK today in a less hacky / more supported way.

ViktorHofer commented 10 months ago

Have you considered using a local NuGet package source to restore the package from a local directory? That would let you use a custom SDK today in a less hacky / more supported way.

Right, using the NuGet SDK resolver would be sufficient for what we need BUT we still need a way to override the msbuild sdk version specified in the global.json, i.e. we need to use the live produced version of the Microsoft.DotNet.Arcade.Sdk package. Is there a way to change that version dynamically?

ladipro commented 10 months ago

Aaah, I think I understand now. For the override to work at NuGet level, you'd need to change at least the contents of NuGet.config, i.e. add a local package source, use a throw-away cache directory to force it to use the local version.. maybe more. But I guess writing to the source tree is undesirable, hence the need for an env-based solution.

mmitche commented 10 months ago

Right, exactly. As is mutating the local dotnet SDK.

mmitche commented 10 months ago

What does seem to be a general awkwardness is that while much of the build logic can be made customizable on the command line, via passing in global properties, global.json (and to a lesser extent, NuGet.config), are static data files that that require source level edits to customize.

steveisok commented 10 months ago

What does seem to be a general awkwardness is that while much of the build logic can be made customizable on the command line, via passing in global properties, global.json (and to a lesser extent, NuGet.config), are static data files that that require source level edits to customize.

With global.json, does that have more to do with being ingrained in the host?

mmitche commented 10 months ago

I think that's the case for some elements of global.json, but I would be surprised if the host knows about the msbuild SDK versions?

ViktorHofer commented 10 months ago

Right. The msbuild sdk section was added to global.json as they had to be stored somewhere outside of msbuild logic that doesn't require an evaluation. The host ignores that section.

ViktorHofer commented 10 months ago

So after more days thinking about this, here's my take:

We currently need to specify which msbuild sdks get live produced and should be extracted to an intermediate folder in the VMR so that subsequent repositories can depend on it. There are a lot of hardcodes involved (msbuild & env vars) and extracting the nuget packages leads to issues like wrong chmod permissions on Unix and is in-efficient. In addition to that, the current custom resolver is copied into the SDK toolset directory and with that mutates the immutable layout.

Ideally the nuget sdk resolver component would provide a hook (i.e. env var) that allows to override a requested msbuild sdk's version. That would work for all cases that we have today inside the VMR and lets us utilize the components/infrastructure that our customers use (common path), instead of building a custom SDK resolver that is hard to get right.

For in-built SDKs like Microsoft.NET.Sdk, switches are already exposed that make it possible to point to a specific local folder. As mentioned, we don't use a live Microsoft.NET.Sdk msbuild sdk in the VMR today, but probably want / should at some point.

ladipro commented 9 months ago

Ideally the nuget sdk resolver component would provide a hook (i.e. env var) that allows to override a requested msbuild sdk's version. That would work for all cases that we have today inside the VMR and lets us utilize the components/infrastructure that our customers use (common path), instead of building a custom SDK resolver that is hard to get right.

Tagging @jeffkl to comment on this.

For in-built SDKs like Microsoft.NET.Sdk, switches are already exposed that make it possible to point to a specific local folder. As mentioned, we don't use a live Microsoft.NET.Sdk msbuild sdk in the VMR today, but probably want / should at some point.

Would you still like to have DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR supported in dotnet build now, or better leave this for when it's actually needed? This would be easy to include in the PR I'm working on, though it's sort of a public contract so ideally we would have thought it through.

jeffkl commented 9 months ago

The NuGet-based MSBuild project SDK resolver gets its configuration from the NuGet.config and the package ID/version from either the project or global.json. As far as I know, overriding the package feeds is only possible by changing the NuGet.config. At least if the MSBuild project SDK versions are in global.json, changing them in the repo at build time would also work. But if an individual project had something like <Project Sdk="SomeCustomSdk/1.0.0"> that would obviously be harder.

If we wanted to completely override the NuGet-based MSBuild project SDK resolver with environment variables, that's just a feature we'd have to add. The package feeds and SDKs are potentially more structured data than an environment variable could handle but we can design that if needed.

SET NUGET_OVERRIDE_PACKAGE_SOURCES=https://myfeed/v3/index.json;https://myotherfeed/v3/index.json
SET NUGET_OVERRIDE_MSBUILD_SDKS=SomeSdk=1.0.0;SomeOtherSdk=2.0.0

Does the build system know ahead of time all of the MSBuild project SDKs and versions and only want those specified in an env var to be made available? Or would the overrides just apply to the ones that are known?

ViktorHofer commented 9 months ago

Would you still like to have DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR supported in dotnet build now, or better leave this for when it's actually needed?

Ideally we would enable that scenario already now so that we wouldn't be blocked when we decide to or realize that we need to use the live Microsoft.NET.Sdk.

Does the build system know ahead of time all of the MSBuild project SDKs and versions and only want those specified in an env var to be made available? Or would the overrides just apply to the ones that are known?

The latter. We need to override msbuild sdk versions specified in global.json / directly in a project via the <Sdk /> tag for the msbuild sdks that we live build in the VMR.

If we wanted to completely override the NuGet-based MSBuild project SDK resolver with environment variables, that's just a feature we'd have to add. The package feeds and SDKs are potentially more structured data than an environment variable could handle but we can design that if needed.

We were thinking of having an env var per msbuild sdk and make the name similar to the output of NuGet's $(GeneratePathProperty). I.e. Microsoft.DotNet.Arcade.Sdk would become NUGET_OVERRIDE_MSBUILD_SDK_MICROSOFT_DOTNET_ARCADE_SDK. Does that make sense?

ViktorHofer commented 9 months ago

@jeffkl the VMR effort would strongly benefit from this feature. Another ask that recently came up is that the nuget sdk resolver should support the RestoreConfigFile property (if property isn't possible then a corresponding environment variable).

Should we file issues in nuget/home for those two asks or is this one sufficient to the track them?

jeffkl commented 9 months ago

@ViktorHofer we have an existing issue about the SDK resolver not being able to respect MSBuild properties: https://github.com/NuGet/Home/issues/7855

Feel free to file a new one about having it also respect versions from environment variables.

ViktorHofer commented 8 months ago

Filed https://github.com/NuGet/Home/issues/13301