dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License

[6.0.10x] NuGet version incoherency between sdk and templating #28107

Closed uweigand closed 2 years ago

uweigand commented 2 years ago

When building .NET SDK 6.0.107 or 6.0.108 on s390x (which uses Mono as the default runtime), every dotnet new immediately fails with:

Could not load type of field 'Microsoft.TemplateEngine.Edge.Installers.NuGet.NuGetApiPackageManager:_nugetLogger' (1) due to: Could not load file or assembly 'NuGet.Common, Version=6.0.0.280, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies.
   at Microsoft.TemplateEngine.Edge.Installers.NuGet.NuGetInstallerFactory.CreateInstaller(IEngineEnvironmentSettings settings, String installPath)
   at Microsoft.TemplateEngine.Edge.BuiltInManagedProvider.GlobalSettingsTemplatePackageProvider..ctor(GlobalSettingsTemplatePackageProviderFactory factory, IEngineEnvironmentSettings settings)
   at Microsoft.TemplateEngine.Edge.BuiltInManagedProvider.GlobalSettingsTemplatePackageProviderFactory.CreateProvider(IEngineEnvironmentSettings settings)
   at Microsoft.TemplateEngine.Edge.Settings.TemplatePackageManager.<EnsureProvidersLoaded>b__21_0(ITemplatePackageProviderFactory f)
   at System.Linq.Enumerable.SelectEnumerableIterator`2[[Microsoft.TemplateEngine.Abstractions.TemplatePackage.ITemplatePackageProviderFactory, Microsoft.TemplateEngine.Abstractions, Version=6.0.108.0, Culture=neutral, PublicKeyToken=adb9793829ddae60],[Microsoft.TemplateEngine.Abstractions.TemplatePackage.ITemplatePackageProvider, Microsoft.TemplateEngine.Abstractions, Version=6.0.108.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]].MoveNext()
   at Microsoft.TemplateEngine.Edge.Settings.TemplatePackageManager.EnsureProvidersLoaded()
   at Microsoft.TemplateEngine.Edge.Settings.TemplatePackageManager.GetTemplatePackagesAsync(Boolean force, CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Edge.Settings.TemplatePackageManager.UpdateTemplateCacheAsync(Boolean needsRebuild, CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Edge.Settings.TemplatePackageManager.GetTemplatesAsync(CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Cli.TemplateResolution.BaseTemplateResolver.GetTemplateGroupsAsync(CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Cli.TemplateResolution.InstantiateTemplateResolver.ResolveTemplatesAsync(INewCommandInput commandInput, String defaultLanguage, CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Cli.TemplateInvocationCoordinator.CoordinateInvocationAsync(INewCommandInput commandInput, CancellationToken cancellationToken)
   at Microsoft.TemplateEngine.Cli.New3Command.EnterTemplateManipulationFlowAsync(INewCommandInput commandInput)
   at Microsoft.TemplateEngine.Cli.New3Command.ExecuteAsync(INewCommandInput commandInput)
   at Microsoft.TemplateEngine.Cli.New3Command.ActualRun(String commandName, ITemplateEngineHost host, ITelemetryLogger telemetryLogger, New3Callbacks callbacks, String[] args, String hivePath)

It turns out this is because the Microsoft.TemplateEngine.Edge.dll provided with SDK 6.0.108 references version 6.0.0 of NuGet.Common.dll, while SDK 6.0.108 actually provides version 6.0.2-rc.5.

https://github.com/dotnet/templating/blob/733440663a189f4c4de64f53bab82fd9382b9a3a/eng/Version.Details.xml#L42

    <Dependency Name="NuGet.Credentials" Version="6.0.0">
      <Uri>https://github.com/nuget/nuget.client</Uri>
      <Sha>078701b97eeef2283c1f4605032b5bcf55a80653</Sha>
    </Dependency>

https://github.com/dotnet/sdk/blob/17ea4a71a1adac805a17c272977139ddc6200bf9/eng/Version.Details.xml#L124

    <Dependency Name="NuGet.Build.Tasks" Version="6.0.2-rc.5">
      <Uri>https://github.com/nuget/nuget.client</Uri>
      <Sha>75551652b352f860ea0b29095b64fa63715dd672</Sha>
    </Dependency>
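
A minimal sketch of how such a mismatch could be detected automatically, assuming two Version.Details.xml excerpts shaped like the ones above. The `<Dependencies>` wrapper element and the helper names `versions_by_source` and `incoherent_sources` are assumptions made for illustration; this is not a tool from either repository.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Excerpts copied from the two files quoted above. The <Dependencies>
# wrapper is an assumption so that each excerpt parses on its own.
TEMPLATING_XML = """
<Dependencies>
  <Dependency Name="NuGet.Credentials" Version="6.0.0">
    <Uri>https://github.com/nuget/nuget.client</Uri>
    <Sha>078701b97eeef2283c1f4605032b5bcf55a80653</Sha>
  </Dependency>
</Dependencies>
"""

SDK_XML = """
<Dependencies>
  <Dependency Name="NuGet.Build.Tasks" Version="6.0.2-rc.5">
    <Uri>https://github.com/nuget/nuget.client</Uri>
    <Sha>75551652b352f860ea0b29095b64fa63715dd672</Sha>
  </Dependency>
</Dependencies>
"""


def versions_by_source(xml_text):
    """Map each source repository Uri to the set of versions pulled from it."""
    result = defaultdict(set)
    # iter() walks the whole tree, so the nesting depth does not matter.
    for dep in ET.fromstring(xml_text.strip()).iter("Dependency"):
        uri = dep.findtext("Uri", default="").strip().lower()
        result[uri].add(dep.get("Version"))
    return result


def incoherent_sources(xml_a, xml_b):
    """Return the source repositories that the two files consume at different versions."""
    a, b = versions_by_source(xml_a), versions_by_source(xml_b)
    return {
        uri: (sorted(a[uri]), sorted(b[uri]))
        for uri in a.keys() & b.keys()
        if a[uri] != b[uri]
    }


if __name__ == "__main__":
    for uri, (va, vb) in incoherent_sources(TEMPLATING_XML, SDK_XML).items():
        print(f"{uri}: templating uses {va}, sdk uses {vb}")
```

Grouping by `Uri` rather than by package name is a deliberate choice here, since the two excerpts name different packages (NuGet.Credentials and NuGet.Build.Tasks) that come from the same nuget.client repository.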

This doesn't seem to be caused by a problem in our builds; the assemblies contained in the official dotnet-sdk-6.0.108-linux-x64.tar.gz tarball have the same issue, where Microsoft.TemplateEngine.Edge.dll has a dependency on NuGet.Common.dll with assembly version 6.0.0.280, but the provided NuGet.Common.dll has assembly version 6.0.2.5.

I assume the reason we're not seeing the symptom on x86 is once again the difference in loader behavior between CoreCLR and Mono that we've run into in the past (e.g. https://github.com/dotnet/runtime/issues/60550): if the type of a struct/class field is defined in some other assembly, CoreCLR will only try to load that assembly when that field is accessed, but Mono will already try to load the assembly when the struct/class type is constructed in the first place.

The use of a 6.0.2-rc version of NuGet was initially introduced here: https://github.com/dotnet/sdk/commit/5d63f71c200563163e0fcff8f4a77c9e60b428f1

Should the templating repository now match that to ensure version coherency across the SDK?

CC @erdembayar @mmitche @omajid

marcpopMSFT commented 2 years ago

@dotnet/templating-engine-maintainers can you take a look?

vlada-shubina commented 2 years ago

@uweigand templating doesn't necessarily require 6.0.0; it is a minimum required version. I believe when sdk is built, all the packages are built for sdk version of NuGet. Is there a way to achieve that for source-build?

Thanks

uweigand commented 2 years ago

> @uweigand templating doesn't necessarily require 6.0.0; it is a minimum required version. I believe when sdk is built, all the packages are built for sdk version of NuGet. Is there a way to achieve that for source-build?

Just to clarify - this is not a source-build issue. In fact, with source-build, everything works fine, because all components are built using the same version numbers everywhere.

The problem occurs with the "normal" package-based build. Here, the templating package is built separately from the sdk package, and those two currently depend on different versions of NuGet. During the sdk build, the final templating nuget package is simply copied into the resulting sdk layout, but not actually rebuilt.

Looking at the recent dotnet-sdk-6.0.109-linux-x64.tar.gz for example, in the sdk/6.0.109 directory we find a Microsoft.TemplateEngine.Edge.dll assembly (which was copied out of the templating package) with the following dependency (as seen via ildasm):

.assembly extern NuGet.Common
{
  .publickeytoken = (31 BF 38 56 AD 36 4E 35 )                         // 1.8V.6N5
  .ver 6:0:0:280
}

However, the NuGet.Common.dll that is present in that same directory shows the following assembly version:

.assembly NuGet.Common
{
[...]
  .ver 6:0:2:5
}

This mismatch leads to the

Could not load file or assembly 'NuGet.Common, Version=6.0.0.280, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies.

error mentioned above.
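
A rough sketch of how the two `.ver` values could be pulled out of ildasm text and compared side by side. It assumes input shaped like the excerpts quoted above; the regular expressions and the helper name `assembly_versions` are illustrative assumptions and have not been validated against full ildasm output.

```python
import re

# Excerpts in the shape of the ildasm output quoted above.
REFERENCING_IL = """
.assembly extern NuGet.Common
{
  .publickeytoken = (31 BF 38 56 AD 36 4E 35 )
  .ver 6:0:0:280
}
"""

PROVIDED_IL = """
.assembly NuGet.Common
{
  .ver 6:0:2:5
}
"""

# Matches ".assembly [extern] <name> { ... }" up to the first closing brace.
_BLOCK = re.compile(
    r"\.assembly\s+(?P<extern>extern\s+)?(?P<name>[\w.]+)\s*\{(?P<body>.*?)\}",
    re.DOTALL,
)
_VER = re.compile(r"\.ver\s+(\d+):(\d+):(\d+):(\d+)")


def assembly_versions(il_text, extern):
    """Return {name: (major, minor, build, revision)}.

    extern=True collects references (.assembly extern),
    extern=False collects definitions (.assembly).
    """
    found = {}
    for block in _BLOCK.finditer(il_text):
        if bool(block.group("extern")) != extern:
            continue
        ver = _VER.search(block.group("body"))
        if ver:
            found[block.group("name")] = tuple(int(p) for p in ver.groups())
    return found


if __name__ == "__main__":
    wanted = assembly_versions(REFERENCING_IL, extern=True)
    provided = assembly_versions(PROVIDED_IL, extern=False)
    for name, version in wanted.items():
        if provided.get(name) != version:
            print(f"{name}: referenced {version}, provided {provided.get(name)}")
```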

vlada-shubina commented 2 years ago

Apologies, I thought it was related to source-build.

It is true that the versions in templating and sdk differ, with templating being behind; this is known and intentional. The sdk is not the only consumer of templating: we also publish it to NuGet, where it cannot depend on a preview version of NuGet. However, looking at the official build, it works just fine, with the higher-version assembly being resolved in the build.

We need more information to investigate the issue.

Thank you

uweigand commented 2 years ago

Thanks for looking into this! It seems I had misinterpreted the situation initially.

Looking into this again, the underlying root cause does indeed appear to be a bug in the Mono loader, which incorrectly causes it to conclude that the available DLL with version 6.0.2.5 is older than the requested version 6.0.0.280. This is because it decides based upon the last components (280 > 5) instead of the earlier third components (0 < 2), which should take precedence.
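
To make the comparison concrete, here is a small sketch of the two orderings. `satisfies` compares the four components left to right and stops at the first difference. `satisfies_buggy` is a hypothetical faulty check written only to reproduce the symptom described; it is not the Mono loader's code. Treating "equal or higher" as acceptable is also an assumption of this sketch.

```python
def parse_version(text):
    """Parse 'major.minor.build.revision' into a tuple of four ints."""
    parts = tuple(int(p) for p in text.split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four components, got {text!r}")
    return parts


def satisfies(available, requested):
    """Lexicographic check: the first differing component decides."""
    # Python compares tuples left to right, stopping at the first difference.
    return parse_version(available) >= parse_version(requested)


def satisfies_buggy(available, requested):
    """Hypothetical faulty check: every component must be >= on its own.

    A higher build number no longer outweighs a lower revision number.
    """
    return all(
        a >= r for a, r in zip(parse_version(available), parse_version(requested))
    )


if __name__ == "__main__":
    available, requested = "6.0.2.5", "6.0.0.280"
    print("lexicographic:", satisfies(available, requested))
    print("per-component:", satisfies_buggy(available, requested))
```

With the versions from this issue, the lexicographic check accepts 6.0.2.5 for a request of 6.0.0.280 because the third component already decides (2 > 0), while the per-component variant rejects it because 5 < 280.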

I've now opened an issue in the runtime repository to track this problem. With that loader version comparison bug fixed, the problem described here does indeed go away. Closing this issue now.