Open alhad-deshpande opened 2 months ago
To add some additional context, just before the hang we see this error:
Could not load signature of Microsoft.CodeAnalysis.CSharp.CSharpProjectFileLoader:CreateProjectFile due to: Could not load file or assembly 'Microsoft.Build, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies.
The issue isn't specific to ppc64le / s390 / cross-compilation. It reproduces also with building runtime tests with an x64 mono runtime on x64.
cc @lambdageek
We've done a bit more investigation on where this error is hit. We're running
dotnet
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/build/../tools/net9.0/Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll
-msbuild:/root/alhad/runtime/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate.csproj
-script:/root/alhad/runtime/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate/deltascript.json
-p:Configuration=Debug
which ends up calling the PrepareMSBuildProject
routine here https://github.com/dotnet/hotreload-utils/blob/c667b519c4bcdf121106fac021a53d98120e825a/src/Microsoft.DotNet.HotReload.Utils.Generator/BaselineProject.cs#L23
which calls msw.OpenProjectAsync
.
This creates a new "BuildHost" process running
dotnet --roll-forward LatestMajor
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/tools/net9.0/BuildHost-netcore/Microsoft.CodeAnalysis.Workspaces.MSBuild.BuildHost.dll
--property Configuration=Debug
and uses the Microsoft.CodeAnalysis.MSBuild.Rpc.RpcServer:ProcessRequestAsync
method to invoke the Microsoft.CodeAnalysis.Workspaces.MSBuild.BuildHost.BuildHost:LoadProjectFileAsync
routine there, which tries to create an instance of Microsoft.CodeAnalysis.CSharp.CSharpProjectFileLoader:CreateProjectFile
here: https://github.com/dotnet/roslyn/blob/fa7ece1ca25ef529db616ed6c839af9f027fa216/src/Workspaces/Core/MSBuild.BuildHost/BuildHost.cs#L180
This ends up in the Mono JIT attempting to load and initialize that type, which fails with:
Could not load signature of Microsoft.CodeAnalysis.CSharp.CSharpProjectFileLoader:CreateProjectFile due to:
Could not load file or assembly 'Microsoft.Build, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies.
causing
An exception of type System.TypeLoadException was thrown:
VTable setup of type Microsoft.CodeAnalysis.CSharp.CSharpProjectFileLoader failed
The Mono JIT previously had attempted to search for Microsoft.Build
unsuccessfully in these paths:
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/tools/net9.0/BuildHost-netcore/Microsoft.Build.dll
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/tools/net9.0/BuildHost-netcore/Microsoft.Build.exe
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/tools/net9.0/BuildHost-netcore/Microsoft.Build/Microsoft.Build.dll
/root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/tools/net9.0/BuildHost-netcore/Microsoft.Build/Microsoft.Build.exe
Now, there's still a number of questions:
First of all, why is Microsoft.Build.dll
not packaged with the microsoft.dotnet.hotreload.utils.generator.buildtool
nupkg, even though a number of other Microsoft.Build assemblies are? If I'm understanding this correctly, that could be an effect of the ExcludeAssets="runtime"
attribute here: https://github.com/dotnet/hotreload-utils/blob/c667b519c4bcdf121106fac021a53d98120e825a/src/Microsoft.DotNet.HotReload.Utils.Generator/Microsoft.DotNet.HotReload.Utils.Generator.csproj#L12 I'm not sure why this was added?
Second, if Microsoft.Build.dll
is not found there, should it be taken from the host .NET SDK? The file is available there, but the Mono JIT doesn't even appear to look there. Not sure if this is as intended, or if there are still some Mono loader search order issues.
Finally, why does this error only show up in Mono? Either it is because of the above-mentioned search order, or else it might be another instance of the problem where Mono is more eager in requiring to load types during JIT that may not actually be referenced later (where CoreCLR would defer the type loading to a later stage, which might not ever happen in this case).
Any suggestion on how to proceed here would be appreciated!
CC @giritrivedi @directhex @akoeplinger @lambdageek
@YuliiaKovalova -- would you be able to take a look at this issue and advise?
cc: @rainersigwald
First of all, why is
Microsoft.Build.dll
not packaged with themicrosoft.dotnet.hotreload.utils.generator.buildtool
nupkg
MSBuild assemblies should basically always come from the .NET SDK, not be repackaged by tools--MSBuild logic tends to make a bunch of assumptions about where .props/targets/etc files are relative to the loaded MSBuild DLLs, and it's also best to load the exact SDK that would be involved in dotnet build
.
Second, if
Microsoft.Build.dll
is not found there, should it be taken from the host .NET SDK?
So yes. That's what MSBuildLocator does: it adds some AssemblyLoad
event hooks to help find the right files in the SDK.
Finally, why does this error only show up in Mono? Either it is because of the above-mentioned search order, or else it might be another instance of the problem where Mono is more eager in requiring to load types during JIT that may not actually be referenced later (where CoreCLR would defer the type loading to a later stage, which might not ever happen in this case).
The latter behavior sounds very interesting. On .NET Framework and coreclr, the load is triggered by JIT compilation of the first method containing a type from Microsoft.Build.dll
. Locator thus requires that you call Register()
from a method that is JITed before calling the first such method.
If Mono is more aggressive about trying to load the type (say when the class containing such a method is first seen, rather than the method itself), that might explain the problem
Second, if
Microsoft.Build.dll
is not found there, should it be taken from the host .NET SDK?So yes. That's what MSBuildLocator does: it adds some
AssemblyLoad
event hooks to help find the right files in the SDK.Finally, why does this error only show up in Mono? Either it is because of the above-mentioned search order, or else it might be another instance of the problem where Mono is more eager in requiring to load types during JIT that may not actually be referenced later (where CoreCLR would defer the type loading to a later stage, which might not ever happen in this case).
The latter behavior sounds very interesting. On .NET Framework and coreclr, the load is triggered by JIT compilation of the first method containing a type from
Microsoft.Build.dll
. Locator thus requires that you callRegister()
from a method that is JITed before calling the first such method.If Mono is more aggressive about trying to load the type (say when the class containing such a method is first seen, rather than the method itself), that might explain the problem
Thanks for the details! This does indeed look very similar to problems we've had in the past.
Here's a PR from a while back describing one of these problems, where some assembly was also supposed to the found via an event hook, but when using Mono the JIT already was trying (and failing) to load a type defined in that assembly before that hook was even installed: https://github.com/dotnet/runtime/issues/60550#issuecomment-946709185
In this case, the JIT also aborted when building a routine that merely refers to the class that contains an element of that type.
Do you think there is a way to rewrite the msbuild code so it avoids accessing even the containing class type before the handler is set up? Or can you suggest any other way to resolve this issue?
No, I don't think there's a way to rewrite MSBuild or MSBuildLocator here, unfortunately.
What we need is:
It might be possible to change microsoft.dotnet.hotreload.utils.generator.buildtool
to call Locator "even earlier" (or rather call MSBuild through more class indirection) so that the call to Register()
happens before the runtime tries to load MSBuild types? If there's a good pattern for that we could certainly document it for Locator users.
Hmm, looks like there is another twist - we're dealing with two distinct dotnet processes here. The main buildtool
process runs:
public async Task<BaselineArtifacts> SetupBaseline (EnC.EditAndContinueCapabilities capabilities, CancellationToken ct = default) {
InitMSBuild();
BaselineProject baselineProject = await Microsoft.DotNet.HotReload.Utils.Generator.BaselineProject.Make (config, capabilities, ct);
where the first line calls
private static void InitMSBuild ()
{
Microsoft.Build.Locator.MSBuildLocator.RegisterDefaults();
}
and the second line calls
static async Task<(EnC.ChangeMakerService, Solution, ProjectId)> PrepareMSBuildProject (Config config, EnC.EditAndContinueCapabilities capabilities, CancellationToken ct = default)
{
Microsoft.CodeAnalysis.MSBuild.MSBuildWorkspace msw;
[...]
msw = Microsoft.CodeAnalysis.MSBuild.MSBuildWorkspace.Create(props);
[...]
var project = await msw.OpenProjectAsync (config.ProjectPath, logger, null, ct);
and that last line starts a new process (as described in my comment above).
So we're calling the MSBuildLocator
in the old process, but it is actually the new process where the JIT aborts due to the missing assembly. How is this supposed to work? A load hook set in the old process wouldn't be available in the new process, right? I see the MSBuildLocator
also sets some environment variables, but I'm not sure I understand properly how these are intended to be used by child processes - should the child install its own load hook somewhere?
ah, that is very interesting-- @jasonmalinowski does this ring any bells for you?
Description
Dotnet build hangs while building libs.tests natively using cross built dotnet sdk.
Brief Analysis and probable root cause:
Complete mono trace log for Microsoft.DotNet.HotReload.Utils.Generator namespace: [root@devfirst-dotnet-rhel8-8-1 logs]# export MONO_ENV_OPTIONS="--trace=N:Microsoft.DotNet.HotReload.Utils.Generator" [root@devfirst-dotnet-rhel8-8-1 logs]# /root/alhad/runtime/dotnet-sdk-ppc64le/.dotnet/dotnet /root/.nuget/packages/microsoft.dotnet.hotreload.utils.generator.buildtool/9.0.0-alpha.0.24168.2/build/../tools/net9.0/Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll -msbuild:/root/alhad/runtime/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate.csproj -script:/root/alhad/runtime/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.FirstCallAfterUpdate/deltascript.json -p:Configuration=Debug [0x7fff8a9a5160: 0.00000 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Config:Builder ()() [0x7fff8a9a5160: 0.00019 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Config:Builder ()([.ConfigBuilder:0x7fff7c402bc8] [0x7fff8a9a5160: 0.04096 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig:.ctor (Microsoft.DotNet.HotReload.Utils.Generator.Config/ConfigBuilder)(this:0x7fff7c405820[Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll], [.ConfigBuilder:0x7fff7c402bc8]) [0x7fff8a9a5160: 0.04116 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Config:.ctor (Microsoft.DotNet.HotReload.Utils.Generator.Config/ConfigBuilder)(this:0x7fff7c405820[Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll], [.ConfigBuilder:0x7fff7c402bc8]) [0x7fff8a9a5160: 0.04145 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Config:.ctor (Microsoft.DotNet.HotReload.Utils.Generator.Config/ConfigBuilder)( [0x7fff8a9a5160: 0.04146 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig:.ctor (Microsoft.DotNet.HotReload.Utils.Generator.Config/ConfigBuilder)( [0x7fff8a9a5160: 0.04437 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:Make (Microsoft.DotNet.HotReload.Utils.Generator.Config)([Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig:0x7fff7c405820]) [0x7fff8a9a5160: 0.04444 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Config:get_Live ()(this:0x7fff7c405820[Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll]) [0x7fff8a9a5160: 0.04445 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Config:get_Live ()(result=0 [0x7fff8a9a5160: 0.04500 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:Make (Microsoft.DotNet.HotReload.Utils.Generator.Config)([Microsoft.DotNet.HotReload.Utils.Generator.Runners.ScriptRunner:0x7fff7c405e60] [0x7fff8a9a5160: 0.04516 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:Run (System.Threading.CancellationToken)(this:0x7fff7c405e60[Microsoft.DotNet.HotReload.Utils.Generator.Runners.ScriptRunner Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll], [00,00,00,00,00,00,00,00,]) [0x7fff8a9a5160: 0.23871 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:Run (System.Threading.CancellationToken)([.AsyncStateMachineBox`1:0x7fff7c424560] [0x7fff738cf140: 0.26416 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:PrepareCapabilities ()(this:0x7fff7c405e60[Microsoft.DotNet.HotReload.Utils.Generator.Runners.ScriptRunner Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll]) [0x7fff738cf140: 0.26422 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Config:get_EditAndContinueCapabilities ()(this:0x7fff7c405820[Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll])
[0x7fff738cf140: 0.26446 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:.cctor ()() [0x7fff738cf140: 0.28896 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:.cctor ()( [0x7fff738cf140: 0.28897 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Parse (System.Collections.Generic.IEnumerable)(XX)
[0x7fff738cf140: 0.28918 2] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Tokenize (System.Collections.Generic.IEnumerable)((unknown return type 15)
[0x7fff738cf140: 0.29157 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:DefaultCapabilities ()(this:0x7fff7c405e60[Microsoft.DotNet.HotReload.Utils.Generator.Runners.ScriptRunner Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll])
[0x7fff738cf140: 0.29158 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:DefaultCapabilities ()(result=1023
[0x7fff738cf140: 0.29162 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Config:get_NoWarnUnknownCapabilities ()(this:0x7fff7c405820[Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll])
[0x7fff738cf140: 0.29163 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Config:get_NoWarnUnknownCapabilities ()(result=0
[0x7fff738cf140: 0.29180 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:PrepareCapabilities ()(result=1023
[0x7fff738cf140: 0.29196 0] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:SetupBaseline (Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)(this:0x7fff7c405e60[Microsoft.DotNet.HotReload.Utils.Generator.Runners.ScriptRunner Microsoft.DotNet.HotReload.Utils.Generator.BuildTool.dll], 1023, [00,00,00,00,00,00,00,00,])
[0x7fff738cf140: 0.29293 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:InitMSBuild ()()
[0x7fff738cf140: 1.12321 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:InitMSBuild ()(
[0x7fff738cf140: 1.12342 1] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.BaselineProject:Make (Microsoft.DotNet.HotReload.Utils.Generator.Config,Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)([Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig:0x7fff7c405820], 1023, [00,00,00,00,00,00,00,00,])
[0x7fff738cf140: 1.12455 2] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.BaselineProject:PrepareMSBuildProject (Microsoft.DotNet.HotReload.Utils.Generator.Config,Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)([Microsoft.DotNet.HotReload.Utils.Generator.MsbuildConfig:0x7fff7c405820], 1023, [00,00,00,00,00,00,00,00,])
[0x7fff738cf140:] EXCEPTION handling: System.IO.FileNotFoundException:
[0x7fff738cf140: 2.18797 2] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.BaselineProject:PrepareMSBuildProject (Microsoft.DotNet.HotReload.Utils.Generator.Config,Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)((unknown return type 15)
[0x7fff738cf140: 2.18885 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.BaselineProject:Make (Microsoft.DotNet.HotReload.Utils.Generator.Config,Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)((unknown return type 15)
[0x7fff738cf140: 2.18978 0] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.Runner:SetupBaseline (Microsoft.DotNet.HotReload.Utils.Generator.EnC.EditAndContinueCapabilities,System.Threading.CancellationToken)((unknown return type 15)
[0x7fff738cf140:] EXCEPTION handling: Newtonsoft.Json.JsonReaderException: Unexpected character encountered while parsing value: C. Path '', line 0, position 0.
[0x7fff738cf140:] EXCEPTION handling: Newtonsoft.Json.JsonReaderException: Unexpected character encountered while parsing value: C. Path '', line 0, position 0.
1<string>)(XX) [0x7fff738cf140: 0.28912 2] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Tokenize (System.Collections.Generic.IEnumerable
11<string>)((unknown return type 15) [0x7fff738cf140: 0.28940 2] ENTER:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Parse (System.Collections.Generic.IEnumerable
1<Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser/Token>)(XX) [0x7fff738cf140: 0.29011 2] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Parse (System.Collections.Generic.IEnumerable1<Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser/Token>)((unknown return type 15) [0x7fff738cf140: 0.29012 1] LEAVE:c Microsoft.DotNet.HotReload.Utils.Generator.EditAndContinueCapabilitiesParser:Parse (System.Collections.Generic.IEnumerable
1Reproduction Steps
Expected behavior
The dotnet native build should not hang
Actual behavior
The dotnet native build hangs.
Regression?
No response
Known Workarounds
No response
Configuration
No response
Other information
No response