dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.24k stars 1.35k forks source link

Native image cannot be loaded multiple times #750

Closed brthor closed 8 years ago

brthor commented 8 years ago

This error based on offline discussions:

C:\Users\brthor\code\cli\build\package\Microsoft.DotNet.Cli.Layout.targets(64,5): error MSB4062: The "Microsoft.Build.Tasks.Copy" task could not be loaded from the assembly Microsoft.Build.Tasks.Core, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a. Native image cannot be loaded multiple times Confirm that the <UsingTask> declaration is correct, that the assembly and all its dependencies are available, and that the task contains a public class that implements Microsoft.Build.Framework.ITask. [C:\Users\brthor\code\cli\build.proj]

Creating a standalone repro of this is proving challenging but I have shared a repro offline with @rainersigwald and @cdmihai

Using this issue for tracking workarounds we are making for this issue.

tmeschter commented 8 years ago

@rainersigwald @brthor @cdmihai I've been asked to help out with these MSBuild issues, and I've already picked up #658. This issue looks to be related (at least to the extent that it has to do with assembly loading).

Would it be possible for one of you to share the repro with me?

eerhardt commented 8 years ago

To repro:

tmeschter commented 8 years ago

@eerhardt Thanks!

tmeschter commented 8 years ago

@eerhardt I'm not able to reproduce the issue using the provided steps. With or without the <Message> element in build.proj, the build succeeds.

brthor commented 8 years ago

@tmeschter If the entire MSBuildWorkaroundTarget and references are removed do you see the behavior?

eerhardt commented 8 years ago

Also comment out the Exec call on line 79:

https://github.com/dotnet/cli/blob/feature/msbuild/build.proj#L79

eerhardt commented 8 years ago

The issue seems to occur when the root project doesn't use any built-in tasks (like Message or Exec), and it imports a separate .targets file that uses a built-in task (like Message or Exec).

tmeschter commented 8 years ago

@brthor @eerhardt Still can't repro this, even after removing the MSBuildWorkaroundTarget and all references to it, and removing the Exec call on line 79. Of course, without the Exec call the build fails with an unrelated problem...

brthor commented 8 years ago

Moving this to build/Microsoft.DotNet.Cli.Prepare.targets instead of deleting it might help get the repro.

https://github.com/dotnet/cli/blob/feature/msbuild/build.proj#L75-L80

cdmihai commented 8 years ago

@brthor Can you please make the repro inducing edits in a separate branch and link it here? Would make it easier for multiple people to repro the issue by just checking out a branch and building.

brthor commented 8 years ago

@cdmihai @tmeschter

Can you please make the repro inducing edits in a separate branch and link it here? Would make it easier for multiple people to repro the issue by just checking out a branch and building.

That was the purpose of the offline shared repro. Is that repro not usable for debugging?

tmeschter commented 8 years ago

@brthor Unfortunately VS crashes when trying to load symbols for the copy of Microsoft.Build.dll used by that offline repro. And when I drop in locally-built MSBuild binaries I end up with assembly version mismatches. So at the moment I'm trying to construct a simpler repro with newer bits.

eerhardt commented 8 years ago

It sounded like @anurse had a repro of this bug as well. Maybe he has an separate branch that you can checkout.

analogrelay commented 8 years ago

I don't have a very convenient repro yet, it's tied into our build system which isn't a great standalone example yet.

brthor commented 8 years ago

@tmeschter I've pushed a branch exhibiting the repro.

git checkout brthor/repro750
build.cmd /t:BuildDotnetCliBuildFramework
build.cmd /t:InitializeCommonProps

I had some trouble getting a solid repro, a target in a file that only called built in tasks didn't seem to trigger it. A target calling a custom task followed by the built in task when the target isn't in the top level project did trigger it. I don't know if that info will be useful :smile:

tmeschter commented 8 years ago

@brthor Great; that does reproduce the issue.

The exception occurs while running this command: E:\Projects\cli\.dotnet_stage0\x64\dotnet.exe E:\Projects\cli\.dotnet_stage0\x64\sdk\1.0.0-featmsbuild-003411\MSBuild.exe /p:Architecture=x64 /t:InitializeCommonProps build.proj

However, if I replace the Microsoft.Build*.dll binaries under 1.0.0-featmsbuild-003411 with locally-built bits and run that command again it works.

I notice that the original Microsoft.Build.* binaries under E:\Projects\cli.dotnet_stage0\x64\sdk\1.0.0-featmsbuild-003411 are about twice as large as my locally-built bits. Are these actually NGEN'd bits (despite the lack of a .ni.dll suffix)? If so, how were they produced?

Note that I know almost nothing about the dotnet CLI and its use, so I'll need a certain amount of hand-holding to understand how this is all supposed to work.

brthor commented 8 years ago

Indeed @tmeschter We do crossgen these bits.

They're produced here https://github.com/dotnet/cli/blob/feature/msbuild/build/Microsoft.DotNet.Cli.Compile.targets#L193

Feel free to ping me with any questions on this :smile:

tmeschter commented 8 years ago

@brthor Does the CrossGen task produce .pdbs for these bits? Actually, any suggestions you have on how to debug through these bits would be helpful.

brthor commented 8 years ago

/cc @eerhardt

@tmeschter In the past I haven't needed to do anything special for crossgen'd binaries besides pointing at the pdb in the VS debugger.

Do you have the Pdbs for the binary which is already crossgend? It comes from the msbuild nuget package, but I'm not sure where symbols are published.

brthor commented 8 years ago

/cc @gkhanna79 @johnchen0 Can you provide some guidance here?

The error we're attempting to debug is Native image cannot be loaded multiple times

gkhanna79 commented 8 years ago

Can someone summarize the problem :) ?

brthor commented 8 years ago

@gkhanna79

The problem appears to be that ordering targets which invoke custom/builtin tasks in msbuild in a certain way produces an error that can only be reproduced when the MSBuild binaries are crossgend.

The error we're seeing is posted above but the interesting part is: Native image cannot be loaded multiple times

@tmeschter Is working to debug the crossgend MSBuild bits and looking for any guidance on how to do the debugging there.

tmeschter commented 8 years ago

@brthor This issue doesn't repro if the MSBuild binaries are not crossgend--is that an option for you?

JohnChen0 commented 8 years ago

I did some debugging and found what happened: MSBuild first loaded Microsoft.Build.Utilities.Core.dll using AssemblyLoadContext.LoadFromAssemblyPath. It then tried to load a type from Microsoft.Build.Tasks.Core.dll, which depends on Microsoft.Build.Utilities.Core.dll. While resolving this dependency, the runtime decides to reload Microsoft.Build.Utilities.Core.dll, probably because the previous load was in the wrong context. When Microsoft.Build.Utilities.Core.dll had been compiled by CrossGen, the second load was disallowed, since we don't allow multiple loads of a DLL containing native code, thus generating an error.

@gkhanna79: How is AssemblyLoadContext.LoadFromAssemblyPath expected to interact with other types of assembly loads?

brthor commented 8 years ago

@tmeschter This would reduce the performance of dotnet build. Hopefully we could instead drive the bugfix in crossgen or msbuild (wherever it exists). Disabling it temporarily may be an option.

Thanks for the great info @JohnChen0

tmeschter commented 8 years ago

Based on @JohnChen0's analysis, this seems like an issue with CrossGen or the runtime. At least, it isn't clear to me what MSBuild could or should be doing differently

JohnChen0 commented 8 years ago

We will investigate what can be done from the runtime to prevent this issue, but this will take some time, and we don't yet know what are the possibilities. In the mean time, there are a couple of ways to unblock CLI build:

tmeschter commented 8 years ago

@JohnChen0 What if MSBuild simply used the default AssemblyLoadContext rather than creating a derived type? It seems like that would keep all of the loads in the same context, and effectively avoid the issue.

@AndyGerlicher @rainersigwald Do you know why we're creating our own AssemblyLoadContext rather than simply using the default? We could handle loading dependencies by hooking the AssemblyLoadContext.Resolve event.

JohnChen0 commented 8 years ago

The default AssemblyLoadContext is an abstract class, and thus can't be used directly. If MSBuild only loads assemblies from TPA, then it can rely on runtime's own assembly probing mechanism, and doesn't need to use AssemblyLoadContext. However, I suspect MSBuild needs to load assemblies from arbitrary paths specified by build projects, in which case it needs a derived AssemblyLoadContext. The recommendation is for its implementation of AssemblyLoadContext to distinguish between assemblies on the TPA and those not on the TPA, and only explicitly load assemblies not on the TPA.

tmeschter commented 8 years ago

@JohnChen0 I meant could/should we just use the object returned by AssemblyLoadContext.Default.

And yes, MSBuild needs to load assemblies from arbitrary paths, as well as locate those assemblies' dependencies.

Also, could you define "TPA"?

gkhanna79 commented 8 years ago

Modify MSBuild's implementation of AssemblyLoadContext.Resolve

Not Resolve event but Load override of custom AssemblyLoadContext should return null to allow the TPA (DefaultContext) binder to perform the bind.

@tmeschter Think of TPA (Trusted Platform Assembly) list as the set of binaries that constitute the app (and thus, are referred to by their respective package entries in app.deps.json).

If all the assemblies that need to be loaded are known ahead of time, then ensure that they are part of the app and listed in app.deps.json. This will make them always be loaded from a single (Default) loadcontext.

If you do not know their path ahead of time BUT know that the ones that need to be loaded from a single context, then in your custom AssemblyLoadContext's Load override, check for the assembly and load it in the default Context using AssemblyLoadContext.Default.LoadFromAssemblyPath.

tmeschter commented 8 years ago

@gkhanna79 Thanks; that is good to know.

Let me ask a more general question: under what circumstances would it make sense to derive from AssemblyLoadContext rather than simply use AssemblyLoadContext.Default and hook the Resolve event to help find assemblies?

gkhanna79 commented 8 years ago

AssemblyLoadContext is about providing isolation to load same or different versions of a given assembly at runtime. Typically, developers do not have to worry about this. However, when dealing with scenarios like Plugin infrasturcture, the plugin infra needs to think about such issues and use AssemblyLoadContext to provide isolation for such loads.

Resolving event is a mechanism to enable a given load context have a fallback mechanism to point to assembly if one is not already loaded within the load context.

The specific issue you are running into is about attempting to load a given SharedFX assembly twice - this should not be required to be done since SharedFX is already part of the DefaultContext. Thus, you should consider returning null from the Load override.

tmeschter commented 8 years ago

@gkhanna79 Can you define "SharedFX"? Is it the same as "TPA"?

How do I know if a given assembly is part of the SharedFX and/or TPA?

gkhanna79 commented 8 years ago

Yes, it is the same as TPA.

You can determine the path to the SharedFX and check if the assembly exists there or not. If not, then this is an app specific assembly and the Load override should know how to load it.

tmeschter commented 8 years ago

Is there an API to get the path to the SharedFX?

eerhardt commented 8 years ago

Is there an API to get the path to the SharedFX?

The way we get it in the CLI is using this code:

https://github.com/dotnet/cli/blob/feature/msbuild/src/Microsoft.DotNet.Cli.Utils/Muxer.cs#L45

        public static string GetDataFromAppDomain(string propertyName)
        {
            var appDomainType = typeof(object).GetTypeInfo().Assembly?.GetType("System.AppDomain");
            var currentDomain = appDomainType?.GetProperty("CurrentDomain")?.GetValue(null);
            var deps = appDomainType?.GetMethod("GetData")?.Invoke(currentDomain, new[] { propertyName });

            return deps as string;
        }

        private bool TryResolveMuxerFromParentDirectories()
        {
            var fxDepsFile = GetDataFromAppDomain("FX_DEPS_FILE");
            if (string.IsNullOrEmpty(fxDepsFile))
            {
                return false;
            }

            var fxDir = new FileInfo(fxDepsFile).Directory
brthor commented 8 years ago

Worth noting that @eerhardt 's posted solution requires you to be using the dotnet.exe host, and not corerun.

eerhardt commented 8 years ago

and not corerun.

I didn't think corerun supports the Shared FX at all. So yes, it can't can't be used with the code above.

tmeschter commented 8 years ago

My take away from all this is that there's basically no way to correctly implement an AssemblyLoadContext without some rather deep knowledge of the runtime environment.

If an AssemblyLoadContext should always/generally avoid loading an assembly that is part of the "SharedFX" and instead let the default context handle it, then that should be the default behavior. My AssemblyLoadContext shouldn't even be asked about it unless it opts out of that default.

Further, since the application gets to decide whether or not to crossgen assemblies, my library shouldn't have to think about them at all when loading an assembly.

At this point I think I'm probably going to remove MSBuild's AssemblyLoadContext in favor of using whatever is returned by AssemblyLoadContext.Default and hooking its Resolve method.

brthor commented 8 years ago

Is there any way general way to probe the TPA @gkhanna79 @JohnChen0 ?

Only looking in the sharedfx directory seems like it could be fragile in the long term, also doesn't the host also place the assemblies in the app's deps.json on TPA?

gkhanna79 commented 8 years ago

Do you want to chat before you make the change @tmeschter? It will be good to have context on why MSbuild had one in the first place and figure out the right next steps.

Also, you may run into https://github.com/dotnet/coreclr/issues/5837 with Resolving event, so lets chat on how you can achieve your intended goal.

gkhanna79 commented 8 years ago

Is there any way general way to probe the TPA @gkhanna79 @JohnChen0 ?

Unless TRUSTED_PLATFORM_ASSEMBLIES property is published (like FX_DEPS_FILE), then no.

tmeschter commented 8 years ago

@gkhanna79 MSBuild more or less copied the one from Roslyn: https://github.com/dotnet/roslyn/blob/d6f8dc441a1a3b749b5082579f582892236b1584/src/Compilers/Helpers/CoreClrAnalyzerAssemblyLoader.cs

However, Roslyn no longer implements an AssemblyLoadContext either. It now simply grabs the existing one and hooks the Resolving event.

The MSBuild requirements are pretty simply. We need to be able to load assemblies from arbitrary file paths. We also need to be able to load their dependencies, which we assume are located immediately next to them (or are supplied by the app/system, in which case we shouldn't have to think about them at all). We don't care about isolating one "plug-in" from another.

gkhanna79 commented 8 years ago

@tmeschter Roslyn is actually creating its own LoadContext (see https://github.com/dotnet/roslyn/blob/56f605c41915317ccdb925f66974ee52282609e7/src/Scripting/Core/Hosting/AssemblyLoader/CoreAssemblyLoaderImpl.cs#L37) and using its Load override to override what the DefaultContext may return.

We need to be able to load assemblies from arbitrary file paths

Do you know these paths ahead of time? Also, are you going to run against .NET Core RTM or using the CoreCLR that comes from CoreCLR master (since that will determine whether you need to workaround the Resolving bug or not)?

tmeschter commented 8 years ago

@gkhanna79 That implementation of AssemblyLoadContext is solely for scripting scenarios (e.g. csi.exe/C# Interactive Window); it is not used in the C# or VB compilers themselves.

The paths are not known ahead of time.

I don't know if we're going to run against .NET Core RTM or the CoreCLR from CoreCLR master.

Would this be another option: have our AssemblyLoadContext forward all load requests the default context, and only try to resolve an assembly if the default context can't? I'm not sure how that would differ from simply hooking the Resolving event on the default context, but maybe there's something I'm missing.

gkhanna79 commented 8 years ago

have our AssemblyLoadContext forward all load requests the default context, and only try to resolve an assembly if the default context can't?

This sounds reasonable approach, by doing something like AssemblyLoadContext.Default.LoadFrom* in the Load override of your LoadContext implementation.

tmeschter commented 8 years ago

@gkhanna79 One issue with that approach is that I need to catch all exceptions that arise from AssemblyLoadContext.Default.LoadFrom* and ignore them, and then move on to checking the paths specific to my loader. Doable, but a bit ugly.

I also tried simply hooking the Resolving event of the AssemblyLoadContext.Default instance, with the handler locating the assembly and then calling AssemblyLoadContext.Default.LoadFromAssemblyPath. This works to the extent that I can find and load the assembly, but then immediately dies with an ExecutionEngineException as soon as we try to execute any code in the new assembly. Any idea about that one?

tmeschter commented 8 years ago

@brthor Assuming I can produce MSBuild bits with a fix, how should I go about updating the repro to test those new bits?

brthor commented 8 years ago

Patching the msbuild binaries under stage0 that gets downloaded as a part of the build under .dotnet_stage0/sdk/... should do it.

They'll also need to be crossgen'd which means you'll need to run that tool. Alternatively we can wait for cli to pick up the new version and the stage0 will get the latest bits.