AaronRobinsonMSFT / DNNE

Prototype native exports for a .NET Assembly.
MIT License

Question: Avoid IsolatedComponentLoadContext ALC #115

Closed matthiasnissen closed 2 years ago

matthiasnissen commented 2 years ago

If I have a .NET 6 application A that calls a native component B via P/Invoke, which in turn calls a .NET assembly C through DNNE, C is loaded into its own "IsolatedComponentLoadContext". Init_fpr returns Success_HostAlreadyInitialized in this case, and the documentation says:

"Initialization was successful, but another host context is already initialized, so the returned context is "secondary". The requested context was otherwise fully compatible with the already initialized context."

Therefore, it seems to be the expected behavior. Is there a way to make DNNE use the default ALC for C here? If not, is there another technique known to achieve this?

The hosting application A is not supposed to know assembly C here and therefore cannot pass a function pointer of C to the component B.

Motivation: Assembly C may also be loaded into the default ALC through A's dependency graph. C should either hold global data (e.g. a singleton in C) or be able to access it, and that does not work when two instances of C live in different ALCs. A has to be cross-platform, so C++/CLI and COM are not an option.

Thanks a lot!

AaronRobinsonMSFT commented 2 years ago

@matthiasnissen This is a very common question. The .NET team is struggling with a lot of constraints here, and the right solution is difficult to reason about because we have competing interests. The biggest interest here is predictability and debuggability for assembly loading. The ALC issues for DNNE are identical to those with C++/CLI (see behavioral change here) and COM as well. The best issue tracking this problem is https://github.com/dotnet/runtime/issues/13472.

A mitigation would be to have application A pass down a function pointer that could be used by C to load its dependencies. This would require C to have indirect use of a new assembly, D. Assembly D would contain all the global state that isn't being shared in your current scenario.

Perhaps @elinor-fung or @vitek-karas have other insight. @agocke for visibility.

matthiasnissen commented 2 years ago

@AaronRobinsonMSFT Thank you very much for your comments. I have tried to understand and implement your mitigation. To do this, I created a method in A that loads a specified assembly into the default ALC. Then, using C# 9 function pointers, I passed a pointer to this method through B to C. In C, which is in the IsolatedComponentLoadContext, I now have the dependent assembly D loaded into the default ALC through A's function pointer. C contains a normal reference to D in my case. If I now call a method in C that uses D after A has explicitly loaded D, D is still loaded into the isolated context. How is the indirect use of D by C that you suggested supposed to work?

AaronRobinsonMSFT commented 2 years ago

@matthiasnissen Great progress so far. I believe you will also now need to update the ALC that contains C to defer to the default ALC when loading D. I may need to confer with the experts here, but you should be able to use https://docs.microsoft.com/dotnet/api/system.runtime.loader.assemblyloadcontext.getloadcontext and https://docs.microsoft.com/dotnet/api/system.runtime.loader.assemblyloadcontext.resolving relative to C to control how D is loaded.

matthiasnissen commented 2 years ago

@AaronRobinsonMSFT OK, in C I have now subscribed to the Resolving event of the isolated context. The handler is meant to search the default context for the requested assembly and return it if found there. However, my handler is never called, because the runtime can resolve assembly D normally for the isolated context. How can I keep the runtime from succeeding here so that my code is called?

AaronRobinsonMSFT commented 2 years ago

@matthiasnissen I chatted with the ALC experts and here are some thoughts. Looks like my above guidance was a bit off, apologies.

I suspect the user is correct that they do not have an opportunity to make the isolated ALC fall back to the default. I expect the deps.json for their component C has dependency D in it, so the isolated ALC finds D and loads it in the isolated ALC (so the Resolving event is not fired).

There are ways to make references not show up in the deps.json (Private=false, I think?), which is essentially what we recommend for shared dependencies of managed plugins (without native hosting involved).

https://docs.microsoft.com/dotnet/core/tutorials/creating-app-with-plugin-support#simple-plugin-with-no-dependencies has a paragraph about Private and ExcludeAssets for managed plugins.

Something like that would mean the DNNE component would still be in the isolated ALC, but any dependencies (that are excluded from its deps.json) would fall back to the default ALC. Not sure if that is what is desired or appropriate here.

elinor-fung commented 2 years ago

Yeah, as you found, if the isolated context can successfully resolve an assembly, the Resolving event will not be fired.
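For reference, the kind of registration discussed above can be sketched roughly as follows. This is only an illustration, not code from the thread: the type `DefaultContextFallback` and its methods are hypothetical names, and matching by simple assembly name is an assumption made to keep the sketch short. The handler is raised only after the context's own resolution has failed, so it has no effect for a dependency that is still listed in C's deps.json.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;

// Hypothetical helper living in assembly C; all names here are illustrative.
public static class DefaultContextFallback
{
    // Returns an assembly already loaded in the default ALC whose simple name
    // matches the request, or null when there is none.
    public static Assembly? FindInDefault(AssemblyName requested) =>
        AssemblyLoadContext.Default.Assemblies.FirstOrDefault(a =>
            string.Equals(a.GetName().Name, requested.Name,
                          StringComparison.OrdinalIgnoreCase));

    // Hooks Resolving on the ALC that holds this assembly.
    // Returns false when this assembly already lives in the default ALC.
    public static bool Register()
    {
        AssemblyLoadContext? own =
            AssemblyLoadContext.GetLoadContext(typeof(DefaultContextFallback).Assembly);
        if (own is null || ReferenceEquals(own, AssemblyLoadContext.Default))
            return false;

        // Raised only when the context could not resolve the assembly itself.
        own.Resolving += (_, requested) => FindInDefault(requested);
        return true;
    }
}
```

C would call `DefaultContextFallback.Register()` once, for example from its first exported entry point, before any type from D is touched.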

"the assembly D can be resolved normally by the runtime for the isolated context"

This is based on C's deps.json file. The isolated context will use an AssemblyDependencyResolver in order to load dependencies of C.

When building an assembly, it is possible to explicitly exclude some dependency from the generated deps.json by specifying Private on the reference item. This is what we recommend for managed plugins (without a native component or DNNE involved) that have shared dependencies that should live in the default ALC.
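As a rough sketch, such a reference in C's project file could look like the fragment below. It follows the pattern shown in the plugin tutorial linked earlier in this thread; the project path `..\D\D.csproj` is a hypothetical placeholder for the shared dependency.

```xml
<!-- Fragment of C.csproj; the path to D is a hypothetical placeholder. -->
<ItemGroup>
  <ProjectReference Include="..\D\D.csproj">
    <!-- Do not copy D next to C and keep it out of C's generated deps.json. -->
    <Private>false</Private>
    <!-- As in the plugin tutorial: also exclude D's runtime assets. -->
    <ExcludeAssets>runtime</ExcludeAssets>
  </ProjectReference>
</ItemGroup>
```

With D absent from C's deps.json, the isolated context cannot resolve it on its own, so the load can fall through to the default ALC, provided D is resolvable there.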

matthiasnissen commented 2 years ago

Many thanks to @AaronRobinsonMSFT and @elinor-fung . I'll try to outline the topic as a whole and I'll expand a bit, because COM and C++/CLI are also important for us and DNNE was an object of study for me regarding cross platform interop.

There are various hosts:

The following statements refer only to comhost, ijwhost and DNNE, because we are talking about native/managed interoperability. All of these use hostfxr to locate the .NET Core runtime and then initialize and start it, unless that has already happened. If the runtime is already loaded and initialized, only compatibility is checked. In the next step, all of them load the assembly/managed part into an isolated ALC (as described in IJW-activation or COM-activation and as observed for DNNE).

ijwhost has been changed so that, as of .NET 7, the managed part is loaded into the default ALC. (Changes to ijwhost and InMemoryAssemblyLoader: pull) Are there plans to make similar behavior changes for comhost and DNNE?

The most recent comments were about why dependency D of assembly C is also loaded into the isolated ALC, and how to get it loaded into the default ALC instead. It is loaded into the isolated ALC because it is listed in C's deps.json and the AssemblyDependencyResolver resolves it successfully. Setting <Private>false</Private> on C's reference to D removes D from the deps.json. There seem to be two different ways to get D loaded into the default ALC:

Does <Private>false</Private> do anything other than remove D from C's deps.json? Otherwise, for given assemblies of C's type, a post-processing step could cut the relevant portions from their deps.json and paste them into A's deps.json.

Apparently the second approach can be improved by registering centrally for the default ALC's Resolving event. The default ALC's Resolving event seems to be raised before the isolated ALC's. I have tried this; is this behavior officially documented?

vitek-karas commented 2 years ago

<Private>false</Private> doesn't do anything else (as far as I know). We've been recommending it as the solution to these types of problems for a while, and it seems to work for everybody.

The assembly resolution algorithm is described here: https://docs.microsoft.com/en-us/dotnet/core/dependency-loading/loading-managed#algorithm. If you do find holes in it, please let us know so that we can improve the doc.

AaronRobinsonMSFT commented 2 years ago

@matthiasnissen Since this is a dotnet design rather than a DNNE issue, I am going to close this issue. If we update the dotnet hosting model with APIs that can enable this, I will fold that support in. We should continue the conversation in the dotnet/runtime repo at https://github.com/dotnet/runtime/issues/66013 or https://github.com/dotnet/runtime/issues/59546.