Open msedi opened 2 weeks ago
Tagging subscribers to this area: @vitek-karas, @agocke, @vsadov See info in area-owners.md if you want to be subscribed.
As far as I remember AppDomains provided a better isolation
Each AppDomain had a full copy of all statics (including CoreLib statics). Is that what you are asking for?
Once you create a full copy of all statics, exchanging types between the different domains becomes impossible. All calls between the different domains have to be marshalled. Once you have to marshal all calls between the different domains, it is much easier to just use processes as the isolation boundary.
@jkotas:
Once you create a full copy of all statics, exchanging types between the different domains becomes impossible
I think in the end this might be the consequence. It's not that I want AppDomains back ;-) I just mentioned that there was at least some isolation, which I'm not able to get back using ALCs without extra effort.
My problem is twofold (as we discussed briefly in #102981).
Originally, the secondary processes were not processes at all but normal tasks/jobs started in the host process. This was a problem for logging: we wanted to isolate the jobs' logging from each other, but the logging framework keeps state in statics, which made that hard. Additionally, we wanted to isolate the jobs for resource management. Each job has a total RAM consumption of around 50TB during its lifetime, using a lot of unmanaged interop code and CUDA/GPU resources. Each job should start as clean as possible, with no leftovers from the previous job. So we ended up using processes to really isolate the jobs, which makes a lot of sense in terms of resource management. The host and the job communicate via gRPC.
The problem now is that debugging is not as easy as before. Visual Studio can attach to another process, but that's too manual.
So in the end, we need some way to get back to the original state where debugging was easy. We thought the ALC might be the best solution, but it is obviously missing some AppDomain features we didn't think about. In production we still use the external processes because we don't need to debug there.
I just mentioned that there was at least some isolation, which I'm not able to get back using ALCs without extra effort.
Yes, that's expected. ALC unloading is cooperative. If there are components that do not cooperate in the scheme, ALCs are not going to work. I do not see how this can be fixed without bringing full AppDomains back.
If the logging framework does not cooperate with ALCs, it needs to be fixed in the logging framework.
If the logging framework does not cooperate with ALCs, it needs to be fixed in the logging framework.
The logging framework was just an example. Many libraries we use have static state that is evaluated only once and then stored in a static readonly field.
It is hard to discover which libraries do this; the only option is to run the program and check for errors.
How do you deal with external processes that you need to debug? Maybe there's a better way I don't know of.
There are many libraries we use that have static things that are only evaluated once and then stored in a static readonly field.
You can load these libraries multiple times in separate ALCs.
You can load these libraries multiple times in separate ALCs.
Right, but it is hard to find out which libraries can cause problems and which ones I need to load. In the end, because I don't know, I may need to load the whole dependency tree of the libraries to work around this.
Have you seen https://marketplace.visualstudio.com/items?itemName=vsdbgplat.MicrosoftChildProcessDebuggingPowerTool ?
Yes, I have tried it already, but it's a bit hard to configure and to roll out in an enterprise environment. I can check again. Currently people are able to start working directly after cloning without having to set up the environment.
With this extension (which I cannot even force people to install), there will be a lot of tickets in our support team from people reporting issues with the extension ;-)
Addendum: I have tried the child process debugger again, and there now seems to be a possibility to store the config not in the suo file but in a separate setting. So I will try again.
What I have been doing is to hold the ALC in a weak reference and run the GC a couple of times after calling Unload(), and if the weak reference is still alive, emit a warning. You could take it further and capture a memory dump of the process to help you with diagnosing the unloadability problems.
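A minimal sketch of that weak-reference check, in case it helps. The `UnloadCheck` class, its method names, the `runJob` callback and the retry count of 10 are all made up for illustration; only `AssemblyLoadContext`, `WeakReference` and the `GC` calls are real APIs. It assumes the job leaves no references to the context or its types behind once the callback returns.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

public static class UnloadCheck
{
    // NoInlining keeps the strong reference to the context confined to this
    // method's frame, so the caller's stack cannot keep the context alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference RunAndUnload(Action<AssemblyLoadContext> runJob)
    {
        var alc = new AssemblyLoadContext("job", isCollectible: true);
        runJob(alc);      // load assemblies into 'alc' and run the work here
        alc.Unload();     // only *requests* the unload; it completes cooperatively
        return new WeakReference(alc, trackResurrection: true);
    }

    // Returns true if the context was collected within maxAttempts GC cycles
    // (hypothetical default of 10), otherwise emits a warning and returns false.
    public static bool TryUnload(Action<AssemblyLoadContext> runJob, int maxAttempts = 10)
    {
        WeakReference weakRef = RunAndUnload(runJob);
        for (int i = 0; weakRef.IsAlive && i < maxAttempts; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        if (weakRef.IsAlive)
        {
            Console.Error.WriteLine(
                "warning: load context still alive after Unload(); something still references it");
        }
        return !weakRef.IsAlive;
    }
}
```

Usage would look roughly like `UnloadCheck.TryUnload(alc => { var asm = alc.LoadFromAssemblyPath(absolutePath); /* run the job */ });`, where `absolutePath` is a placeholder (`LoadFromAssemblyPath` requires an absolute path). A `false` result is the point at which capturing a dump would make sense.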
In the end, because I don't know, I may need to load the whole dependency tree of the libraries to work around this.
I believe sharing the minimum necessary set of assemblies is indeed the best way to go and that's what I am doing.
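For illustration, sharing only a minimal set could be sketched with a custom load context along these lines. `JobLoadContext`, the directory-probing rule and the shared-name list are assumptions made for this example, not anything proposed in the thread; a real setup would more likely resolve paths with `AssemblyDependencyResolver`. Anything for which `Load` returns null falls back to the default ALC, so it is shared together with its statics; everything found in the job directory is loaded privately and gets its own copy of its statics.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

public sealed class JobLoadContext : AssemblyLoadContext
{
    private readonly string _jobDirectory;        // hypothetical: absolute folder holding the job's assemblies
    private readonly HashSet<string> _sharedNames; // simple names of assemblies to share with the host

    public JobLoadContext(string jobDirectory, IEnumerable<string> sharedNames)
        : base("job", isCollectible: true)
    {
        _jobDirectory = jobDirectory;
        _sharedNames = new HashSet<string>(sharedNames, StringComparer.OrdinalIgnoreCase);
    }

    protected override Assembly? Load(AssemblyName assemblyName)
    {
        string? name = assemblyName.Name;

        // Shared assemblies: returning null defers to the default ALC, so the
        // host and the job see the same types (and the same statics).
        if (name is null || _sharedNames.Contains(name))
        {
            return null;
        }

        // Everything else found next to the job is loaded privately, which
        // gives this context its own copy of that assembly's static fields.
        string candidate = Path.Combine(_jobDirectory, name + ".dll");
        return File.Exists(candidate) ? LoadFromAssemblyPath(candidate) : null;
    }
}
```

The shared list would typically contain only the contract assembly with the interfaces the host and the job exchange; types from privately loaded copies are distinct from the host's copies and cannot be passed across directly. Framework assemblies are not in the job directory in this sketch, so they still come from the default ALC.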
Background and motivation
Coming from these two discussions: #69899 and #102981.
Currently the AssemblyLoadContext (ALC) does not provide complete isolation from the default ALC. In detail, if you have static fields in your default ALC (e.g. logging), the user-created ALC "inherits" the already existing static fields.
To get around this, you have to explicitly load the assembly that contains the static field into the user-created ALC. The problem is that it is mostly unknown which assemblies have static fields, so one has to keep adding assemblies by trial and error until it works.
As far as I remember, AppDomains provided better isolation. So my idea would be to introduce an IsolationLevel and have the ALC itself take care of isolating things depending on the level. A bool might be sufficient, but there may be a need for more fine-grained control of the isolation.
API Proposal
API Usage
Alternative Designs
I currently don't know a better design.
Risks
I do not know about the side-effects.