pythonnet / pythonnet

Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
http://pythonnet.github.io
MIT License

In a multi-app-domain setup, exceptions do not contain any information #617

Open JanKrivanek opened 6 years ago

JanKrivanek commented 6 years ago

Environment

Details

The following code demonstrates that a PythonException contains no information when the engine was initialized in a different app domain. The type and message in the PythonException are null, so the exception message is just " : ". The example contains a failing assert for this:

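using System;
using System.Security.Policy; // Evidence
using NUnit.Framework;
using Python.Runtime;

[TestFixture]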
public class ExceptionTest
{
    [Test]
    public void ExceptionIsEmpty()
    {
        PythonEngine.Initialize();
        IntPtr ctx = PythonEngine.BeginAllowThreads();

        ExecuteInNewAppDomain();

        PythonEngine.EndAllowThreads(ctx);
        PythonEngine.Shutdown();
    }

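    public static void ExceptionForcingMethod()
    {
        try
        {
            IntPtr lck = PythonEngine.AcquireLock();
            try
            {
                PythonEngine.ImportModule("ThisDoesNotExist");
            }
            finally
            {
                // Release the GIL even when the import throws.
                PythonEngine.ReleaseLock(lck);
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
            // NUnit expects (expected, actual); this fails in the child app domain.
            Assert.AreNotEqual(" : ", e.Message);
            throw;
        }
    }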

    public static void ExecuteInNewAppDomain()
    {
        AppDomain domain = null;

        try
        {
            var domaininfo = new AppDomainSetup();
            //this is needed as unit tests shadow copy the binaries
            domaininfo.ApplicationBase = System.Environment.CurrentDirectory;
            domaininfo.LoaderOptimization = LoaderOptimization.MultiDomain;
            Evidence adEvidence = AppDomain.CurrentDomain.Evidence;

            domain = AppDomain.CreateDomain("New App Domain: " + Guid.NewGuid(), adEvidence, domaininfo);
            domain.DoCallBack(ExceptionForcingMethod);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.ToString());
            throw;
        }
        finally
        {
            if (domain != null)
                AppDomain.Unload(domain);
        }
    }
}

Are there any ideas what can be causing this?

JanKrivanek commented 6 years ago

OK, so I was able to track that down.

Explanation: I avoided a second call to Initialize (which would throw when trying to initialize the native part of the context twice, since that part is shared across app domains). As a result, some managed types were never initialized in the other app domains. One of them is PyUnicodeType, and without it the strings returned from Python code could not be converted.

Solution suggestion: The native and managed parts of initialization should be separated. The native part should run just once per process and the managed part once per app domain. So there should be a simple static field (as there is today) guarding the managed portion of initialization, and a cross-app-domain singleton guarding the native part. This would address https://github.com/pythonnet/pythonnet/issues/595 as well.
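To make the suggestion concrete, here is a minimal sketch of the two guards. It is not pythonnet code: SplitInitializer, the two callbacks and the flag name are all made up for illustration, and using a process environment variable plus a named mutex as the "cross-app-domain singleton" is only one assumed way to get process-wide state (a native check such as CPython's Py_IsInitialized could probably serve the same purpose).

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical sketch, not part of pythonnet.
public static class SplitInitializer
{
    // Static fields exist once per app domain, so this guards the managed part only.
    private static bool _managedInitialized;

    // Environment variables and named mutexes belong to the process, so every
    // app domain sees the same state. The name is an assumption for this example.
    private static readonly string ProcessKey =
        "PYTHONNET_NATIVE_INIT_" + Process.GetCurrentProcess().Id;

    public static void Initialize(Action initNative, Action initManaged)
    {
        using (var mutex = new Mutex(false, ProcessKey))
        {
            mutex.WaitOne();
            try
            {
                // Native part: at most once per process.
                if (Environment.GetEnvironmentVariable(ProcessKey) == null)
                {
                    initNative();
                    Environment.SetEnvironmentVariable(ProcessKey, "1");
                }

                // Managed part: at most once per app domain.
                if (!_managedInitialized)
                {
                    initManaged();
                    _managedInitialized = true;
                }
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
```

Each app domain would call SplitInitializer.Initialize with the same two callbacks; only the first caller in the process runs the native one.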

I will try to provide a PR some time later.

JanKrivanek commented 6 years ago

This actually turns out to be much more complicated :-/ Everything works fine with initialization per app domain, unless I start unloading old app domains. Then I get System.AppDomainUnloadedException when executing simple calls like:

op = PyImport_ImportModule("builtins");

Not sure why this happens, as I'm not touching anything in the previous app domain (maybe some pythonnet hooks do, but nothing further is shown on the stack, so that seems unlikely).

I tried to work around this by sharing the IntPtr pointers returned by PyImport_ImportModule during initialization across app domains, and so avoiding the subsequent calls (they would return the same pointer anyway). This required quite a bit of exploring and rework, but it did work. However, the same exception occurred later in one of the Python -> pythonnet callbacks, specifically on Runtime.PyObject_Call in the ImportHook.__import__ method.

I'm trying to dig in, but since I have limited time for this task I'll probably end up using it without unloading old app domains. That scenario is not suitable for all use cases (e.g. in IIS, app domains are created and unloaded without any way to influence this).

JanKrivanek commented 6 years ago

I again found the culprit: ImportHook.Initialize registers itself as a hook for the __import__ call from Python. Any app domain that executes this initialization becomes the one on the hook when __import__ is called, so when that app domain is unloaded (without deregistering the hook), Python gets an exception as soon as it tries to call __import__. Unfortunately, .NET throws the AppDomainUnloadedException before even putting the called method on the stack, so it's impossible to see what was actually being called :-/ So I need to peel the onion to find all the callbacks that are registered during initialization, and make sure they are cleared before the app domain is unloaded (or not registered for child domains at all). If you have any ideas about what else is registered, I'd appreciate them (assembly resolving is safe, as it is explicitly registered on the current app domain's events).
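The "clear callbacks before unload" idea could look roughly like the sketch below. NativeHookRegistry and its install/uninstall callbacks are hypothetical names for illustration and are not something pythonnet provides; the only real API used is the AppDomain.DomainUnload event, which as far as I know is raised in child domains but never in the default app domain.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not part of pythonnet.
public sealed class NativeHookRegistry
{
    // Undo actions for every hook installed from this app domain.
    private readonly Stack<Action> _undo = new Stack<Action>();

    public NativeHookRegistry(AppDomain domain)
    {
        // Remove all hooks while the domain can still run managed code.
        domain.DomainUnload += (sender, args) => UnregisterAll();
    }

    // Installs a hook and remembers how to remove it again.
    public void Register(Action install, Action uninstall)
    {
        install();
        _undo.Push(uninstall);
    }

    // Removes the hooks in reverse order of installation; safe to call twice.
    public void UnregisterAll()
    {
        while (_undo.Count > 0)
        {
            _undo.Pop()();
        }
    }
}
```

For the import hook, the uninstall action would presumably have to restore the original __import__ under the GIL, so that Python never holds a pointer into an unloaded domain.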

dmitriyse commented 6 years ago

This commit can probably help to figure out what else should be cleaned up before a domain is unloaded. https://github.com/pythonnet/pythonnet/pull/532/commits/9de6f7bd071f60c67b42847f33453c0d5bd61c7a

den-run-ai commented 6 years ago

@jakrivan @dmitriyse maybe out-of-process communication makes more sense for this? It seems like too much work in pythonnet to get this working.

JanKrivanek commented 6 years ago

@dmitriyse Thanks - that might also help. @denfromufa - We'll have data heavy processing, so avoiding x-proc transfer will help. I'll see if making sure to initialize hooks only for parent app domain helps.

JanKrivanek commented 6 years ago

@denfromufa - Well, it turns out that out-of-process will be the best approach right now.

I made sure that ImportHook is initialized just once (in the parent app domain), and that solved the simple test case. But I was still getting bitten by AppDomainUnloadedException in more complicated cases (e.g. where Python code was explicitly loading types dynamically). It's extremely difficult to troubleshoot because AppDomainUnloadedException is thrown before the offending code gets on the stack. I even tried a native debugger, to grab the stack pointer to whatever was being called, but there was nothing on the stack between Python and KernelBase.dll!RaiseException(), so I have no clue for hunting down what else is hooked and called from Python.

LSS - I surrender for now. A cross-process solution is going to be an order of magnitude easier :)

den-run-ai commented 6 years ago

@jakrivan for the data-heavy processing you can use efficient RPC or shared memory. I use a multi-process solution whenever it is hard to separate some global state, e.g. COM objects. Luckily my case is not very data intensive.
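As a rough illustration of the shared-memory option, the sketch below passes an array between processes through a file-backed memory-mapped file. SharedBuffer, the file layout (an 8-byte element count followed by the doubles) and the idea that the worker process maps the same path are assumptions for this example, not something taken from pythonnet.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical sketch: layout is [long count][count doubles].
public static class SharedBuffer
{
    private const int HeaderSize = sizeof(long);

    // Writes the array into a memory-mapped file at the given path.
    public static void Publish(string path, double[] data)
    {
        long bytes = HeaderSize + (long)data.Length * sizeof(double);
        using (var map = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, bytes))
        using (var view = map.CreateViewAccessor())
        {
            view.Write(0, (long)data.Length);
            view.WriteArray(HeaderSize, data, 0, data.Length);
        }
    }

    // Maps the same file (typically from another process) and reads the array back.
    public static double[] Read(string path)
    {
        using (var map = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var view = map.CreateViewAccessor())
        {
            int count = (int)view.ReadInt64(0);
            var data = new double[count];
            view.ReadArray(HeaderSize, data, 0, count);
            return data;
        }
    }
}
```

The .NET side would call SharedBuffer.Publish and hand only the path to the worker process; a Python worker could map the same file with its mmap module instead of receiving the data over a pipe.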

den-run-ai commented 6 years ago

or use multiple processes to communicate between 32-bit and 64-bit processes like in https://github.com/MSLNZ/msl-loadlib