dotnet / runtime


P/Invoking a method that throws a .net exception from within a natively invoked callback crashes the running application with SIGABRT on Linux #4756

Closed jumpinjackie closed 4 years ago

jumpinjackie commented 8 years ago

Moved from: https://github.com/aspnet/dnx/issues/3239

I'm currently looking at porting the .net API for MapGuide to run on ASP.net 5/CoreCLR in a cross-platform scenario.

The code itself is in C++, but we use SWIG to generate a .net interop layer around it (which basically creates a glue library of flattened extern "C" functions and generates a whole series of .net proxy classes and P/Invoke method stubs on the .net side). I've been able to get this interop layer to work on CoreCLR on both Windows and Linux, but I've hit a show-stopping snag with regard to exceptions.

Our MapGuide API can throw exceptions in various places. SWIG generates code so that when such an exception is thrown on the native side, it is caught there (so it never crosses the native boundary). The essential information in the exception is extracted and passed to a callback on the .net side, where an equivalent .net exception proxy class is created from the captured information and re-thrown.
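For illustration, the native half of that pattern might look roughly like the sketch below. It is not the actual SWIG output: `ExceptionCallback`, `RegisterExceptionCallback`, `ComputeArea` and `Glue_ComputeArea` are hypothetical names chosen for this example, and the callback is assumed to be a function pointer that the .net side registers at startup.

```cpp
#include <stdexcept>

// Hypothetical callback type; the managed side is assumed to register a
// delegate (marshalled as a function pointer) that receives the message.
typedef void (*ExceptionCallback)(const char* message);

static ExceptionCallback g_on_exception = nullptr;

// Stand-in for a real API method that may throw a C++ exception.
static int ComputeArea(int side) {
    if (side < 0) {
        throw std::invalid_argument("side must be non-negative");
    }
    return side * side;
}

extern "C" {

// Called once by the managed side to install its callback.
void RegisterExceptionCallback(ExceptionCallback cb) {
    g_on_exception = cb;
}

// Flattened glue function: the C++ exception is caught here, so it never
// leaves native code. Only its message is forwarded to the callback.
int Glue_ComputeArea(int side) {
    try {
        return ComputeArea(side);
    } catch (const std::exception& ex) {
        if (g_on_exception != nullptr) {
            g_on_exception(ex.what());
        }
        return 0;  // placeholder return value on failure
    }
}

}  // extern "C"
```

Note that the callback runs while `Glue_ComputeArea` is still on the stack, so anything thrown inside the callback has to travel back through this native frame.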

While this works as it always has on Windows (any exception in the native code of the MapGuide API is caught, processed and rethrown as a .net exception), on Linux (Ubuntu 14.04 64-bit with the CoreCLR 1.0.0 rc1-update1 binaries) it crashes the running application with SIGABRT at the step where the exception is rethrown on the .net side.

Posting the SWIG-generated code and .net proxy classes would be too complex to deconstruct (we are wrapping several hundred C++ classes!), but I have been able to distill the problem down to a simple, reproducible test application, available at the link below.

https://github.com/jumpinjackie/coreclr-pinvoke-crash-repro

Is the scenario illustrated in the above test application supported?

janvorli commented 8 years ago

@jumpinjackie CoreCLR on Unix doesn't support throwing exceptions across PInvoke and reverse PInvoke boundaries. An exception thrown from the managed callback therefore cannot pass through the frames of the native code that called the callback and be caught in the managed code that PInvoked the native code.

This is mostly due to the complexities of exception handling on Unix. Since on Unix we cannot register unwind info for the jitted code and let the native unwinder use it, we have to use our own unwinder for managed code frames and the C++ unwinder for native code. Basically, we walk the frames in the managed code and call ProcessCLRException for each managed frame until we reach the first native frame. Then we throw a C++ exception and let the standard C++ unwinder do its job. We always have a special catch handler at the boundary of the next managed frame, so the exception is always caught at that boundary if it was not caught somewhere else in the native code.

But we only have these special catch handlers in the CoreCLR runtime code. We cannot put them at the boundary where managed code PInvokes the native code, since there are no native frames of ours where we could put the special catch handler. That means that if you throw an exception from the managed callback, it gets converted to a native C++ throw that goes through your native code and then crashes badly when the native C++ unwinder enters managed code frames and doesn't know how to unwind them.

jumpinjackie commented 8 years ago

@janvorli So if we find an alternative way to transmit the exception information back to .net that does not involve throwing a .net exception in a reverse PInvoke callback, it should be fine?

janvorli commented 8 years ago

@jumpinjackie Right, then there should be no problem.

jumpinjackie commented 8 years ago

This is an acceptable workaround for me.
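One way such a workaround could look on the native side is sketched below: the glue function reports failure through a return code and keeps the message in thread-local storage, so no callback is invoked and nothing is thrown while native frames are on the stack. This is only a sketch under assumptions, not what the thread's author actually implemented; `Glue_TryDoWork`, `Glue_GetLastError` and `DoWork` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

namespace {

// Per-thread storage so concurrent callers do not overwrite each other's error.
thread_local std::string g_last_error;

// Stand-in for a real API method that may throw a C++ exception.
int DoWork(int value) {
    if (value < 0) {
        throw std::invalid_argument("value must be non-negative");
    }
    return value * 2;
}

}  // namespace

extern "C" {

// Returns 0 on success and nonzero on failure. No exception leaves this function.
int Glue_TryDoWork(int value, int* result) {
    try {
        *result = DoWork(value);
        g_last_error.clear();
        return 0;
    } catch (const std::exception& ex) {
        g_last_error = ex.what();
        return 1;
    } catch (...) {
        g_last_error = "unknown native error";
        return 1;
    }
}

// Copies the calling thread's last error message into buffer (truncating if
// needed, always null-terminated) and returns the full message length.
std::size_t Glue_GetLastError(char* buffer, std::size_t capacity) {
    if (buffer != nullptr && capacity > 0) {
        const std::size_t n = std::min(g_last_error.size(), capacity - 1);
        std::memcpy(buffer, g_last_error.data(), n);
        buffer[n] = '\0';
    }
    return g_last_error.size();
}

}  // extern "C"
```

With this shape, the managed wrapper would check the return code after the P/Invoke call has returned, fetch the message, and throw the .net exception from ordinary managed code, so the exception never needs to unwind through native frames.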