dotnet / runtime


[NativeAOT Library] Exceptions thrown from `[UnmanagedFunctionPointer]` delegate can't be caught in `[UnmanagedCallersOnly]` method #97952

Closed ceztko closed 9 months ago

ceztko commented 9 months ago

Description

Exceptions thrown from [UnmanagedFunctionPointer] delegates are not caught by a try-catch block in the outer [UnmanagedCallersOnly] NAOT-compiled method. The issue looks critical to me, as libraries may use this strategy for error handling (it's legal with CLR; EDIT: only on Windows), and it's currently preventing me from using a large private library with native components in Native AOT scenarios.

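A sample showing the issue using the above strategy for error handling in native libraries (C++ in the sample) is attached. The solution is composed of the following projects:

- A C++ compiled library CppNativeLibrary that exports FooNativeLibrary() and provides customizable error handling through the SetErrorHandler() export;
- An AOT compatible NetNativeLibraryWrapper C# library that wraps CppNativeLibrary through P/Invoke and defines a [UnmanagedFunctionPointer] marked handler that installs the error handler for the C++ library. The handler actually throws a System.Exception;
- A C# Native AOT published library NAOTLibrary that exports a [UnmanagedCallersOnly] marked FooNAOT() method;
- A TestExceptionThroughCallbackNAOT C# CLR project that tests the FooNAOT() method in NAOTLibrary through P/Invoke.

Reproduction Steps

- Download and unzip TestExceptionThroughCallbackNAOT.zip (https://github.com/dotnet/runtime/files/14158734/TestExceptionThroughCallbackNAOT.zip)
- Open TestExceptionThroughCallbackNAOT.sln and compile the solution (re-build in case the executable complains with a "You must install or update .NET to run this application" error)
- Debug the TestExceptionThroughCallbackNAOT project; an unhandled exception pop-up should be shown (screenshot: https://github.com/dotnet/runtime/assets/3037449/2839252a-3534-4db8-a213-cbc52b0b60d3)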

Expected behavior

I'm expecting exceptions thrown from [UnmanagedFunctionPointer] marked delegates in AOT compiled code to behave similarly to how they do in CLR, where the native stack gets unwound up to the [DllImport] boundary and the System.Exception propagates normally through the managed stack. Because in this case everything gets natively compiled, the exception should just propagate to the [UnmanagedCallersOnly] marked method in the NAOTLibrary project and be caught there.

Actual behavior

The exceptions thrown from [UnmanagedFunctionPointer] marked delegates in AOT compiled code are unhandled and the process quits with __fastfail(), even if an outer try-catch exists.

Regression?

No response

Known Workarounds

The only workaround I can imagine at the moment is to not throw exceptions in [UnmanagedFunctionPointer] delegates and to rely on classical C-style return error codes.
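
The workaround above can be sketched in C++ as follows. This is only an illustration under stated assumptions: the status codes, `recordError`, and `callFoo` are hypothetical names, and `SetErrorHandler` and `FooNativeLibrary` merely mimic the exports described for the sample, not their real signatures. The handler records the message and returns normally, and the caller inspects the returned status, so no exception ever crosses the library boundary.

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes, not taken from the attached sample.
enum FooStatus : int { FOO_OK = 0, FOO_ERROR = 1 };

// C-compatible callback type: it reports the error and returns normally.
using ErrorHandler = void (*)(const char* message);

static ErrorHandler s_errorHandler = nullptr;
static std::string s_lastError;  // filled in by recordError below

extern "C" void SetErrorHandler(ErrorHandler handler)
{
    assert(handler != nullptr);
    s_errorHandler = handler;
}

// Stand-in for a library export that always hits an internal failure.
extern "C" int FooNativeLibrary()
{
    const char* error = "An error occurred";  // simulated internal failure
    if (s_errorHandler != nullptr)
        s_errorHandler(error);  // the handler must not throw
    return FOO_ERROR;           // the failure is reported through the return value
}

// Handler that records the message instead of throwing.
static void recordError(const char* message)
{
    s_lastError = message;
}

// Caller side: install the handler, call the library, return its status.
int callFoo()
{
    SetErrorHandler(recordError);
    s_lastError.clear();
    return FooNativeLibrary();
}
```

A caller would check `callFoo() != FOO_OK` and then read `s_lastError`, which is the classical C-style flow mentioned above.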

Configuration

No response

Other information

.NET 8.0.101 (Visual Studio 2022 17.8.6) Windows 10 x64

I'm also interested in testing the other NAOT-supported platforms, such as Linux and macOS.

MichalPetryka commented 9 months ago

Does it work if you use UnmanagedCallersOnly instead of UnmanagedFunctionPointer?

MichalPetryka commented 9 months ago

> it's perfectly legal with CLR

Worth noting that passing exceptions through native code is Windows only (and even there can break a lot of native code that's not prepared for it) and guaranteed to fail-fast on other platforms.

ceztko commented 9 months ago

> Does it work if you use UnmanagedCallersOnly instead of UnmanagedFunctionPointer?

This suggestion does not apply: UnmanagedFunctionPointer can be applied only to delegates, while UnmanagedCallersOnly can be applied only to methods. The description of the sample solution above (or the supplied code, of course) should better clarify what I'm doing.

ceztko commented 9 months ago

> > it's perfectly legal with CLR
>
> Worth noting that passing exceptions through native code is Windows only (and even there can break a lot of native code that's not prepared for it) and guaranteed to fail-fast on other platforms.

Good point: it's not something I tried recently (I am curious and will test it later on Linux), but yes, if the runtime can't unwind the native stack on other platforms as it can on Windows, then fail-fast is the safest behavior. Still, I believe that when everything gets NAOT compiled this limitation may not exist.

It is worth noting that in the NAOT scenario I also tried compiling my C++ library with /EHa (which also catches structured exceptions), as I did with CLR wrappers, but the error is still the same.

jkotas commented 9 months ago

Interop with unmanaged exception handling is supported only by regular CoreCLR and only on Windows. It is not supported by native AOT.

The recommended portable solution is https://github.com/dotnet/runtime/issues/35017#issuecomment-614247258

ghost commented 9 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.

Issue Details
### Description

Exceptions thrown from `[UnmanagedFunctionPointer]` delegates are not caught by `try-catch` block in outer `[UnmanagedCallersOnly]` NAOT compiled method. The issue looks critical to me as big libraries may use this strategy for error handling (it's perfectly legal with CLR) and it's currently preventing me from using a large private library with native components in Native AOT scenarios.

A sample showing the issue using the above strategy for error handling in native libraries (C++ in the sample) is attached. The solution is composed of the following projects:

- A C++ compiled library `CppNativeLibrary` that exports `FooNativeLibrary()` and provides customizable error handling through the `SetErrorHandler()` export;
- An AOT compatible `NetNativeLibraryWrapper` C# library that wraps `CppNativeLibrary` through P/Invoke and defines a `[UnmanagedFunctionPointer]` marked handler that installs the error handler for the C++ library. The handler actually throws a `System.Exception`;
- A C# Native AOT published library `NAOTLibrary` that exports a `[UnmanagedCallersOnly]` marked `FooNAOT()` method;
- A `TestExceptionThroughCallbackNAOT` C# CLR project that tests the `FooNAOT()` method in `NAOTLibrary` through P/Invoke.

### Reproduction Steps

- Download and unzip [TestExceptionThroughCallbackNAOT.zip](https://github.com/dotnet/runtime/files/14158734/TestExceptionThroughCallbackNAOT.zip)
- Open `TestExceptionThroughCallbackNAOT.sln` and compile the solution (re-build in case the executable complains with a "You must install or update .NET to run this application" error)
- Debug `TestExceptionThroughCallbackNAOT` project, the following unhandled exception pop-up should be shown: ![Immagine 004](https://github.com/dotnet/runtime/assets/3037449/2839252a-3534-4db8-a213-cbc52b0b60d3)

### Expected behavior

I'm expecting exceptions thrown from `[UnmanagedFunctionPointer]` marked delegates in AOT compiled code to behave in a similar way they do in CLR, where the native stack gets unwound up to the `[DllImport]` boundary and the `System.Exception` regularly propagates in the managed stack. Because in this case everything gets natively compiled, the exception should just propagate to the `[UnmanagedCallersOnly]` marked method in the `NAOTLibrary` project and be caught there.

### Actual behavior

The exceptions thrown from `[UnmanagedFunctionPointer]` marked delegates in AOT compiled code are unhandled and the process quits with `__fastfail()`, even if an outer try-catch exists.

### Regression?

_No response_

### Known Workarounds

The only workaround I can imagine at the moment is to not throw exceptions in `[UnmanagedFunctionPointer]` delegates and rely on classical C style return error codes.

### Configuration

_No response_

### Other information

.NET 8.0.101 (Visual Studio 2022 17.8.6) Windows 10 x64

I'm interested in testing also the other NAOT supported platforms, such as linux and macos.

Author: ceztko
Assignees: -
Labels: `untriaged`, `area-NativeAOT-coreclr`, `needs-area-label`
Milestone: -
ceztko commented 9 months ago

> It is not supported by native AOT.

Can you provide more insight into why this limitation should apply to this specific use case of NativeAOT, as in my sample? Within the boundary of the Native AOT generated shared library, everything gets natively compiled and there should be only one (I guess) native stack. The exception thrown in the [UnmanagedFunctionPointer] delegate should be compatible with the type of exceptions that are caught in the try-catch block in the [UnmanagedCallersOnly] method, so in the end we are just throwing a brand new exception in a callback, while already catching an internal exception. If everything were coded in a single language, e.g. C++, my Native AOT sample would really turn out to be something like the snippet below, which is perfectly fine on any platform:

#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>
#include <string_view>

using namespace std;

namespace MyLib
{
    void SetExceptionHandler(const function<void(string_view message)>& handler);

    void Foo();
}

static void fooInternal();

static function<void(string_view)> s_errorHandler;

int main()
{
    MyLib::SetExceptionHandler([](string_view message)
        {
            // A string_view is not guaranteed to be null-terminated, so copy it
            throw runtime_error(string(message));
        });

    try
    {
        MyLib::Foo();
    }
    catch (exception& ex)
    {
        cerr << "ERROR: " << ex.what() << endl;
        return 1;
    }

    return 0;
}

namespace MyLib
{
    void SetExceptionHandler(const function<void(string_view message)>& handler)
    {
        s_errorHandler = handler;
    }

    void Foo()
    {
        try
        {
            // Guard for internal exceptions
            fooInternal();
        }
        catch (exception& ex)
        {
            s_errorHandler(ex.what());
        }
        catch (...)
        {
            s_errorHandler("Unknown error");
        }
    }
}

void fooInternal()
{
    // Throwing an internal exception
    throw runtime_error("An error occurred");
}

So why should the Native AOT code fail fast?

MichalStrehovsky commented 9 months ago

> I'm also interested in testing the other NAOT-supported platforms, such as Linux and macOS.

The catch (...) part of your sample assumes the runtime would be able to convert the managed exception into a C++ exception that is catchable from C++. The throwing code is not C++. The runtime doesn't even link against the standard C++ library on Linux. It cannot interoperate with the C++ exception unwinder and the C++ exception unwinder cannot interoperate with managed code unwinder. There's more discussion on why this is not possible here: https://www.mono-project.com/docs/advanced/pinvoke/#runtime-exception-propagation. This also mentions the exception to this rule: using Visual C++ together with .NET Framework or CoreCLR on Windows - this mechanism doesn't exist with native AOT.

ceztko commented 9 months ago

> The catch (...) part of your sample assumes the runtime would be able to convert the managed exception into a C++ exception that is catchable from C++. The throwing code is not C++.

@MichalStrehovsky So if I understood correctly, even in the case of Native AOT, throwing a "managed" natively compiled exception in the callback would not be able to propagate and unwind the stack in the external C++ code as in my sample, and safely be caught in the [UnmanagedCallersOnly] try-catch. It's a pity. Thanks for the mono project link!

> The recommended portable solution is https://github.com/dotnet/runtime/issues/35017#issuecomment-614247258

@jkotas I had a look and suggested an API to avoid the boilerplate code of checking for and throwing exceptions on return from incompatible runtimes. Such an API would apply to the use case of this issue as well.
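
The "check and throw on return" boilerplate mentioned above could look roughly like the following C++ sketch. It is an assumption about the general shape of the pattern, not a transcription of the linked comment: `SetCallback`, `ForeignFoo`, `onError`, and `FooChecked` are hypothetical names. The callback never lets an exception escape into foreign frames. It stores the exception instead, and the wrapper rethrows it only after the foreign call has returned normally.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a foreign library with a C ABI.
using Callback = void (*)(const char* message);
static Callback s_callback = nullptr;

extern "C" void SetCallback(Callback cb) { s_callback = cb; }

extern "C" int ForeignFoo()
{
    if (s_callback != nullptr)
        s_callback("An error occurred");  // report the failure to the host
    return 1;                             // nonzero means failure
}

// Exception captured inside the callback, rethrown after the call returns.
static thread_local std::exception_ptr t_pending;

// The callback stores the exception instead of throwing through foreign frames.
static void onError(const char* message)
{
    t_pending = std::make_exception_ptr(std::runtime_error(message));
}

// Wrapper: call the foreign function, then check for a pending exception.
void FooChecked()
{
    t_pending = nullptr;
    SetCallback(onError);
    int status = ForeignFoo();
    if (t_pending)
        std::rethrow_exception(std::exchange(t_pending, nullptr));
    if (status != 0)
        throw std::runtime_error("ForeignFoo failed");
}
```

Calling `FooChecked()` inside an ordinary `try`/`catch` then behaves like a throwing API, while the foreign frames only ever see a normal return.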

jkotas commented 9 months ago

> Can you provide more insight into why this limitation should apply to this specific use case of NativeAOT, as in my sample?

Nothing fundamental prevents native AOT from supporting exception handling interop on Windows in the same shape as regular CoreCLR. The downside is that it would make the exception handling subsystem in native AOT more complicated.

The primary use case for exception handling interop on Windows is managed C++. Even though you can use it independently, it was specifically designed to enable managed C++. Managed C++ as a whole is not supported by native AOT.

ceztko commented 9 months ago

I think this issue can be closed since I got all the clarifications I asked for (throwing from [UnmanagedFunctionPointer] delegates is unsupported in Native AOT scenarios). As for improving LibraryImport to remove the boilerplate code of checking for and throwing exceptions on return from P/Invoke calls, I guess the conversation can continue in the other issue.