cheverdyukv commented 4 years ago

I was using https://docs.microsoft.com/en-us/dotnet/core/tutorials/netcore-hosting#create-a-host-using-mscoreeh to host .NET runtime. But after discussion in #39167 I was told to use hostfxr to host .NET runtime.

But as far as I can see there is no way to pass APPDOMAIN_IGNORE_UNHANDLED_EXCEPTIONS and our application needs for 2 reasons:

We use some 3rd party components that create threads and raises unhandled exception.
Our application handles all unhandled exceptions anyway using appropriate callbacks, send report to us etc. We just don't need runtime to terminate process at that moment.

This is last issue that prevents us to use hostfxr

ghost commented 4 years ago

Tagging subscribers to this area: @vitek-karas, @swaroop-sridhar, @agocke Notify danmosemsft if you want to be subscribed.

jkotas commented 4 years ago

This option is left-over from Silverlight. It worked ok for Silverlight sandboxed applications that were severely limited in what they can do.

For .NET Core, there is number of ways managed exceptions can go unhandled even when this is set. I am not sure whether it makes sense to try to make it more first class in its current form. It is a poorly defined feature.

Related discussion: https://github.com/dotnet/runtime/issues/11546

cheverdyukv commented 4 years ago

Well it is already first class as it is exposed via public API (Create a host using Mscoree.h)
Perhaps that feature is not cover every possible case, but it covers 95% of our need
I believe host should not be terminated because some plugin throws exception. Yes, we can put try catch around every call even it requires months of work. But as I said nothing prevents plugin to create new thread, task etc and throw exception there.
If later we will discover that not all cases are covered, it is easier to deal with some rare case
As I mentioned before, right now we know for sure, that 3rd party component will throw in thread it creates and then will lead to host termination. I want to migrate hosting to use hostfxr but this issue prevents us to use hostfxr. And without hostfxr AssemblyDependencyResolver does not work.

rseanhall commented 4 years ago

This may not be supported, but if you call hostfxr_initialize_for_runtime_config or hostfxr_initialize_for_dotnet_command_line then that is enough to get AssemblyDependencyResolver to work. You can then load the runtime through Mscoree.h as you did before. This is assuming that you're loading the runtime from the same directory that hostfxr was going to load it from.

cheverdyukv commented 4 years ago

@rseanhall I don't really want to initialize it twice. It may create some unwanted effects. And I think that it not that hard to add support for what I want. Let's say I will set gv_hostfxr_set_runtime_property_value value "UNSUPPORTED.APPDOMAIN_IGNORE_UNHANDLED_EXCEPTIONS" to 1 and then couple of lines that will read it and set it. I think it is way simpler than initialize runtime twice.

rseanhall commented 4 years ago

I'm just trying to point out a potential workaround in versions of the runtime where this isn't implemented. It's not possible to initialize the runtime multiple times. The hostfxr_initialize_* methods don't initialize the runtime, they initialize hostfxr and get everything ready to initialize the runtime. The potentially unsupported thing is that the "get everything ready to initialize the runtime" part currently does everything that AssemblyDependencyResolver needs to actually work.

cheverdyukv commented 4 years ago

I appreciate your help and got your idea. I just feel that workaround will bite me later. But looks like there is no other choice. Thank you.

nxrighthere commented 3 years ago

Unhandled exceptions in embedded CoreCLR is a serious problem. In our case, the runtime is embedded into a game engine. To achieve a similar behavior like in Unity with Mono where unhandled exceptions are implicitly redirected to the console window and log files of the engine, we wrap the entire functionality with a single function like this, which invokes managed function pointers from unmanaged code here. As you can see, that means a very limited and predefined set of function signatures that can be used for execution, and it also involves performance cost.

This way we prevent termination of the process, and handle redirection of unhandled exceptions to log files, the console window, and on-screen messages of the engine.

It would be great if we get a hostfxr functionality that allows us to redirect exceptions to a custom function without handling them explicitly with try-catch blocks to prevent termination of the process.

vitek-karas commented 3 years ago

@nxrighthere : The important distinction between Mono's native APIs and the hostfxr based native hosting of CoreCLR is that Mono implements a full set of embedding APIs - so it can handle exceptions, GC interactions, reflection, managed object creation and so on, all from native code. That was not the aspiration of the hostfxr based native hosting APIs for CoreCLR. The design for hostfxr native hosting was to allow execution of a simple managed entry point from various scenarios. It is fully expected, that if a more complex communication is to be created between the native and managed side, there would be custom code to handle that. That includes exception handling.

We also discussed what it would mean to implement full embedding APIs for CoreCLR, but other than the obvious (lot of work) it would also require non-trivial amount of design - mono embedding API has known issues/limitations which we would like to avoid, CoreCLR has slightly different behavior in certain cases and so on. Hopefully we'll get to it at some later date.

The question in this issue is about handling exception on other threads when using native hosting APIs - which doesn't seem to be the issue you're trying to solve.

nxrighthere commented 3 years ago

Our only problem as stated in this issue:

Our application handles all unhandled exceptions anyway using appropriate callbacks, send report to us etc. We just don't need runtime to terminate process at that moment.

Personally, we don't really need embedding API similar to Mono, but we need a way to at least stop the termination of the process because in the development stage where a product is not ready for consumption by the end-user it makes no sense.

If a game is in development under the full control of IDE and a debugger before we ship it, what's the point to terminate the entire environment once an unhandled exception occurs? It makes sense to terminate the process when a game/application shipped to a customer, but before that, it doesn't carry any benefits.

nxrighthere commented 3 years ago

Please, can we reach a consensus regarding hostfxr API that will allow preventing termination of the process? We are considering forking CoreCLR to modify that part for our needs to stop crashing the editor of the engine as long as the application is in development. This issue is a ship stopper for us, and we can't implement new features for our customers because of that.

cheverdyukv commented 3 years ago

Same here, we cannot switch to hoxtfxr and have to use using-mscoreeh method to host runtime. I'm curing are they planning to use .NET 5 in SQL Server? How SQL Server team will solve it?

jkotas commented 3 years ago

they planning to use .NET 5 in SQL Server? How SQL Server team will solve it?

There are no plans to for SQL CLR based on .NET 5.

jkotas commented 3 years ago

@nxrighthere @cheverdyukv Could you please share API proposal(s) that you would like to see added to address your scenarios? The API signature (either managed or unmanaged), how the API would be used and details about the guarantees that the API would provide.

Let's treat this as a new API proposal and pretend that APPDOMAIN_IGNORE_UNHANDLED_EXCEPTIONS does not exist.

nxrighthere commented 3 years ago

A perfect solution in our case would be API similar to hostfxr_set_error_writer_fn, say hostfxr_set_unhandled_exceptions_writer_fn which prevents termination of the process and writes unhandled exceptions using a custom function. Our current implementation is very straightforward which writes unhandled exceptions like this:

Exception

jkotas commented 3 years ago

prevents termination of the process

It can be done on threads started by .NET runtime. What should it do on foreign threads?

cheverdyukv commented 3 years ago

I believe that foreign threads belongs to host and host should deal with them. Do runtime terminates process in this case?

nxrighthere commented 3 years ago

My proposal would be API that allows spawning threads from .NET runtime instead of using native platform-specific/language-specific API so the application that embedding hostfxr can use it instead.

jkotas commented 3 years ago

I believe that foreign threads belongs to host

They belong to whoever created them. For example, the thread may be created by the OS or a library.

Do runtime terminates process in this case?

On Windows, the runtime lets the exception to propagate using SEH. It typically results into exception being unhandled. It is a bad practices for libraries to handle exceptions that they do not understand. On Unix, the runtime terminates the process using abort.

jkotas commented 3 years ago

My proposal would be API that allows spawning threads from .NET runtime

It would not solve the problem with the foreign threads.

cheverdyukv commented 3 years ago

@jkotas Could you please elaborate on "On Windows, the runtime lets the exception to propagate using SEH"? If some host thread will call runtime and there is unhanded exception I think thread function should deal with that. If it is say OS thread pool thread and there is exception, allow thread pool deal with that. Basically exception would be propagated to caller. I'm not sure if I provided correct examples as you guys knows way more about different situations and it is why I would I need more details.

cheverdyukv commented 3 years ago

Host process should be in control on any runtime decision about whether terminate process or not. It is really like how SQL Server hosts .NET runtime. Runtime in this case is just guest in somebody’s house and it should not enforce its own rules. I can accept that certain exceptions like memory corruption, stack overflow or out of memory, will lead to process termination. Handling these cases is really hard. But all other type of unhandled exceptions should go via some global handler. Handler should accept exception object and returns if it is ok to terminate application.

I believe that load_assembly_and_get_function_pointer_type could be used to retrieve handler delegate and use something like hostfxr_set_runtime_property_value_type but this time it will be delegate. I believe signature should be something like that: public static void HandleUnhandledException(Exception exception, ref int actionToTake)

I think it should be in managed code, because there is no API to process exception. Depending on result, exception can be suppressed, suppressed and hostfxr_set_error_writer_fn can called, or default processing which is terminate process. Another approach would be to do it in unmanaged way and pass some information about exception that later could be recovered on managed site. So, if some code does not care about type of exception it is and only need message and maybe stack, then exception can be dealt with completely on unmanaged side.

nxrighthere commented 3 years ago

@jkotas If I understand you correctly, you would like to handle edge cases where the code in the host application is not aware of .NET runtime but still interoperate with managed code?

cheverdyukv commented 3 years ago

It is a bad practices for libraries to handle exceptions that they do not understand.

We are talking about host, not library.

jkotas commented 3 years ago

If it is say OS thread pool thread and there is exception, allow thread pool deal with that.

Right. This is exactly what happens today. The (Windows) OS thread pool leaves the exception unhandled and the process crashes.

you would like to handle edge cases where the code in the host application is not aware of .NET runtime but still interoperate with managed code?

I would like to know what you would like to happen in this case.

cheverdyukv commented 3 years ago

Right. This is exactly what happens today. The (Windows) OS thread pool leaves the exception unhandled and the process crashes.

Well exactly the same happens if there is no .net runtime and some unmanaged code throws exceptions. Runtime will not add anything to this process.

So my proposal, all threads that runtime creates should go via unhandled exceptions handler. Everything else goes the same way as if there is no runtime at all. If some unmanaged code calls runtime and runtime throws exception, it just propagates to caller and caller has to deal with it. I'm not sure if @nxrighthere will be ok with this.

@nxrighthere I suggest to use interfaces instead. Something like that and have one root function that returns base interface. And then interface can be called from any unmanaged language using standard things and it will return HRESULT in case of any exception. If FAILED check IErrorInfo interface (it is standard COM stuff) and get detailed information about exception. You can wrap interface calls in some macro for example. Interface is something like this: [ComVisible(true)] [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)] [Guid("0A6BA400-21BA-4B9C-B970-AFD4167679B2")] public interface IMyIntf { void SomeFunction(string parameter); }

We are using this technique for many years and it works great. And it work both ways, so you can pass native interfaces to .NET and call them from .NET side. If you want more details, let me know.

jkotas commented 3 years ago

it is standard COM stuff

That only works for Windows. We have strong preference for cross-platform designs these days that means "C" for unmanaged APIs.

jkotas commented 3 years ago

We do have managed API for reporting unhandled exceptions: https://docs.microsoft.com/en-us/dotnet/api/system.appdomain.unhandledexception. This API is probably more convenient way to get the details of the exception (except for the very fatal exceptions where it is impossible or dangerous continue running managed code). Can the unmanaged API be just about deciding whether to ignore the exception or not?

In theory, the managed API can be in charge of deciding whether the ignore the exception too. I am not sure whether I would be able to convince people that it makes sense to have managed API for ignoring exceptions. It would make ignoring arbitrary exceptions too easy. I can give it a try if we decide that it is the right design.

nxrighthere commented 3 years ago

I would like to know what you would like to happen in this case.

I feel like a new exceptions handling mechanism similar to __try \ __except, but implemented as a hostfxr functionality is required here for such cases. For instance, in C, exceptions handling mechanisms are all custom, there's no standardized way of doing that.

jkotas commented 3 years ago

I feel like a new exceptions handling mechanism similar to try \ except, but implemented as a hostfxr functionality i

What would the API for this look like? __try \ __except is a Windows-specific compiler feature. It is not an API.

nxrighthere commented 3 years ago

Is it possible to completely forget about platform-specific SEH and signals (at least for the vast majority of exceptions) and introduce API that works specifically with managed exceptions using, say, thread-local storages to hold all required information similar to errno \ _get_errno? That's probably not the best idea, it's straight from the top of my head.

nxrighthere commented 3 years ago

I'm trying to wrap my head how a prototype would look like, but it seems like a dead-end:

struct hostfxr_exception {
    int32_t exception_code;
    const char_t* exception_message;
    ...
};

void hostfxr_exceptions_handler_initialize(void);
const struct hostfxr_exception* hostfxr_get_exception(void);

int main(void) {
    hostfxr_exceptions_handler_initialize(); // Initializes a thread-local storage and other required components

    managed_ptr();

    const struct hostfxr_exception* managed_ptr_exception;

    if ((managed_ptr_exception = hostfxr_get_exception()) != NULL) {
        printf("Exception code: %i\n", managed_ptr_exception->exception_code);
        abort();
    }
}

jkotas commented 3 years ago

You can achieve hostfxr_get_exception today by writing or auto-generating boilerplate code: try / catch on the managed side and return the error as unmanaged entity argument. For example, we are using this approach in managed wrapper of the JIT interface: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/tools/Common/JitInterface/CorInfoBase.cs#L353 .

For COM or with DllImportAttribute.PreserveSig = false, the runtime generates this boilerplate code for you. The runtime-generated boilerplate code has same functionality and performance as what you can write.

We have been looking at using Roslyn source generators in interop: https://github.com/dotnet/runtimelab/tree/DllImportGenerator . The interop source generators may be able to make this scenario easier eventually.

Let's keep this scenario out of this discussion. I believe that it is a different problem.

cheverdyukv commented 3 years ago

We do have managed API for reporting unhandled exceptions: https://docs.microsoft.com/en-us/dotnet/api/system.appdomain.unhandledexception. This API is probably more convenient way to get the details of the exception (except for the very fatal exceptions where it is impossible or dangerous continue running managed code).

It is what we are using in our code right now.

Can the unmanaged API be just about deciding whether to ignore the exception or not?

This will work for us

In theory, the managed API can be in charge of deciding whether the ignore the exception too. I am not sure whether I would be able to convince people that it makes sense to have managed API for ignoring exceptions. It would make ignoring arbitrary exceptions too easy. I can give it a try if we decide that it is the right design.

I agree with you, that then it will be too easy to ignore all exceptions. Normal .NET application should follow .NET rules and do not ignore exceptions

nxrighthere commented 3 years ago

Is it possible to use AppDomain.UnhandledException in the case when all user assemblies are loaded into a single isolated AssemblyLoadContext so we can listen for events from there? If yes, then that will work for us.

jkotas commented 3 years ago

Is it possible to use AppDomain.UnhandledException in the case when all user assemblies are loaded into a single isolated AssemblyLoadContext

Yes.

cheverdyukv commented 3 years ago

Does runtime terminates process in case of unhandled exception?

nxrighthere commented 3 years ago

Then at this point our only issue is termination of the process.

jkotas commented 3 years ago

Here is a proposal:

Allow ignoring unhandled exceptions on threads created by the runtime from managed UnhandledException handler:

class UnhandledExceptionEventArgs
{
    // Existing property. Always true in .NET Core today. The behavior will be changed to:
    // - `false` for exceptions that can be ignored (ie thread was created by the runtime)
    // - `true` for exceptions that cannot be ignored (ie foreign thread or other situations when it is not reasonably possible to continue execution)
    // This behavior is close to .NET Framework 1.x behavior of this property.
    public bool IsTerminating { get; }

    // New property. The default value is false. The event handlers can set it to true to make
    // runtime ignore the exception. It has effect only when IsTerminating is false.
    // The documentation will come with usual disclaimer for bad consequences of ignoring exceptions
    public bool Ignore { get; set; }
}

We have a similar prior art that makes me think that having managed APIs for this should be acceptable:

UnobservedTaskExceptionEventArgs.Observed + UnobservedTaskExceptionEventArgs.SetObserved https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.unobservedtaskexceptioneventargs.setobserved?view=netcore-3.1
CancelEventArgs.Cancel https://docs.microsoft.com/en-us/dotnet/api/system.componentmodel.canceleventargs.cancel?view=netcore-3.1

Allow unmanaged hosts to get details about fatal errors by registering callback. hostfxr_set_fatal_error_fn. When this callback is set, the runtime will call this method insted of printing the fatal error details to console. This callback will be called for unhandled exceptions, Environment.FailFast or fatal crashes. There won't be an option to recover from this error. The only action is process termination.

Thoughts? Do you need both 1. and 2.? Or 1. is enough for what you need?

cheverdyukv commented 3 years ago

1 will be enough for my case.

nxrighthere commented 3 years ago

Same, the first part should be good enough for us. In case of fatal errors, the game engine handles SEH \ signal to a custom bug reporter just fine.

cheverdyukv commented 3 years ago

But the more I think about it, the more misleading it looks. When IsTerminating is false, it looks like process will not be terminated, but in fact, if I understand it correctly, runtime will still terminate process if Ignored will be false.

Maybe it will be better to allow changing IsTerminating? Obviously fatal exceptions will ignore it

jkotas commented 3 years ago

I believe that it is a good idea to have separate "in" parameters of the event that should be immutable, and "out" parameters of the event. Otherwise, it gets confusing with multiple handlers. All existing event handlers with "out" parameters that I have looked at follow this pattern.

We can also leave the existing Terminating property alone and introduce a new property instead. What would the API shape be if there was no existing Terminating property?

nxrighthere commented 3 years ago

I would stick with the current proposal. Introducing a setter for IsTerminating is an anti-pattern. To me, the Ignore property sounds reasonable, this API at first glance is asking to pay attention carefully before using it, and that will be covered in the documentation.

jkotas commented 3 years ago

@janvorli @vitek-karas Could you please chime in your thoughts about the managed API shape? Once we get a local agreement, I am going to create API proposal issue to submit for API review approval.

john-h-k commented 3 years ago

Personally I think CanIgnore or IsFatal would be clearer than IsTerminating

vitek-karas commented 3 years ago

The proposal sounds good.

Do we know how mono/Xamarin behave in this area today? That is - is the IsTerminating also always true?

janvorli commented 3 years ago

The proposal makes sense to me too.

jkotas commented 3 years ago

Do we know how mono/Xamarin behave in this area today? That is - is the IsTerminating also always true?

I think so. @lambdageek Do you have any concerns about 1. from https://github.com/dotnet/runtime/issues/39587#issuecomment-691514918 from the Mono side?

lambdageek commented 3 years ago

Do we know how mono/Xamarin behave in this area today? That is - is the IsTerminating also always true?

I think so. @lambdageek Do you have any concerns about 1. from #39587 (comment) from the Mono side?

Right now Mono always passes IsTerminating = true.

If I understand the proposal (1), we would pass IsTerminating = false for threads created by the runtime, and respect the Ignore property when the handlers return to the runtime - at which point the thread with the unhandled exception dies, but the runtime continues running.

I don't think this will be a problem for Mono.

From an embedding point of view, I think it would be interesting to think about foreign threads that are allowed to set Ignore = true. For example native platform threadpool threads on Apple or Android platforms. It's not really clear how the host can express a policy for the system threadpool threads. The host may pass some some managed callback to a native API or to a third-party library which queues a platform threadpool task behind the scenes. Seems like none of the components would have the complete picture.

dotnet / runtime

How to pass APPDOMAIN_IGNORE_UNHANDLED_EXCEPTIONS using hostfxr #39587

1 will be enough for my case.