Closed jkotas closed 7 months ago
Is “Ignorable” a potential alternative to the “IsTerminating” name? I find it a little confusing.
IsTerminating is an existing property. We can certainly leave the existing property alone and introduce a new one with a better name.
Ah of course
namespace System
{
    public partial class UnhandledExceptionEventArgs : EventArgs
    {
        // Existing property.
        // public bool IsTerminating { get; }
        public bool Ignore { get; set; }
    }
}
Is it possible to know when it will be implemented? .NET 5.0 or perhaps in one of the updates. Or perhaps .NET 6.0?
.NET 5.0 is done. We only ship critical bug fixes in servicing updates, no new features.
I am not sure whether the core .NET team will get to work on this in .NET 6.
Would you be interested in contributing the implementation yourself?
@jkotas After checking source code for quite some time, I feel that it should be done by someone really familiar with exception handling for different platforms. Let me explain why.
I found that the C++ function InternalUnhandledExceptionFilter_Worker is the core of exception handling. It looks like this is SEH code and, as a result, Windows-specific. I have no idea where the code for other platforms is, or how exceptions are handled there. I also found that Mono has a different way to handle this. I don't know about Blazor and how it works there.
But even for Windows, how can I know what to set IsTerminating to for different types of exceptions? I checked that Stack Overflow does not call UnhandledException but Out Of Memory does. I feel like IsTerminating should be set to false for all types of exceptions because to me it looks like they all could be suppressed using that obsolete flag. Perhaps it will be better to invite here somebody from the team who knows more about exceptions to help us decide?
From what I read in the comments, this problem is definitely not easy to solve and has a lot of tricks and caveats. Perhaps it will be easier to revert to original proposition with passing some flag during host creation. That code is already there and it is already able to suppress exceptions.
The next problem is that certain code relies on IsTerminating. For example:
src\libraries\System.Data.OleDb\src\System\Data\ProviderBase\DbConnectionPoolCounters.cs
src\libraries\System.Runtime.Caching\src\System\Runtime\Caching\MemoryCache.cs
And perhaps there could be some 3rd party code that could rely on this as well.
As you can see, there are calls to Dispose when the process is terminating. But now IsTerminating will be false in most cases, while the application will still terminate if nobody sets Ignore to true. As a result, certain code could break. The probability is low, but it could happen.
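The pattern at risk looks roughly like the sketch below. `PooledResource` is a hypothetical stand-in, not the actual code from the two files listed above; only `AppDomain.UnhandledException`, `UnhandledExceptionEventArgs` and `IsTerminating` are real APIs. The event args are constructed by hand to simulate the runtime raising the event.

```csharp
using System;

// Simulate the two cases without actually crashing the process.
var resource = new PooledResource();
var boom = new InvalidOperationException("boom");

resource.OnUnhandledException(AppDomain.CurrentDomain, new UnhandledExceptionEventArgs(boom, isTerminating: false));
Console.WriteLine(resource.Disposed); // False: cleanup is skipped

resource.OnUnhandledException(AppDomain.CurrentDomain, new UnhandledExceptionEventArgs(boom, isTerminating: true));
Console.WriteLine(resource.Disposed); // True

// Hypothetical type: releases its resources only when told the process is going down.
sealed class PooledResource : IDisposable
{
    public bool Disposed { get; private set; }

    public PooledResource()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    public void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // Treats IsTerminating as a fact about what happens next.
        if (e.IsTerminating)
        {
            Dispose();
        }
    }

    public void Dispose()
    {
        if (Disposed) return;
        Disposed = true;
        AppDomain.CurrentDomain.UnhandledException -= OnUnhandledException;
    }
}
```

If IsTerminating were reported as false while the process still terminated, code of this shape would silently skip its cleanup.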
I feel like IsTerminating should be set to false for all types of exceptions because to me it looks like they all could be suppressed using that obsolete flag.
One specific case that cannot possibly be suppressed is an unhandled exception on a foreign thread on Unix.
The existing obsolete hosting flag has zero testing. It is likely that it does not behave correctly in some cases.
Perhaps it will be better to invite here somebody from the team who knows more about exceptions to help us decide?
cc @janvorli
original proposition with passing some flag during host creation
It would not make the problem with getting this right any easier.
The next problem is that certain code relies on IsTerminating.
Good point. This will need careful thought.
One specific case that cannot possibly be suppressed is an unhandled exception on a foreign thread on Unix.
Do you know the best way to test it? I have a feeling that these will not raise the unhandled exception event at all, and the same goes for Windows. For example, in .NET Framework, when some thread called into .NET and an exception went unhandled, it was passed to that environment as an exception and could be treated normally using standard SEH.
The existing obsolete hosting flag has zero testing. It is likely that it does not behave correctly in some cases.
I did check the code and it looks like the same (or a quite similar) flag existed in .NET Framework. But I never used that flag and I am not sure how it will behave.
It would not make the problem with getting this right any easier.
Well, I am still working through that code, but I have a feeling that everything is done already :) except setting this flag of course. I just don't have time to do proper testing.
It seems this issue is also blocking an important plugin for Unreal Engine ( https://github.com/nxrighthere/UnrealCLR/issues/33 ) while this was moved to the Future milestone, I'd like to push this to be taken into consideration for .NET 6 or even .NET 7. It's kinda discouraging to see it stuck in that milestone that only god-knows-when will be done.
Considering the positive impact that project is making in the UE community as a whole, I think it's important to look at.
Excluding the UE topic, this issue is pretty valid to me in some apps I've developed in the past, too. Haven't needed it, but it would've been a good addition to some workarounds I've had to implement.
@jkotas should this be labeled up for grabs?
The next problem is that certain code relies on IsTerminating.
Good point. This will need careful thought.
CancelEventArgs is almost always used in an event with the "-ing" suffix. For example, the CancelEventArgs is available in the Closing event not the Closed event. This suggests that the existing UnhandledException event is not the right place to expose the ability to ignore an exception, and a new HandlingUnhandledException event (probably with a better name) needs to be added instead.
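For readers less familiar with the convention, here is a minimal sketch of the "-ing"/"-ed" pairing. The `Window` class is hypothetical; `CancelEventArgs` and `CancelEventHandler` are the real types from `System.ComponentModel`.

```csharp
using System;
using System.ComponentModel;

var window = new Window();
window.Closing += (sender, e) => e.Cancel = true;             // a listener vetoes the close
window.Closed += (sender, e) => Console.WriteLine("closed");  // never runs in this example

Console.WriteLine(window.Close()); // False

// Hypothetical type illustrating the convention.
sealed class Window
{
    // Raised before the action; listeners may still change the outcome.
    public event CancelEventHandler? Closing;

    // Raised after the action; a pure notification.
    public event EventHandler? Closed;

    public bool Close()
    {
        var args = new CancelEventArgs();
        Closing?.Invoke(this, args);
        if (args.Cancel)
        {
            return false;
        }

        Closed?.Invoke(this, EventArgs.Empty);
        return true;
    }
}
```

The existing UnhandledException event plays the role of `Closed` here: by the time it fires, the decision has already been made.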
Could an initial implementation only set a very small subset of exceptions as ignorable, and then future versions slowly expand which ones are ignorable? Theoretically, I would think this API could be added without ever setting any exceptions as ignorable and then people could slowly contribute which exceptions can be ignored.
This suggests that the existing UnhandledException event is not the right place to expose the ability to ignore an exception, and a new HandlingUnhandledException event (probably with a better name) needs to be added instead.
Ok, I have flipped this back to API needs work.
Thoughts on a good name and the exact shape of the API are welcomed.
Still not a great name.
Allow ignoring unhandled exceptions on threads created by the runtime from new managed UnhandledExceptionThrowing handler:
namespace System
{
    public class AppDomain
    {
        public event UnhandledExceptionThrowingEventHandler? UnhandledExceptionThrowing;
    }
    public delegate void UnhandledExceptionThrowingEventHandler(object sender, UnhandledExceptionThrowingEventArgs e);
    public class UnhandledExceptionThrowingEventArgs : EventArgs
    {
        public object ExceptionObject { get; }
        // - `true` for exceptions that can be ignored (i.e. thread was created by the runtime)
        // - `false` for exceptions that cannot be ignored (i.e. foreign thread or other situations when it is not reasonably possible to continue execution)
        public bool Ignorable { get; }
        // The default value is false. The event handlers can set it to true to make
        // runtime ignore the exception. It has effect only when Ignorable is true.
        // The documentation will come with usual disclaimer for bad consequences of ignoring exceptions
        public bool Ignore { get; set; }
    }
}
The exception will be reported in the existing UnhandledException event, whether it's ignored or not. UnhandledExceptionEventArgs.IsTerminating will be false if the exception was ignorable and ignored.
AppDomain.CurrentDomain.UnhandledExceptionThrowing += (sender, e) =>
{
    if (DesignMode && e.Ignorable)
    {
        DisplayException(e.ExceptionObject);
        e.Ignore = true;
    }
};
I don't like Ignorable / Ignore. Whoever handles the unhandled exception may not necessarily ignore it; maybe they will choose to take down not just the current process, but the whole OS as well. Or they could choose to ignore the unhandled exception. Point is, we only know that someone handled it. We don't know how they handled it. Ignoring it is just one way of handling it.
Better names: bool CanBeHandled; bool Handled;
We need this functionality.
Could you please provide any insight into when we will have this in .NET?
Could you please provide any insight into when we will have this in .NET?
@agocke This is the unhandled exception and fatal error handling scenario that I have mentioned to you. Do you think we will be able to work on it in .NET 9?
Yeah, let's try to get this done for .NET 9.
Author: jkotas
Assignees: -
Labels: api-needs-work, area-Host
Milestone: Future
I want to pick this up and I am trying to figure where this ended last time.
What I see is:
IsTerminating appears to be used in a few cases as a fact - whether an exception is terminal or not, so changing the meaning to mean "configurable" can be a breaking change to those uses.
Are these all the reasons why we wanted to rethink the API ?
Are these all the reasons why we wanted to rethink the API ?
Yes, I think so.
It feels like the fact that we already have scenarios where IsTerminating is used to check whether the exception is terminal or not may require that we leave that alone and add a new event where listeners will have a chance to configure the outcome.
Basically:
Then send the existing event with IsTerminating specifying what we are going to do next - terminate or not. The second event is basically just a notification. Too late to configure anything.
At least these are my thoughts right away for how to fit into existing scenarios.
The use case would look like:
AppDomain.CurrentDomain.UnhandledExceptionQuery += (sender, e) =>
{
    if (DesignMode)
    {
        e.Terminate = false;
        DebugLog("trying to ignore: ", e.ExceptionObject);
    }
};
AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
{
    if (!e.IsTerminating)
    {
        DisplayException(e.ExceptionObject);
    }
    else
    {
        WrapItUpWeAreGoingToCrash();
    }
};
public class UnhandledExceptionQueryEventArgs
{
    // defaults to true
    // setting false will cause the exception not be terminal
    // all listeners need to agree (sadly, the order of listeners matters)
    public bool Terminate { get; set; }
    public object ExceptionObject { get; }
}
public class UnhandledExceptionEventArgs
{
    // Existing property. Always true in .NET Core today. Will be false if termination was overridden.
    public bool IsTerminating { get; }
    public object ExceptionObject { get; }
}
AppDomain.CurrentDomain.UnhandledExceptionQuery
What about UnhandledExceptionHandler that returns boolean? If the handler returns true, the exception is considered handled and we are done. The existing AssemblyResolve events are prior art for shape like this.
Also, we may want to put this on a new type under System.Runtime.ExceptionServices where the other fatal error handlers are going to be.
Then send the existing event with IsTerminating specifying what we are going to do next - terminate or not. The second event is basically just a notification. Too late to configure anything.
I am not sure about this part. IsTerminating behavior is poorly defined and it is always true (unless one uses the unsupported config switch). I would keep UnhandledException callback to be called only when we are guaranteed that the process is terminating.
What about UnhandledExceptionHandler that returns boolean? If the handler returns true, the exception is considered handled and we are done. The existing AssemblyResolve events are prior art for shape like this.
How does that work with multiple listeners? The last wins?
How does that work with multiple listeners? The last wins?
The first one wins. It is how AssemblyResolve and similar events work today. One example from many: https://github.com/dotnet/runtime/blob/aee49579769188d0ff7cf3ca872d2126e5bb3c70/src/libraries/System.Private.CoreLib/src/System/Runtime/Loader/AssemblyLoadContext.cs#L811-L821
Right, Delegate.EnumerateInvocationList. That would work.
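A minimal sketch of the "first one wins" dispatch, assuming a multicast `Func<Exception, bool>` as the handler shape (the `FirstWins` helper is hypothetical). It uses the long-standing `Delegate.GetInvocationList()` rather than the newer enumeration API mentioned above.

```csharp
using System;

Func<Exception, bool>? handlers = null;
handlers += ex => false;                            // index 0: declines everything
handlers += ex => ex is InvalidOperationException;  // index 1: handles one exception type
handlers += ex => true;                             // index 2: catch-all

Console.WriteLine(FirstWins.Invoke(handlers, new InvalidOperationException())); // 1
Console.WriteLine(FirstWins.Invoke(handlers, new ArgumentException()));         // 2

// Hypothetical helper, not a runtime API.
static class FirstWins
{
    // Returns the index of the first handler that returned true, or -1 if none did.
    public static int Invoke(Func<Exception, bool>? handlers, Exception exception)
    {
        if (handlers is null)
        {
            return -1;
        }

        Delegate[] list = handlers.GetInvocationList();
        for (int i = 0; i < list.Length; i++)
        {
            // Call each subscriber individually so we can stop at the first "true".
            if (((Func<Exception, bool>)list[i])(exception))
            {
                return i;
            }
        }

        return -1;
    }
}
```

Handlers registered after the winning one never see the exception, which is the visibility question raised in the replies that follow.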
I would keep UnhandledException callback to be called only when we are guaranteed that the process is terminating.
Does that imply there is another new event for nonterminal unhandled exceptions or only the handlers up to the one that returned true will know about those?
I suppose the recommended use would be to not have multiple handlers, or at least make them handle different exceptions or be responsible for different scenarios. With that view, it might be ok that once an exception is "handled" no one else sees it.
Does that imply there is another new event for nonterminal unhandled exceptions or only the handler that returned true will know about those?
Yes. (All UnhandledExceptionHandler's that were called before the one that handled it would know about it too of course.)
Separately, we may want to have an event that is triggered when an exception (any exception) is handled. We have AppDomain.FirstChanceException event that is triggered when exception is thrown, but we do not have one for handled exceptions. I think it would help with #98878.
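For reference, this is what subscribing to the existing `AppDomain.FirstChanceException` event looks like. It is raised at throw time, before any catch block runs; the counter is just for illustration.

```csharp
using System;

int firstChanceCount = 0;

// Raised when an exception is thrown, before the runtime searches for a handler.
AppDomain.CurrentDomain.FirstChanceException += (sender, e) =>
{
    firstChanceCount++;
    Console.WriteLine($"thrown: {e.Exception.GetType().Name}");
};

try
{
    throw new InvalidOperationException("handled a moment later");
}
catch (InvalidOperationException)
{
    // There is no event today that reports this "handled" moment.
}

Console.WriteLine(firstChanceCount >= 1);
```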
I suppose the recommended use would be to not have multiple handlers, or at least make them handle different exceptions or be responsible for different scenarios.
Right. If we want to avoid conflicts between different handlers, we may want to only allow setting one per app. It would help with ensuring that the unhandled exception policy is only controlled at app level and that random libraries do not participate in it. NativeLibrary.SetDllImportResolver is an example of prior art like this.
Separately, we may want to have an event that is triggered when an exception (any exception) is handled.
I suppose that includes the ordinary catch and the handler event. Also in rethrow case the same exceptions could be caught more than once.
I wonder if there is a need or even a possibility to identify the "catcher".
I suppose that includes the ordinary catch and the handler event. Also in rethrow case the same exceptions could be caught more than once.
Right.
I wonder if there is a need or even a possibility to identify the "catcher".
I think that the Stacktrace APIs would be a solution for that. It is expensive to do it eagerly, for AOT in particular.
So, for the unhandled exception handler we will have:
AppDomain.CurrentDomain.UnhandledExceptionHandler += (sender, e) =>
{
    if (DesignMode)
    {
        DisplayException(e.ExceptionObject);
        // the exception is now "handled"
        return true;
    }
    // not handled here; the exception stays unhandled
    return false;
};
AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
{
    // IsTerminating is always true for unhandled exceptions (assuming this is not .NET Fx)
    Debug.Assert(e.IsTerminating);
    WrapItUpWeAreGoingToCrash();
};
public delegate bool UnhandledExceptionHandlerEventHandler(object sender, System.UnhandledExceptionHandlerEventArgs e);
public class UnhandledExceptionHandlerEventArgs
{
    public object ExceptionObject { get; }
}
Right. Open design decisions:
Looks like Mono has tests for IsTerminating==false. Is that working in Mono?
runtime/src/mono/mono/tests/threadpool-exceptions2.cs
These are orphaned tests. You should check the actual behavior.
We do not have a lot of test coverage for unhandled exceptions in general so there can be untracked behavior differences between runtimes.
I think either way, we can say that once UnhandledExceptionHandler returns true, this is no longer an unhandled case so CurrentDomain.UnhandledException is not called, regardless of the runtime.
I'd prefer ExceptionServices. This can be seen as orthogonal to CurrentDomain.UnhandledException, thus does not need to live near it.
I think I might prefer a single shot delegate (the DllImportResolver style).
Solves the problem with multiple handlers. Or at least moves it to the user, who still can build something pluggable or flow this into an event.
But I'd like to hear from the likely users.
With above assumptions, the use case will be something like:
using System.Runtime.ExceptionServices;
ExceptionHandling.SetUnhandledExceptionHandler(
    (ex) =>
    {
        if (DesignMode)
        {
            DisplayException(ex);
            // the exception is now "handled"
            return true;
        }
        // not handled; the process will terminate as usual
        return false;
    }
);
namespace System.Runtime.ExceptionServices
{
    public delegate bool UnhandledExceptionHandler(System.Exception exception);
    public static class ExceptionHandling
    {
        /// <summary>
        /// Sets a handler for unhandled exceptions.
        /// </summary>
        /// <exception cref="ArgumentNullException">If handler is null</exception>
        /// <exception cref="InvalidOperationException">If a handler is already set</exception>
        public static void SetUnhandledExceptionHandler(UnhandledExceptionHandler handler);
    }
}
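The set-once contract described by the doc comments above could be sketched as follows. This is an assumption about one possible implementation, not the runtime's code; `ExceptionHandlingSketch` and its local delegate type are hypothetical stand-ins for the proposed API.

```csharp
using System;
using System.Threading;

ExceptionHandlingSketch.SetUnhandledExceptionHandler(ex => true);

try
{
    // A second registration, e.g. from a library, is rejected.
    ExceptionHandlingSketch.SetUnhandledExceptionHandler(ex => false);
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message); // A handler is already set.
}

// Hypothetical stand-ins for the proposed delegate and class.
public delegate bool UnhandledExceptionHandler(Exception exception);

public static class ExceptionHandlingSketch
{
    private static UnhandledExceptionHandler? s_handler;

    public static void SetUnhandledExceptionHandler(UnhandledExceptionHandler handler)
    {
        ArgumentNullException.ThrowIfNull(handler);

        // Atomically install the handler only if none is installed yet,
        // so two racing callers cannot both succeed.
        if (Interlocked.CompareExchange(ref s_handler, handler, null) is not null)
        {
            throw new InvalidOperationException("A handler is already set.");
        }
    }
}
```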
// can be called multiple times - new handler replaces old.
// calling with `null` unsets the handler
I think it is unnecessary flexibility - it does not prevent different libraries from fighting over who is going to win. NativeLibrary.SetDllImportResolver can be called exactly once for given assembly if you go with that as prior art.
I think it is unnecessary flexibility - it does not prevent different libraries from fighting over who is going to win. NativeLibrary.SetDllImportResolver can be called exactly once for given assembly if you go with that as prior art.
I was mostly thinking that "unsetting" may be a desired scenario. Replacing is a side effect of allowing unsetting: once you can unset, you'd want to be able to set again, and then why not allow replacing. But outright swapping does feel a bit odd for a scenario.
It could certainly be a one-time API like the resolver. In most cases, I agree, it will be set once and set early - at the app startup or once the runtime is initialized (in hosted scenario like game scripting).
I'll update the example.
For the semantics of the unhandled exception handler, I think we can follow the model of an imaginary
try { UserCode(); } catch (Exception ex) when (handler(ex)) { }
in places where the above will not lead to process termination regardless of what handler() returns.
false.
Any other interesting scenario or a corner case?
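The imaginary try/catch model can be tried out directly with a C# exception filter. In the sketch below, `FilterModel.Run` is a hypothetical helper standing in for the places where the runtime invokes user code; it is not part of the proposal.

```csharp
using System;

// Handler returns true: the exception is swallowed and execution continues.
Console.WriteLine(FilterModel.Run(() => throw new InvalidOperationException(), ex => true)); // handled

// Handler returns false: the filter does not match and the exception keeps propagating.
try
{
    FilterModel.Run(() => throw new InvalidOperationException(), ex => false);
}
catch (InvalidOperationException)
{
    Console.WriteLine("propagated");
}

// Hypothetical helper modelling a runtime call site that wraps user code.
static class FilterModel
{
    public static string Run(Action userCode, Func<Exception, bool> handler)
    {
        try
        {
            userCode();
            return "completed";
        }
        catch (Exception ex) when (handler(ex))
        {
            // Entered only if the handler returned true.
            return "handled";
        }
    }
}
```

Because the filter runs before the stack is unwound, a handler that returns false leaves the exception unhandled at its original throw site.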
LGTM
@joncham Does https://github.com/dotnet/runtime/issues/42275#issuecomment-2008339882 work for your scenario?
a reverse pinvoke will not install the try/catch like above.
Does this mean an unhandled exception in a reverse pinvoke will not call the UnhandledExceptionHandler and the process will terminate? I am not sure if exceptions thrown in reverse pinvokes have a defined behavior today, but we would prefer to be able to handle as many cases as possible versus crashing the process.
main() will not install the try/catch like above
In this case, is the AppDomain UnhandledException event called? In general, is there any case where UnhandledExceptionHandler will not be called but UnhandledException is called?
It could certainly be a one-time API like the resolver. In most cases, I agree, it will be set once and set early - at the app startup or once the runtime is initialized (in hosted scenario like game scripting).
In our hosted scenario (Unity Editor) we would want to install the single handler, and not allow it to be replaced/overridden.
I am not sure if exceptions thrown in reverse pinvokes have a defined behavior today
On Windows CoreCLR, exceptions escaping from reverse PInvokes are converted to Windows SEH exceptions. Exceptions escaping from reverse PInvokes are treated as unhandled exceptions everywhere else.
we would prefer to be able to handle as many cases as possible versus crashing the process.
To handle an unhandled exception in a reverse PInvoke, we would have to return something to the unmanaged code that called the reverse PInvoke. What would that be? Returning random values and hoping for the best does not sound like a good plan.
Now, for the API for intercepting fatal crashes - I am thinking about just allowing to plug into CrashDumpAndTerminateProcess.
The goal of this API is to allow 3rd party extension of intercepting fatal process crashes.
The actual handler must be in native code, since running managed code while crashing is not a good idea. In fact we may need to run this in a signal handler, so it would need to be signal-safe.
One prior suggestion for the API was in https://github.com/dotnet/runtime/issues/79706#issuecomment-1700243612
The API could be:
public static class ExceptionHandling
{
    // .NET runtime is going to call `fatalErrorHandler` set by this method before its own
    // fatal error handling (creating .NET runtime-specific crash dump, etc.). This can be only called once in given
    // process.
    public static void SetFatalErrorHandler(delegate* unmanaged<uint, void> fatalErrorHandler);
}
It could really be just something that CrashDumpAndTerminateProcess calls before producing dump and terminating the process.
It means that the signature would be basically the same as for CrashDumpAndTerminateProcess. More info could be added, but currently it is:
extern "C" DLL_EXPORT void __cdecl FatalErrorHandler(uint32_t exitCode)
{
    // native implementation with signal handler restrictions
}
Typical use would be something like:
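internal unsafe class Program
{
    static void Main(string[] args)
    {
        // The handler lives in native code, so take its address from the native library.
        // ([UnmanagedCallersOnly] cannot be combined with [DllImport].)
        IntPtr lib = NativeLibrary.Load("myCustomCrashHandler.dll");
        var fatalErrorHandler = (delegate* unmanaged<uint, void>)NativeLibrary.GetExport(lib, "FatalErrorHandler");
        ExceptionHandling.SetFatalErrorHandler(fatalErrorHandler);
        RunMyProgram();
    }
}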
In the same spirit as in the SetUnhandledExceptionHandler, setting the handler would be allowed just once per process.
Questions:
is the location/timing of this call sufficient for the purpose? It should be, since it would be called right before producing the .NET dump
any other info that could be helpful to the handler?
do we need the handler to communicate something back - like "do not do your dump" ? It would basically mean the handler may need to return nonvoid result.
https://github.com/dotnet/runtime/issues/79706#issuecomment-1700243612 also suggested GetIsManagedCode().
I am not sure how that would be helpful. Is there a back story to that?
any other info that could be helpful to the handler?
We may want to pass in all information that is required to implement our own fatal error handler:
siginfo_t *info and void *ucontext on Unix. EXCEPTION_POINTERS* on Windows that can be broken down into EXCEPTION_RECORD* ExceptionRecord and CONTEXT* ContextRecord to make it look more like Unix.
Environment.FailFast call.
This looks like too many pieces to pass as individual arguments. We may want to stash it all into a struct and pass the pointer to the struct into the handler.
is the location/timing of this call sufficient for the purpose?
The current fatal error processing does multiple things:
We may want to provide more fine-grained control over all these steps or insert this callback as the very first step, before anything gets printed to the console.
79706 (comment) also suggested GetIsManagedCode(). I am not sure how that would be helpful. Is there a back story to that?
Yeah, the proposed GetIsManagedCode callback is not a good design, but the problem that it tried to solve is still there.
The problem you hit when implementing signal handlers on Unix is whether your signal handler should take over the process (works well for executables) or whether it should cooperate with other components that may be loaded in the process (works well for libraries).
If the component signal handler sees that the crashing IP is in the component code, it can assume that the component should handle it. The problem is with what to do if the component signal handler sees that the IP is in somebody else's code. Should it pass it to the previous signal handler (there may be none) or should it take over reporting it as a fatal crash?
We may want to provide more fine-grained control over all these steps or insert this callback as the very first step, before anything gets printed to the console.
Right. If the idea is that the handler should be able to completely take over and do just its thing, we may also need a way for the handler to tell that the runtime's actions are not interesting - i.e. by returning false.
We would need to ensure that all the crash paths go through it though. CrashDumpAndTerminateProcess is a convenient choke point, but for the handler to take over, we would need to call it earlier. We may have to do some processing of the crash if we want to provide more info to the handler, but we will have to call it before we did anything observable to the end-user like printing to console or dumping files.
We may want to pass in all information that is required to implement our own fatal error handler
That is basically the stuff passed to EEPolicy::HandleFatalError. That was the other candidate from where to call the handler and pass all the info that is known at the time.
The only confusing part is that some crashing paths bypass EEPolicy::HandleFatalError and go directly to CrashDumpAndTerminateProcess, but I think it can be changed.
In a signal case we may need to call them earlier though.
This looks like too many pieces to pass as individual arguments. We may want to stash it all into a struct and pass the pointer to the struct into the handler.
Some of these pieces would be optional and will have default values, depending on scenario (i.e. AV vs. SO vs. intentional failfast).
I think having many arguments is ok. It might also be easier to add something to the argument list in a compatible way in V-next, if needed.
The problem you hit when implementing signal handlers on Unix is whether your signal handler should take over the process (works well for executables) or whether it should cooperate with other components that may be loaded in the process (works well for libraries).
I have only two ideas here:
true
we continue with our routine, which might end up calling another signal handler.
Background and Motivation
Scenarios like designers or REPLs that host user-provided code are not able to handle unhandled exceptions thrown by the user-provided code. Unhandled exceptions on the finalizer thread, threadpool threads or user-created threads will take down the whole process. This is not a desirable experience for these types of scenarios.
The discussion that led to this proposal is in https://github.com/dotnet/runtime/issues/39587
Proposed API
Allow ignoring unhandled exceptions on threads created by the runtime from managed UnhandledException handler:
Usage Examples
Alternative Designs
Unmanaged hosting API that enables this behavior. (CoreCLR has poorly documented and poorly tested configuration option for this today.)
Similar prior art:
UnobservedTaskExceptionEventArgs.Observed + UnobservedTaskExceptionEventArgs.SetObserved
CancelEventArgs.Cancel
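As a reminder of how the first piece of prior art behaves, the sketch below marks an unobserved task exception as observed. The event args are constructed by hand for illustration; in a real app the runtime raises `TaskScheduler.UnobservedTaskException` when a faulted task is finalized without its exception having been observed.

```csharp
using System;
using System.Threading.Tasks;

// The handler opts in to "I have dealt with this" by calling SetObserved().
EventHandler<UnobservedTaskExceptionEventArgs> handler = (sender, e) => e.SetObserved();
TaskScheduler.UnobservedTaskException += handler;

// Simulate the runtime raising the event.
var args = new UnobservedTaskExceptionEventArgs(
    new AggregateException(new InvalidOperationException("lost")));

Console.WriteLine(args.Observed); // False
handler(null, args);
Console.WriteLine(args.Observed); // True

TaskScheduler.UnobservedTaskException -= handler;
```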
Risks
This API can be abused to ignore unhandled exceptions in scenarios where it is not warranted.