dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.97k stars 4.66k forks source link

Unobserved Task Exception after closing ClientWebSocket #80116

Open mus65 opened 1 year ago

mus65 commented 1 year ago

Description

We randomly get the following unobserved task exception in our logs:

Unobserved Task Exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..)
 ---> System.IO.IOException: Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..
 ---> System.Net.Sockets.SocketException (995): Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory`1 destination)
   at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---

I managed to narrow this down to the ClientWebSocket close handshake and can reproduce this (see Reproduction Steps below). It happens when the closing of the connection takes too long after the close handshake finished. I think the issue is here:

https://github.com/dotnet/runtime/blob/ac2ffdf4ff87e7e3a5506a8ef69ce633f6fcda85/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.cs#L985

When WaitAsync() throws a TimeoutException, it gets ignored, but the exception from the original finalReadTask (which may happen later) is never observed in this case.

Reproduction Steps

using System.Net.WebSockets;

namespace WebApplication2
{
    public class Program
    {
        public static async Task Main(string[] args)
        {
            TaskScheduler.UnobservedTaskException += (obj, args) =>
            {
                Console.Error.WriteLine("Unhandled exception: " + args.Exception);
            };

            var builder = WebApplication.CreateBuilder(args);
            var app = builder.Build();

            app.UseWebSockets();

            app.Use(async (HttpContext context, RequestDelegate next) =>
            {
                WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync();
                await webSocket.ReceiveAsync(new ArraySegment<byte>(new byte[4096]), CancellationToken.None);
                await webSocket.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, "normal", CancellationToken.None);
                await Task.Delay(10_000);
            });

            await app.StartAsync();

            ClientWebSocket clientWebSocket = new ClientWebSocket();
            await clientWebSocket.ConnectAsync(new Uri("ws://localhost:5065"), CancellationToken.None);
            await clientWebSocket.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, "normal", CancellationToken.None);
            await clientWebSocket.ReceiveAsync(new ArraySegment<byte>(new byte[4096]), CancellationToken.None);
            clientWebSocket.Dispose();
            await Task.Delay(5_000);
            GC.Collect(2);
            GC.WaitForPendingFinalizers();
            await Task.Delay(100_000);
        }
    }
}

Expected behavior

No unobserved task exception.

Actual behavior

Unobserved task exception:

Unobserved Task Exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..)
 ---> System.IO.IOException: Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..
 ---> System.Net.Sockets.SocketException (995): Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory`1 destination)
   at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---

Regression?

No response

Known Workarounds

No response

Configuration

.NET 6.0.11 on Windows 11 x64.

Other information

No response

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
### Description We randomly get the following unobserved task exception in our logs: ``` Unobserved Task Exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..) ---> System.IO.IOException: Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen.. ---> System.Net.Sockets.SocketException (995): Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen. --- End of inner exception stack trace --- at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken) at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter) at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer) at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory`1 destination) at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken) --- End of inner exception stack trace --- ``` I managed to narrow this down to the ClientWebSocket close handshake and can reproduce this (see Reproduction Steps below). It happens when the closing of the connection takes too long after the close handshake finished. I think the issue is here: https://github.com/dotnet/runtime/blob/ac2ffdf4ff87e7e3a5506a8ef69ce633f6fcda85/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.cs#L985 When `WaitAsync()` throws a `TimeoutException`, it gets ignored, but the exception from the original `finalReadTask` (which may happen later) is never observed in this case. ### Reproduction Steps ``` using System.Net.WebSockets; namespace WebApplication2 { public class Program { public static async Task Main(string[] args) { TaskScheduler.UnobservedTaskException += (obj, args) => { Console.Error.WriteLine("Unhandled exception: " + args.Exception); }; var builder = WebApplication.CreateBuilder(args); var app = builder.Build(); app.UseWebSockets(); app.Use(async (HttpContext context, RequestDelegate next) => { WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync(); await webSocket.ReceiveAsync(new ArraySegment(new byte[4096]), CancellationToken.None); await webSocket.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, "normal", CancellationToken.None); await Task.Delay(10_000); }); await app.StartAsync(); ClientWebSocket clientWebSocket = new ClientWebSocket(); await clientWebSocket.ConnectAsync(new Uri("ws://localhost:5065"), CancellationToken.None); await clientWebSocket.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, "normal", CancellationToken.None); await clientWebSocket.ReceiveAsync(new ArraySegment(new byte[4096]), CancellationToken.None); clientWebSocket.Dispose(); await Task.Delay(5_000); GC.Collect(2); GC.WaitForPendingFinalizers(); await Task.Delay(100_000); } } } ``` ### Expected behavior No unobserved task exception. ### Actual behavior Unobserved task exception: ``` Unobserved Task Exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen..) ---> System.IO.IOException: Unable to read data from the transport connection: Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen.. ---> System.Net.Sockets.SocketException (995): Der E/A-Vorgang wurde wegen eines Threadendes oder einer Anwendungsanforderung abgebrochen. --- End of inner exception stack trace --- at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken) at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter) at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer) at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory`1 destination) at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken) --- End of inner exception stack trace --- ``` ### Regression? _No response_ ### Known Workarounds _No response_ ### Configuration .NET 6.0.11 on Windows 11 x64. ### Other information _No response_
Author: mus65
Assignees: -
Labels: `area-System.Net`
Milestone: -
MihaZupan commented 1 year ago

Thanks for the great repro, this does look like a bug. Introduced by #56282 (in .NET 6.0).

This should be fairly easy to fix (that task should be awaited in the catch block after the call to Abort). @mus65 would you be interested in contributing a PR?

MihaZupan commented 1 year ago

Triage: UnobservedTaskException are annoying and this one should be simple to fix, moving to 8.0 for now.

mus65 commented 1 year ago

@MihaZupan But the await would need to wrapped in another try/catch, right? Otherwise we would throw here which currently cant' happen.

Would this change require a test case? If not, I can make a PR. I'm not sure If I could write a consistent test case for this (especially since I assume I can't use ASP .NET Core in runtime tests).

MihaZupan commented 1 year ago

I think the cleanest option would be to revert to how this was written before #56282, something like

using var finalCts = new CancellationTokenSource(WaitForCloseTimeoutMs);
using (finalCts.Token.UnsafeRegister(static s => ((ManagedWebSocket)s!).Abort(), this))
{
    try
    {
        await finalReadTask;
    }
    catch { }
}

Re: tests, it's generally harder to write ones that verify that all task exceptions are observed. While having a regression test to make sure we don't reintroduce the same mistake in the future would be nice, we'd take a PR even without it.

We don't use ASP.NET in our tests, but we have our own "loopback server". You can see some examples of how that might look like here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.WebSockets.Client/tests/SendReceiveTest.cs.

stephentoub commented 1 year ago

I think the cleanest option would be to revert to how this was written before https://github.com/dotnet/runtime/pull/56282, something like

finalReadTask is a ValueTask. Wouldn't your suggestion result in it possibly being consume twice? Or have I misunderstood what you're suggesting?

stephentoub commented 1 year ago

While having a regression test to make sure we don't reintroduce the same mistake in the future would be nice, we'd take a PR even without it.

I think a PR needs a test. If it's worth fixing this to ensure we're not raising the event, then it's worth having a test for it.

MihaZupan commented 1 year ago

finalReadTask is a ValueTask. Wouldn't your suggestion result in it possibly being consume twice? Or have I misunderstood what you're suggesting?

I meant we'd replace the current logic (WaitAsync) with setting up the CTS and await the value task once like in my example above. I don't think that could lead to awaiting twice?

stephentoub commented 1 year ago

I meant we'd replace the current logic (WaitAsync) with setting up the CTS and await the value task once like in my example above. I don't think that could lead to awaiting twice?

Ah, replacing it would be fine from the perspective of waiting for it twice.

However, there is no guarantee that Dispose'ing of the Stream will result in the task returned from ReadAsync completing. That's why it's written the way it is.

MihaZupan commented 1 year ago

However, there is no guarantee that Dispose'ing of the Stream will result in the task returned from ReadAsync completing. That's why it's written the way it is.

We're relying on this being the case in HttpConnection though, no?

stephentoub commented 1 year ago

Yes, but only because at the time it was written we had no other option, e.g. there was no cancellation story for NetworkStream / Socket. I believe there's an open issue to switch HttpConnection over to using CancellationTokens threaded through all the calls instead of this approach with disposing streams, which is not thread-safe and not guaranteed to work, especially with ConnectCallback able to supply an arbitrary Stream.

wfurt commented 3 months ago

triage: not critical for 9.0, moving to future

flagbug commented 2 months ago

We're seeing the same issue in our application and our Sentry error reporting is picking up these unobserved task exceptions. Took forever to determine that this is not a problem in our own code, but in the framework 😬

LeadAssimilator commented 4 days ago

Same issue with Udp/TcpClient. After closing it, every pending receive operation started with BeginReceive will emit the aforementioned unobserved task exception.

Just ran into this on maui ios where the unobserved task handler was attempting to break if a debugger was attached, which due to some other bug, just hangs 95%+ of the time - what fun that was tracking down.

This bug should have been fixed by now...having the runtime generate these kinds of errors just wastes everyone's time and needlessly increases SNR. We rely on the UnobservedTaskException to point out where we are being idiotic, but if the runtime is being an idiot too, then we really can't reliably use it as a safety check.