Closed maksimkim closed 2 years ago
Notes:
That exception message maps to: ERROR_OPERATION_ABORTED 995 (0x3E3) https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--500-999-
You're setting the response Content-Length and writing the entire body at once, so the native API that would be returning this error would be HttpSendHttpResponse. https://docs.microsoft.com/en-us/windows/win32/api/http/nf-http-httpsendhttpresponse
The biggest difference setting the Content-length causes is that the server knows this is the last write so it doesn't set HTTP_SEND_RESPONSE_FLAG_MORE_DATA and do another zero length call to HttpSendResponseEntityBody later to finish the body. That might mean the issue is specific to HttpSendHttpResponse or that it's a matter of timing.
Accessing CallCancelled shouldn't have any effect on response writes, it must be causing some state change inside Http.Sys by calling HttpWaitForDisconnectEx.
Modified the server example to split response payload into 2 chunks and can confirm that only the last write fails. The first one never fails.
Update. 1) Issue is reproducible on both Win10 and Win11 2) Decrypted TLS shows the following pattern for each request:
Meaning TCP connections are not reused and gracefully closed by the client after each request. Also there's 3rd 0-length DATA frame which is sent by http.sys on it's own. App doesn't send it.
3) Issue has a clear correlation with accessing context.Request.CallCancelled. Without it no errors pop up in the application on any OS.
I was able to repro this using Win11 and the curl implementation from WSL.
for i in {1..5000}; do curl -k https://localhost:9090 --resolve localhost:9090:172.23.208.1 --output /dev/null; done
Confirmed it goes away if I comment out the line accessing CallCancelled. It's not clear how that would affect the last write since the token isn't passed to the write.
Interestingly it does repro against Katana, but not against AspNetCore's HttpSys server (6.0, 5.0, nor 2.1 on .NET Core 2.1 or .NET Framework 4.x). The AspNetCore implementation is derived from the Owin and HttpListener one. https://github.com/aspnet/AspNetKatana/blob/dev/src/Microsoft.Owin.Host.HttpListener/DisconnectHandler.cs https://github.com/dotnet/aspnetcore/blob/release/2.1/src/Servers/HttpSys/src/NativeInterop/DisconnectListener.cs
I don't see any significant differences in the disconnect registration logic. The biggest differences I can find between the server implementations is that AspNetCore uses ThreadPoolBoundHandle instead of ThreadPool.BindHandle to set up its overlapped operations.
Variations tried:
HttpListener.IgnoreWriteExceptions = true;
nor HttpSysOptions.ThrowWriteExceptions = true
changed the behavior in either repro.Content-Length
: This causes the underlying server to assume Transfer-Encoding: chunked
and issue one more write when cleaning up the response. I didn't see that extra write ever fail with this error, but even observing that write inside the server requires the debugger and may throw off the timing.If I check the token state before and after the write like this:
try
{
if (token.IsCancellationRequested) Console.WriteLine("Token Canceled before write.");
await res.Body.WriteAsync(this.payload, 0, this.payload.Length);
if (token.IsCancellationRequested) Console.WriteLine("Token Canceled after write.");
}
catch (Exception ex)
{
Console.WriteLine("Exception. IsTokenCanceled? " + token.IsCancellationRequested + "; " + ex.InnerException?.Message);
throw;
}
I find that the token is frequently canceled after the write even in success cases. The token is always canceled after when I get this exception, so they're certainly related.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Token Canceled after write.
Exception. IsTokenCanceled? True; The I/O operation has been aborted because of either a thread exit or an application request
But why is the token cancelled? I see a TLS alert warning in that trace.
In AspNetCore the token is also frequently canceled after the write finishes, but there's no write exception.
Offline update: I captured some ETW traces and they showed that HttpCancelHttpRequest was called after HttpWaitForDisconnectEx, and that's why HttpSendHttpResponse failed. I was able to backtrack that to this callstack that shows it being called indirectly from our HttpWaitForDisconnectEx callback.
AspNetCore does not immediately call Abort when the token fires. That explains the different behaviors across libraries. https://github.com/dotnet/aspnetcore/blob/207854d30d1c2e9ea93d1df64044c3724ce5d5cd/src/Servers/HttpSys/src/RequestProcessing/RequestContext.cs#L57
There’s still the question of what’s causing HttpWaitForDisconnectEx to fire, and why it happens so quickly that HttpSendHttpResponse fails even though the client is able to consume the response. I did some testing with an alternate client that re-uses the connection and saw a similar result. Here’s what I think is happening:
I've confirmed with the Windows team that this is the expected behavior, and that the race between HttpWaitForDisconnectEx and HttpSendHttpResponse completing is all about threading.
Best case we'd disconnect the CancellationToken from Abort like in AspNetCore, but that's a risky change at this point.
AspNetCore would be vulnerable to this if you passed the CancellationToken to WriteAsync, however, you'd only ever see a TaskCancelledException, the HttpSys error would be supressed.
Final payload write intermittently fails with the error: System.IO.IOException ---> System.Net.HttpListenerException: The I/O operation has been aborted because of either a thread exit or an application request at System.Net.HttpResponseStream.EndWrite(IAsyncResult asyncResult) at System.IO.Stream.<>c.b__53_1(Stream stream, IAsyncResult asyncResult)
at System.Threading.Tasks.TaskFactoryd36.MoveNext()
--- End of inner exception stack trace ---
at Microsoft.Owin.Host.HttpListener.RequestProcessing.ExceptionFilterStream.d 36.MoveNext()
1.FromAsyncTrimPromise
1.Complete(TInstance thisRef, Func`3 endMethod, IAsyncResult asyncResult, Boolean requiresSynchronization) --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.Owin.Host.HttpListener.RequestProcessing.ExceptionFilterStream.when the following conditions are applied:
Repro:
OS: Windows 10/Window Server 2016 .NET Framework: 4.6.2+ Owin host: HttpListener
I assume it's somehow related to a native http.sys methods invocation inside DisconnectHandler