dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.96k stars 4.65k forks source link

Unknown socket error #64305

Open sliderhouserules opened 2 years ago

sliderhouserules commented 2 years ago

Description

We have an application based on template dotnet create webapi project template and just adds an IHostedService to do background processing. Kestrel is used in order to be able to answer health checks from Kubernetes, as well as feed new requests into the service to be queued. Our background processing logic involves pretty heavy interaction with AWS resources (SQS, S3, STS), and we are encountering SocketExceptions that have been very difficult to pin down.

The canon guidance to use a static HttpClient instance application-wide does not work well when using the AWS clients in Amazon's SDK packages. The various AWS client constructors, and the config objects you can hand in, do not accept an instance of HttpClient. Instead, you must either rely on their built-in caching of HttpClient instance(s) that are created internally, or you have to provide an implementation that derives from their HttpClientFactory. This is fairly straight-forward if you use IHttpClientFactory in .Net. The Create() method override that you implement simply calls .Create() on an instance of IHttpClientFactory provided via DI.

However, even when providing this HttpClientFactory to all instances of any AWS client that is instantiated anywhere and everywhere in this service, it does not solve the SocketException problem. I will provide the stack trace below, but the properties of the actual SocketException object are extremely unhelpful:

Unknown socket error; ErrorCode: -131074; SocketErrorCode: SocketError; NativeErrorCode: -131074

The twist here is that these errors have only been encountered on Linux/in containers. Extended running tests on Windows cannot reproduce the problem when tested under load.

The other twist is that our Kestrel calls are producing System.InvalidOperationException: Handle is already used by another Socket. errors (they are logged as Warning level) in Kestrel's handling pipeline. It would appear that Kestrel, and the HttpWebRequest pool used underneath IHttpClientFactory are stomping on each other. If I am mischaracterizing the issue and there is a different cause, please let me know.

The fact that the runtime is throwing a SocketException with an error code that isn't even in the spec for TCP errors (nice article here) is the reason I am bringing it to this group. The runtime is burping up what would appear to be the equivalent of the default case in a switch statement.

Any help or guidance here is appreciated.

Reproduction Steps

The basic structure is a Web API project created with dotnet create webapi with an IHostedService doing background processing interacting heavily with AWS resources. I can provide application code privately, as needed, so as to not have to share proprietary/private logic and access keys publicly.

Expected behavior

Interaction with AWS resources from a web application should not cause SocketExceptions.

Actual behavior

We are encountering SocketExceptions that have error codes that aren't even part of the TCP specs:

Unknown socket error; ErrorCode: -131074; SocketErrorCode: SocketError; NativeErrorCode: -131074

Full stack trace:

System.Net.Http.HttpRequestException: Unknown socket error (sqs.us-west-2.amazonaws.com:443)
 ---> System.Net.Sockets.SocketException (0xFFFDFFFE): Unknown socket error
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(HttpRequestMessage request)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendCoreAsync(HttpRequestMessage request, Context context, CancellationToken cancellationToken)
   at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates1 shouldRetryResultPredicates, Func5 onRetryAsync, Int32 permittedRetryCount, IEnumerable1 sleepDurationsEnumerable, Func4 sleepDurationProvider, Boolean continueOnCapturedContext)
   at Polly.AsyncPolicy1.ExecuteAsync(Func3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
   at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.SQS.Internal.ValidationResponseHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)
   at <our code starts here>

Regression?

The application in its current form is a .Net 6 solution. In a prior version, it was .Net 5 and the SocketExceptions were very rare if not non-existent.

However, even in the .Net 5 solution Kestrel was throwing the System.InvalidOperationException: Handle is already used by another Socket. warnings when handling basic health check calls.

Known Workarounds

We have implemented heavy retry policies using Polly, which allows the application to move past the errors, but it quite often takes a near complete restart of the background processing logic to clear up the SocketExceptions and allow the application to restart a job and create new connections to the AWS resources being used.

Configuration

The errors only appear running in containers on Linux.

Other information

N/A

stephentoub commented 2 years ago

But on Windows presumably that can't happen? Since this code ran on Windows fine for years.

A 0 handle is invalid on Windows, so this was effectively a nop.

a warning if initializing a SafeHandle to a 0 value

It's valid to create such a SafeFileHandle, e.g. if someone wanted to use it to read from stdin.

MMOSimca commented 2 years ago

Is it a rare enough use case that it is still worth doing a warning and having people who are doing it just add the warning ignore stuff around it to remove that?

I don't want to inconvenience people but this cost me months. I'm willing to accept that the real problem was PEBKAC, but surely I cannot be alone in experiencing this (not sure yet if @sliderhouserules has the same issue or not).

Why did Kestrel go silent and stop responding as a result (no input streams?)? Is there an improvement to be made there?

stephentoub commented 2 years ago

Is it a rare enough use case that it is still worth doing a warning and having people who are doing it just add the warning ignore stuff around it to remove that?

No: https://grep.app/search?q=new%20SafeFileHandle%28IntPtr.Zero

sliderhouserules commented 2 years ago

I'm not sure where I would put this... around every component that calls AWS? In my Startup? In my HttpClientFactory that the AWS clients use? Ugh. I don't think this will solve my Kestrel problems but I could be wrong. I'll see what I can sneak in to get this tested.

stephentoub commented 2 years ago

I'm not sure where I would put this

Where you would put what? @MMOSimca's app had bogus code that they deleted to solve the problem.

sliderhouserules commented 2 years ago

Oh, I thought they put it in to solve it, and the diff was just a different revision. I don't think this helps me much, as I wasn't using anything like this. Oh well.

MMOSimca commented 2 years ago

It's possible that your Kestrel is also being similarly starved of input streams, though. I hadn't really thought of the problem in that way before. Maybe something is wrong in one of your Dispose functions too?

Sucks that it wasn't the same issue for you.

Do you have any ideas about how we could make the issue I encountered less likely to cause bad effects for users, @stephentoub? Or is this just a situation where doing something valid in some contexts turned disasterous in another and the results are inevitable.

It does seem kind of disingenuous that playing with a 'Safe'Handle was possibly the least safe thing that could happen, but you did mention this is very old legacy naming.

stephentoub commented 2 years ago

Or is this just a situation where doing something valid in some contexts turned disasterous in another and the results are inevitable.

It is dutifully doing what you told it to do. You wanted to protect the specified file descriptor, such that if you neglected to close it, it would be closed for you to prevent a leak, and such that if you were actively using it while someone else tried to close it, you'd be protected from it being closed out from under you. And you specified a valid file descriptor value. It just so happens that you provided it a file descriptor you didn't mean to.

MMOSimca commented 2 years ago

At the very least, the changes made in the DLL provided by @tmds were fantastic and helped us narrow down the error much more closely. It would be great to add them to .NET.

I wonder if any of those socket error catches (I think he changed 3 places to expose stack trace) are unique to my problem. We could maybe warn people in the exception message that they might have terminated stdin?

roberthannon commented 2 years ago

We were having the exact same issues using Linux Fargate nodes on EKS - 502s, socket exceptions, corrupt responses. We managed to fix them by following the suggestion by MMOSimca to remove code which was creating and disposing SafeFileHandle(IntPtr.Zero) instances.

I assume the code was added by someone copy-pasting the example code shown in the IDisposable documenatation.

The concerning thing is the effect this had. We were seeing some of our logs being written into the HTTP response stream from Kestrel. The log messages are supposed to be written into a local file. Here's an example response payload which contains a log message before the HTTP headers: image

This hasn't happened since we made the code fixes. Not sure if this a sign of a deeper issue, or just our fault for misusing dangerous code... Anyway, big thanks to MMOSimca!

MMOSimca commented 2 years ago

Wow, thanks for finding the source of the problem! (I'm 100% sure this is what we modeled this after, causing the issue.)

Based on my understanding of SafeFileHandle from this thread, those code examples for IDisposable seem shockingly disasterous. Why would regular IDisposable handlers have SafeFileHandle(0) calls?

It's especially concerning that the page doesn't even mention this at all. Or the fact that the code does nothing at all on Windows, which was presumably the primary OS when this was originally written.

Is it a rare enough use case that it is still worth doing a warning and having people who are doing it just add the warning ignore stuff around it to remove that?

No: https://grep.app/search?q=new%20SafeFileHandle%28IntPtr.Zero

Out of these 21 results, 8 of them seem to be instances of this bug based on that flawed IDisposable example code.

There also seems to be a link between flawed use of this code and the second parameter (the boolean). Everyone intentionally using it because they know what they're doing is setting that second parameter to false, despite what the docs say.

All the people copying from the IDisposable example are setting it to true. Could we warn on these criteria (first parameter = 0, second parameter = true)?

roberthannon commented 2 years ago

I could be wrong, but I feel like the docs are just using the SafeFileHandle as an example of how to dispose "a handle". You should never actually use new SafeFileHandle(IntPtr.Zero, true);. It makes no sense - it's a no-op on Windows and triggers armageddon on Linux.

stephentoub commented 2 years ago

I feel like the docs are just using the SafeFileHandle as an example of how to dispose "a handle". You should never actually use new SafeFileHandle(IntPtr.Zero, true);. It makes no sense - it's a no-op on Windows and triggers armageddon on Linux.

Yes, the docs should be improved here. The intent I'm sure was to highlight how an IDisposable that owns a resource (a safe handle) can dispose of it. That should be made more clear, and an easy fix to the code (beyond comments that make it clear you shouldn't actually copy that SafeFileHandle into your own code) is to change the 0 to a -1.

The docs are open source. Contributions to improve them are welcome.

MMOSimca commented 2 years ago

@stephentoub Is there anything further we should do about the fact that Kestrel was writing requests with the wrong output stream when all of its input streams were closed?

That sounds like a security concern. An extremely obscure security concern to be sure, but probably something that warrants its own thread, right?

Plus, now that we know how to trigger it, it should be very easy to replicate in a simple PoC.

I'd also like it if Kestrel had said -anything- about the situation, even when connection logging was enabled or debugging. Throwing an exception seems okay for Kestrel to do in a situation as bizarre as this, but smarter minds than I will probably know whether or not that could negatively impact some people who were like... trying intentionally to use Kestrel without stdin.

stephentoub commented 2 years ago

Is there anything further we should do about the fact that Kestrel was writing requests with the wrong output stream when all of its input streams were closed?

@halter73?

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.

Issue Details
### Description We have an application based on template _dotnet create webapi_ project template and just adds an IHostedService to do background processing. Kestrel is used in order to be able to answer health checks from Kubernetes, as well as feed new requests into the service to be queued. Our background processing logic involves pretty heavy interaction with AWS resources (SQS, S3, STS), and we are encountering SocketExceptions that have been very difficult to pin down. The canon guidance to use a static HttpClient instance application-wide does not work well when using the AWS clients in Amazon's SDK packages. The various AWS client constructors, and the config objects you can hand in, do not accept an instance of HttpClient. Instead, you must either rely on their built-in caching of HttpClient instance(s) that are created internally, or you have to provide an implementation that derives from their [HttpClientFactory](https://github.com/aws/aws-sdk-net/blob/master/sdk/src/Core/Amazon.Runtime/Pipeline/HttpHandler/_netstandard/HttpRequestMessageFactory.cs#L38). This is fairly straight-forward if you use [IHttpClientFactory](https://docs.microsoft.com/en-us/aspnet/core/fundamentals/http-requests?view=aspnetcore-6.0) in .Net. The Create() method override that you implement simply calls .Create() on an instance of IHttpClientFactory provided via DI. However, even when providing this HttpClientFactory to all instances of any AWS client that is instantiated anywhere and everywhere in this service, it does not solve the SocketException problem. I will provide the stack trace below, but the properties of the actual SocketException object are extremely unhelpful: `Unknown socket error; ErrorCode: -131074; SocketErrorCode: SocketError; NativeErrorCode: -131074` The twist here is that these errors have only been encountered on Linux/in containers. Extended running tests on Windows cannot reproduce the problem when tested under load. The other twist is that our Kestrel calls are producing `System.InvalidOperationException: Handle is already used by another Socket.` errors (they are logged as Warning level) in Kestrel's handling pipeline. It would appear that Kestrel, and the HttpWebRequest pool used underneath IHttpClientFactory are stomping on each other. If I am mischaracterizing the issue and there is a different cause, please let me know. The fact that the runtime is throwing a SocketException with an error code that isn't even in the spec for TCP errors (nice article [here](https://blog.jetbrains.com/dotnet/2020/04/27/socket-error-codes-depend-runtime-operating-system/)) is the reason I am bringing it to this group. The runtime is burping up what would appear to be the equivalent of the `default` case in a `switch` statement. Any help or guidance here is appreciated. ### Reproduction Steps The basic structure is a Web API project created with _dotnet create webapi_ with an `IHostedService` doing background processing interacting heavily with AWS resources. I can provide application code privately, as needed, so as to not have to share proprietary/private logic and access keys publicly. ### Expected behavior Interaction with AWS resources from a web application should not cause SocketExceptions. ### Actual behavior We are encountering SocketExceptions that have error codes that aren't even part of the TCP specs: `Unknown socket error; ErrorCode: -131074; SocketErrorCode: SocketError; NativeErrorCode: -131074` Full stack trace: ``` System.Net.Http.HttpRequestException: Unknown socket error (sqs.us-west-2.amazonaws.com:443) ---> System.Net.Sockets.SocketException (0xFFFDFFFE): Unknown socket error at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) at System.Net.Sockets.Socket.g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken) --- End of inner exception stack trace --- at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(HttpRequestMessage request) at System.Threading.Tasks.TaskCompletionSourceWithCancellation1.WaitWithCancellationAsync(CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken) at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken) at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken) at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendCoreAsync(HttpRequestMessage request, Context context, CancellationToken cancellationToken) at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates1 shouldRetryResultPredicates, Func5 onRetryAsync, Int32 permittedRetryCount, IEnumerable1 sleepDurationsEnumerable, Func4 sleepDurationProvider, Boolean continueOnCapturedContext) at Polly.AsyncPolicy1.ExecuteAsync(Func3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext) at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) at System.Net.Http.HttpClient.g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken) at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken) at Amazon.Runtime.Internal.HttpHandler1.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext) at Amazon.SQS.Internal.ValidationResponseHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext) at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext) at ``` ### Regression? The application in its current form is a .Net 6 solution. In a prior version, it was .Net 5 and the SocketExceptions were very rare if not non-existent. However, even in the .Net 5 solution Kestrel was throwing the `System.InvalidOperationException: Handle is already used by another Socket.` warnings when handling basic health check calls. ### Known Workarounds We have implemented heavy retry policies using Polly, which allows the application to move past the errors, but it quite often takes a near complete restart of the background processing logic to clear up the SocketExceptions and allow the application to restart a job and create new connections to the AWS resources being used. ### Configuration - .NET 6. - Ubuntu 20.04 via the [base docker images](https://hub.docker.com/_/microsoft-dotnet-aspnet/) provided directly by Microsoft - x64, Kubernetes in AWS EKS The errors only appear running in containers on Linux. ### Other information N/A
Author: sliderhouserules
Assignees: -
Labels: `documentation`, `area-System.Runtime`
Milestone: 7.0.0
MMOSimca commented 2 years ago

Also, is there a use case for SafeFileHandle(IntPtr.Zero, true)? You established there's one for SafeFileHandle(IntPtr.Zero, false).

Given how much the questionable documentation has spread around, it would be great to have a code solution. I'd be happy to write something, but it does sound like based on this conversation that it would just be rejected.

tmds commented 2 years ago

Is there anything further we should do about the fact that Kestrel was writing requests with the wrong output stream when all of its input streams were closed?

If it is caused by SafeFileHandle(IntPtr.Zero, true): no. You musn't set ownsHandle to true for file descriptors you don't own.

Also, is there a use case for SafeFileHandle(IntPtr.Zero, true)? You established there's one for SafeFileHandle(IntPtr.Zero, false).

In practice, I don't think so.

Given how much the questionable documentation has spread around, it would be great to have a code solution.

You mean a better example for the IDisposable documentation?

The handle = new SafeFileHandle(IntPtr.Zero, true) is meant to represent an IDisposable that gets disposed under if (disposing). It could be replace by something else that is disposable. It doesn't even need to be a SafeHandle. It could for example be fs = File.OpenWrite(Path.GetTempFileName()).

MMOSimca commented 2 years ago

Given how much the questionable documentation has spread around, it would be great to have a code solution.

You mean a better example for the IDisposable documentation?

The handle = new SafeFileHandle(IntPtr.Zero, true) is meant to represent an IDisposable that gets disposed under if (disposing). It could be replace by something else that is disposable. It doesn't even need to be a SafeHandle. It could for example be fs = File.OpenWrite(Path.GetTempFileName()).

I mean in the constructor for SafeFileHandle, if the first parameter is 0 and the second parameter is true, throw an exception. Or can you do warnings like that (based on parameters)? I'm looking through private repositories I have access to and I see this in 1 other unrelated project, which is not great.

I agree with you that the documentation for IDisposable shouldn't mention SafeFileHandle at all. Very dangerous function calls don't have a place anywhere near a 'best practices how to implement IDisposable tutorial'. I see it copied around the internet to other places on how to implement the IDisposable pattern, and it's like they spread around a kind of knowledge infection where they are instead teaching how to ruin your program on Linux. The irony level for a code example teaching best practices is through the roof. I'm going around to a few of these places and pointing it out.

Considering that the 2nd example doesn't even contain a disposable object (it's just the code needed to dispose one if one did exist), I don't know that you need to have any representation of a disposable object at all rather than like a comment saying that this is where you'd place one. Do you have a preference? Is it standard to include a representative example line in docs like this or just use a comment (or nothing)?

tmds commented 2 years ago

Now that we've identified new SafeFileHandle(IntPtr.Zero, true); as a source of this problem. Is there still an open issue that needs to be investigated further? Or can this issue be closed?

sliderhouserules commented 2 years ago

The SafeFileHandle stuff is not my issue, no. We are using a third-party document processing engine that does use unmanaged code, and I do need to verify 100% that they are not actually doing this. But until then, we need to consider that my issue is caused by something else and I still need to do a thorough investigation. My time at work has not been able to put this up in the priority stack high enough to get to that just yet. I apologize. Can we please leave this issue open a bit longer?

tmds commented 2 years ago

Yes, we'll keep it open until you had some time to investigate and check whether the root cause is also the IntPtr.Zero SafeFileHandle.

SpringHgui commented 3 months ago

The problem remains in .net6

antonfirsov commented 3 months ago

@SpringHgui is it possible that your code or a library you rely on instantiates a new SafeFileHandle(IntPtr.Zero, true) somewhere? For example see https://github.com/fabian-blum/AspNetCore.Identity.LiteDB/pull/14 and https://github.com/Marilyth/MopsBot-2.0/pull/74

SpringHgui commented 3 months ago

The problem remains in .net8

wfurt commented 3 months ago

do you have repro you can share @SpringHgui ? The linked repo above has way too much stuff. Alternatively can you do the strace or get detailed diagnostic logs?

SpringHgui commented 3 months ago

do you have repro you can share @SpringHgui ? The linked repo above has way too much stuff. Alternatively can you do the strace or get detailed diagnostic logs?

I encountered this error because I used the websocket-sharp library, in this library I reported an error, the startup project was upgraded from .net6 to .net8, it still exists, and the specific location is not located at present, and the windwos server is replaced, and this problem no longer occurs

icefire1 commented 1 month ago

I encountered this socket error when working on a MAUI app. For me the error only happened on some old Android devices and not consistently. What seemed to fix it for me was to explicitly close my socket instead of relying on dispose being called for me.

Broken code

private static async Task<bool> ScanPort(string ipAddress, int port, TimeSpan timeout, CancellationToken token)
{
    using var tcpClient = new TcpClient();
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(token);
    cts.CancelAfter(timeout);

    await tcpClient.ConnectAsync(ipAddress, port, cts.Token);
    return tcpClient.Connected;
}

Working code

private static async Task<bool> ScanPort(string ipAddress, int port, TimeSpan timeout, CancellationToken token)
{
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(token);
    cts.CancelAfter(timeout);

    using var tcpClient = new TcpClient();
    await tcpClient.ConnectAsync(ipAddress, port, cts.Token);
    var didConnect = tcpClient.Connected;
    tcpClient.Close();
    return didConnect;
}
wfurt commented 1 month ago

interesting. how many destinations are you trying to scan @icefire1 ? I'm wondering if you run out of max open file descriptors in this case...

icefire1 commented 1 month ago

@wfurt difficult to say exactly as it happened inconsistently, but I would guess somewhere around 10-40 scans over a duration of a few seconds to a few minutes.

Perhaps it's because too many sockets where opened at once. I would add, however, that I cannot replicate it by simply opening a lot of socket like so:

var clients = new List<TcpClient>();
for (int i = 0; i < 100000; i++)
{
    clients.Add(new TcpClient());
}

Above code would indeed give me an exception, but it would be ErrorCode 23 instead. With a message "File table overflow". I tested the sample code on both a new and an old device (which gave me the unknown socket error).

wfurt commented 1 month ago

Makes me wonder if it is related to the CancellationTokenSource.CreateLinkedTokenSource and cancellation in general.