dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.27k stars 4.73k forks source link

UdpClient hangs when Receive is waiting and gets a Close() in macos #64551

Open jaimefinat opened 2 years ago

jaimefinat commented 2 years ago

Hi we are experiencing an issue in macos (Monterey) when closing a UdpClient socket from the server side. Also related to previously detected issue #21392

  1. We create the UdpClient: mUdpClient = new UdpClient(listenPort);
  2. The UdpClient calls Receive method: byte[] bytes = mUdpClient.Receive(ref groupEP);
  3. While waiting to receive something from any client, the Close() method is called in another thread: mUdpClient.Close();

Attached to this, I'm sending a minimum version of a Server-Client application to reproduce the issue.

Expected behaviour: In Windows it is rising an exception which allos us to stop the server:

System.Net.Sockets.SocketException (10004): A blocking operation was interrupted by a call to WSACancelBlockingCall.
   at System.Net.Sockets.Socket.ReceiveFrom(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint& remoteEP)
   at System.Net.Sockets.UdpClient.Receive(IPEndPoint& remoteEP)
   at Server.StartListener() in C:\experiments\client-server\server\Server.cs:line 21

In macos it hangs in Receive until it gets a message from the client, so it processes the request and then triggers another exception:

Unhandled exception. System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
   at System.Net.Sockets.Socket.ThrowObjectDisposedException()
   at System.Net.Sockets.Socket.ReceiveFrom(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint& remoteEP)
   at System.Net.Sockets.UdpClient.Receive(IPEndPoint& remoteEP)
   at Server.StartListener() in /Users/daniel.hompanera/wkspaces/UdpApp/Server/Server.cs:line 21
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
[1]    29462 abort      dotnet Server.dll

To reproduce the issue (different behaviour in macos/windows) download and build the solution from this post, start the server and type: efss to EndFromServerSide.

The code has been modified from the basic version here

Thank you very much for your kind support! Jaime

UdpApp.zip

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
Hi we are experiencing an issue in macos (Monterey) when closing a UdpClient socket from the server side. Also related to previous detected issue [#21392](https://github.com/dotnet/runtime/issues/24513) 1. We create the UdpClient: `mUdpClient = new UdpClient(listenPort);` 2. The UdpClient calls Receive method: `byte[] bytes = mUdpClient.Receive(ref groupEP);` 3. While waiting to receive something from any client, the Close() method is called in another thread: mUdpClient.Close(); Attached to this, I'm sending a minimum version of a Server-Client application to reproduce the issue. Expected behaviour: In Windows it is rising an exception which allos us to stop the server: ``` System.Net.Sockets.SocketException (10004): A blocking operation was interrupted by a call to WSACancelBlockingCall. at System.Net.Sockets.Socket.ReceiveFrom(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint& remoteEP) at System.Net.Sockets.UdpClient.Receive(IPEndPoint& remoteEP) at Server.StartListener() in C:\experiments\client-server\server\Server.cs:line 21 ``` In macos it hangs in Receive until it receives a message from the client, so it process the request and then triggers another exception: ``` Unhandled exception. System.ObjectDisposedException: Cannot access a disposed object. Object name: 'System.Net.Sockets.Socket'. at System.Net.Sockets.Socket.ThrowObjectDisposedException() at System.Net.Sockets.Socket.ReceiveFrom(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint& remoteEP) at System.Net.Sockets.UdpClient.Receive(IPEndPoint& remoteEP) at Server.StartListener() in /Users/daniel.hompanera/wkspaces/UdpApp/Server/Server.cs:line 21 at System.Threading.ThreadHelper.ThreadStart_Context(Object state) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) --- End of stack trace from previous location --- at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ThreadHelper.ThreadStart() [1] 29462 abort dotnet Server.dll ``` To reproduce the issue (different behaviour in macos/windows) download and build the solution from this post, start the server and type: _efss_ to EndFromServerSide. The code has been modified from the basic version [here](https://docs.microsoft.com/en-us/dotnet/framework/network-programming/using-udp-services) Thank you very much for your kind support! Jaime [UdpApp.zip](https://github.com/dotnet/runtime/files/7973222/UdpApp.zip)
Author: jaimefinat
Assignees: -
Labels: `area-System.Net.Sockets`, `untriaged`
Milestone: -
wfurt commented 2 years ago

may be related to dotnet/corefx#38804. I think Disconnect does not work on UDP. One way how to work around it is to use ReceiveAsync. cc: @tmds

jaimefinat commented 2 years ago

We are using a timeout, as the async alternative would imply some more refactoring and redesign. I hope it is also OK.

In case someone else gets into this thread, our workaround:

  1. Set the timeout property: UdpClient.Client.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReceiveTimeout, 500);
  2. Wrap inside a try-catch the Receive method
  3. Catch the SocketError exception and assure its ErrorCode is TimedOut:
    catch (SocketException ex)
    {
    // Attention, do not use int codes as they might be different between OS's:
    // https://blog.jetbrains.com/dotnet/2020/04/27/socket-error-codes-depend-runtime-operating-system/
    if (ex.SocketErrorCode == SocketError.TimedOut)
        continue;
  4. In case it is TimedOut, continue with the loop and Receive again :

Thank you for your quick response! Best, Jaime

wfurt commented 2 years ago

Understand. However, I'm not sure if there is any good fox for this case. You could use something like

using CancellationTokenSource cts = new CancellationTokenSource();
cts.CancelAfter(timeout);
var result = socket.ReceiveAsync(memory, flags, cts.Token).GetAwaiter().GetResult();

While this is generally not recommended it should avoid the hang. The exception would be different.

Actually digging deeper this seems like dup of https://github.com/dotnet/runtime/issues/47342

https://github.com/dotnet/runtime/blob/a2235e775b55efab0e195d2850f1b4bb12c8a4f1/src/libraries/System.Net.Sockets/tests/FunctionalTests/ReceiveFrom.cs#L174-L181

tmds commented 2 years ago

It's a known limitation that we're unable to unblock sync receives and TCP connects:

https://github.com/dotnet/runtime/blob/daf77fc9a5ee0417d691a19322cc0bc74d8562df/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Unix.cs#L189-L190

You need to use an async API if you want to be able to cancel them.

wfurt commented 2 years ago

I was thinking about it more @tmds and I'm wondering if we can do something like:

 tmp = handle;
 handle = -1;
 close(tmp);

e.g. set handle to invalid to prevent future operations or mixup when the handle is closed but actually use close() to finalize all pending operations.

The FIN behavior does not matter for UDP and may be acceptable for TCP in corner cases. Still feels better than blocking forever.

tmds commented 2 years ago

On macOS, when there is a pending UDP receive, close does not cause it to return. It's not something we're causing, it is how the kernel behaves.

wfurt commented 2 years ago

I don't think that is true @tmds. Consider following example

using System;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;
using System.Runtime.InteropServices;

namespace udpListener
{
    class Program
    {
        [DllImport("libc", EntryPoint = "close", SetLastError = true)]
        static extern internal int MyClose(IntPtr handle);

        static void Receive(Socket s)
        {
            try
            {
                byte[] b = new byte[10024];
                int received = s.Receive(b);
                Console.WriteLine("Received {0} bytes", received);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
            }
        }
        static void Main(string[] args)
        {
            int listenPort = 9000;
            var mUdpClient = new UdpClient(listenPort);
            Console.WriteLine("Got listening socket on {0}", mUdpClient.Client.Handle);
            var t = Task.Run(() => Receive(mUdpClient.Client));
            Thread.Sleep(1000);
            Console.WriteLine("Cancelling");

            if (args.Length > 0)
            {
                MyClose(mUdpClient.Client.Handle);
            }
            else
            {
                mUdpClient.Client.Close();
            }

            t.GetAwaiter().GetResult();
            Console.WriteLine("All done");
        }
    }
}

It hangs when executed without parameters but

Shining:udpListener furt$ ~/dotnet-6/dotnet run a
Hello World!
Got listening socket on 33
Cancelling
System.Net.Sockets.SocketException (89): Operation canceled
   at System.Net.Sockets.Socket.Receive(Byte[] buffer)
   at udpListener.Program.Receive(Socket s) in /Users/furt/projects/udpListener/Program.cs:line 20
All done

I think it is OK to close the handle while used in p/invoke but we need to avoid case when it is closed but old value could be used. This is why I suggested to set the handle to invalid to prevent further use before closing the handle.

tmds commented 2 years ago

I don't think that is true

Yes, you're right. Calling close causes the recv to unblock.

This is why I suggested to set the handle to invalid to prevent further use before closing the handle.

This has a race. When the handle gets closed between adding the reference to the SafeHandle and passing the fd to the syscall, the fd passed to the syscall may no longer refer to the right socket.

wfurt commented 2 years ago

This has a race. When the handle gets closed between adding the reference to the SafeHandle and passing the fd to the syscall, the fd passed to the syscall may no longer refer to the right socket.

can we fix that somehow? I'm not sure if setting the handler to invalid is sufficient but it can be start.

tmds commented 2 years ago

can we fix that somehow? I'm not sure if setting the handler to invalid is sufficient but it can be start.

No. This is the race:

Thread A: takes ref on safe handle
Thread A: int fd = safehandle.DangerousGetHandle();
Thread B: close(fd)
Thread B: int s = socket(); // the raw handle 's' equals the 'fd'
Thread A: recvmsg(fd); // recvmsg was supposed to be called on safehandle, but it is called on 's' instead.
karelz commented 2 years ago

Triage: We don't see a way how to fix it. Given that this is the first report, we will keep it open for Future. The only option we thought about is to call close early, which would invalidate some additional assumptions on parallel calls.

Workaround is to use ReadAsync and wait for result - while it is not best practice, it may be decent alternative in this case.