kerryjiang / WebSocket4Net

A popular .NET WebSocket Client
Apache License 2.0
764 stars 273 forks source link

Packet loss points at resources not being disposed correctly internally #154

Open shinkathe opened 5 years ago

shinkathe commented 5 years ago

WebSocket4Net 0.15.2 .NET Framework 4.5.2 Windows 10

We have tested Websocket4Net extensively now with simulated packet loss, and we have repeatedly been able to reproduce creating a StackOverflow exception within Websocket4net which causes our application to crash.

We used https://jagt.github.io/clumsy/ to simulate a 95% packet loss and had code roughly like this to test connection retrying:

private void MonitorConnection()
{
    if (_websocket.State != WebSocketState.Connecting && _websocket.State != WebSocketState.Open)
    {
        _websocket.Open();
    }
}

public WebsocketNotifier(Action notified, string url)
{
    // Note that the URL we used, used HTTPS
    _websocket = new WebSocket(url)
    {
        AutoSendPingInterval = 2000,
        EnableAutoSendPing = true
    };

    _websocket.Security.AllowUnstrustedCertificate = true;

    _websocket.Opened += (sender, args) => _log.Info("Websocket notifier connection opened.");
    _websocket.Error += (sender, args) => _log.Error("Websocket notifier error occurred.");
    _websocket.Closed += (sender, args) => _log.Info("Websocket notifier closed occurred.");
    _websocket.MessageReceived += (sender, args) =>
    {
        _log.Info("Websocket notified: " + url);
        notified();
    };

    connectionMonitorTimer = new Timer((state) => MonitorConnection(), null, 0, 2000);
}

With code like this and packet loss at a high rate, we have observed that _websocket.Open(); call will begin erroring out after some time of use, and it will go as follows:

3 minutes of:

System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

This seems fine so far.

Then the error message will eventually turn into this:

System.IO.IOException: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
  at System.Net.Sockets.Socket.EndReceive(IAsyncResult asyncResult)
  at System.Net.Sockets.NetworkStream.EndRead(IAsyncResult asyncResult)
  --- End of inner exception stack trace ---
  at System.Net.Security.SslState.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
  at System.Net.Security.SslState.EndProcessAuthentication(IAsyncResult result)
  at SuperSocket.ClientEngine.SslStreamTcpSession.OnAuthenticated(IAsyncResult result)

And immediately after that exception, we will only be receiving this exception:

System.Net.Sockets.SocketException (0x80004005): Only one usage of each socket address (protocol/network address/port) is normally permitted
  at System.Net.Sockets.Socket.DoBind(EndPoint endPointSnapshot, SocketAddress socketAddress)
  at System.Net.Sockets.Socket.Bind(EndPoint localEP)
  at SuperSocket.ClientEngine.ConnectAsyncExtension.ConnectAsync(EndPoint remoteEndPoint, EndPoint localEndPoint, ConnectedCallback callback, Object state)

And exactly after 15 minutes, or 430 retries, the application will crash because of a stack overflow exception deep within the native code of this library.

Now, we have fixed this problem by adding a dispose call manually before each Open call, but this clearly points to a memory management issue inside the library that should be fixed. As per our observations, it is not safe to use the same instance of WebSocket again, and repeated calls to Open are not safe.