Kitura / Kitura-WebSocket-NIO

A SwiftNIO based implementation of WebSocket for Kitura
Apache License 2.0
18 stars 13 forks source link

Not receiving all disconnected notifications #38

Open bridger opened 5 years ago

bridger commented 5 years ago

I'm having trouble cleaning up old connections in my application server. I investigated and saw that it isn't receiving the disconnected callback (defined on WebSocketService.swift) for each connection.

I tried refreshing a browser a lot so there are lots of short lived connections. I got 81 callbacks saying there was a new connection, but only 41 callbacks saying a connection had closed.

These connections probably weren't shut down properly client-side, so the server might not know they are dead. I had a ping handler set up to clean up dead connections, but it looks like that API is only in the non-NIO version of websockets. Can we port over that API? https://github.com/IBM-Swift/Kitura-WebSocket/pull/48

bridger commented 5 years ago

@pushkarnk On another issue you said

I tested a very simple scenario of handful browser tabs connecting to a ws server and disconnecting. I could see the disconnected(connection:reason) callback being invoked. I'll wait for more details what you observed.

I am seeing disconnected called most of the time, just not every time. The way I measured it was by putting a log on each connect and disconnect. I refreshed the browser a bunch of times and then I counted how many messages for connected and disconnected were in the log.

If you want to share the code for your scenario I can see if I can repro it with your setup, or if it is specific to my application.

Either way, I think the pinging feature would be super useful. It is often the case that a connection dies silently without the client sending a "goingAway" message. The only way I know to detect those ghost connections is by pinging them to test if they are alive.

pushkarnk commented 5 years ago

Sorry, I had posted something related to another issue here. Deleted it. Please ignore the notifications, if any!

pushkarnk commented 5 years ago

@bridger I wasn't able to repro what you said with 10-12 tabs. So, I wrote a JS client that makes N connections to the test WebSocket server, sends two messages and closes one connection per second.

If I make more than 20 connections, I see that the disconnected() callback hasn't been invoked on a handful of them. The number of misses is, of course, random.

You are right. This needs investigation.

pushkarnk commented 5 years ago

I also found a new Kitura-NIO bug testing with the JS client: https://github.com/IBM-Swift/Kitura-NIO/pull/215

pushkarnk commented 5 years ago

Ported IBM-Swift/Kitura-WebSocket#48 through #40

pushkarnk commented 5 years ago

I think I have found the reason for the WebSocket connection not receiving all the disconnected notifications. It is a bug in Kitura-NIO. Here's the possible fix: https://github.com/IBM-Swift/Kitura-NIO/pull/217

I ran a WebSocket client with 500 concurrent connections and closing them one connection per second. We seem to be getting the disconnected callback invoked on all of the connections.

bridger commented 5 years ago

Today I tried updating everything (Kitura-WebSocket-NIO was on commit https://github.com/IBM-Swift/Kitura-WebSocket-NIO/commit/b98bb3e72c6ff7fb4fd084985cdf427dfa2fd7f0 and Kitura-NIO was at version 2.2.0). I am unfortunately still seeing some connections never get disconnected. I'm afraid this would cause memory leaks that would crash the server pretty often. If there are any other tests I can run, let me know.

weissi commented 5 years ago

@bridger to help the Kitura folks debug, could you check if NIO sends a channelInactive event?

bridger commented 5 years ago

@weissi helped look into this. It looks like Kitura doesn’t implement the channelInactive or errorCaught callbacks. Without these, we’d only get notified if the channel closes cleanly. If the tcp channel is interrupted then Kitura won’t know about it. The solution is likely implementing both of those callbacks and then notifying about the close.

Sent with GitHawk

pushkarnk commented 5 years ago

Thanks @weissi and @bridger for the investigation - Yes, we should implement those callbacks. I'll take that up (Sorry for the late response, I was away yesterday).