deepstreamIO / deepstream.io

deepstream.io server
https://deepstreamio.github.io
MIT License
7.13k stars 382 forks source link

Updates stop being sent to client after a few minutes from connecting. #1100

Closed redlock closed 3 years ago

redlock commented 3 years ago

Hello, I have been getting complaints from users that the client stops receiving updates after a few minutes (around 4 minutes) from connecting. I have seen this happen once with me as well. It is difficult to replicate and happens to some clients but not all.

When I encountered it I opened another client to see if it the server is still sending updates and it was sending updates to the second client. There is no errors nor does the socket gets closed it just doesn't get any updates. Restarting the server seems to resolve the issue temporarily until it starts happening again.

I would appreciate if anyone can give me guidance on how to diagnose this bug. Since it is difficult to reproduce, I am having trouble debugging it.

Notes:

This is using the old protocol v3 with this docker image (deepstreamio/deepstream.io:5.1.8) for the server.

Thanks

redlock commented 3 years ago

I do notice seeing "INFO | Heartbeat missing on websocket, terminating connection" log message a lot more in the newer versions. I don't remember seeing them in older versions of the server.

Also, does socket.destroy() notifies the client that the socket is closed? Or does the client remains open waiting for updates from the server even though the socket is closed on the server side?

I am starting to think this heartbeat mechanism might be the cause behind this bug.

yasserf commented 3 years ago

hey!

Thanks for raising this. Unless you're using apple silicon can you see if you can replicate this with 5.1.5? After that most fixes are related to heartbeats and pre v4 versions don't have tests running against it.

redlock commented 3 years ago

I will try the older version of the server (pre 5.1.5) and see how it goes.

redlock commented 3 years ago

I made a quick test for JS client (deepstream.io-client-js ) connecting to the latest version (5.2.3). The client is using the text v3 protocol.

Here is the setup and what I have observed:

It does seems that this new heartbeat implementation is causing issues with older clients that don't have heartbeats or aren't sending pings.

I am not sure if this explains what is happening in production though. In production the updates stop after 4 or so minutes and only for some users. The clients in production are sending pongs to the server when it pings them.

One possibility is that setTimeout (used to send Pings to the client) is delayed when the server or NodeJS event loop gets busy. For example if we have thousands of updates per second that are being received by the server. The event loop would need to process them and also handle hundreds of setTimeout function for each connection. Could this make the Ping sent to the client by setTimeout delayed? Instead of the server sending the ping every 30 seconds it get delayed and by the time it gets trigged the heartbeat check is triggered and the socket is destroyed.

According to NodeJS docs setTimeout is not guaranteed to run in the time given (if the event loop is blocked for example).

jaime-ez commented 3 years ago

Hi, what's your heartbetInterval on server and on client? And versions for both?

redlock commented 3 years ago

Hello, The server heartbetInterval interval is the default (30000). For the client I am using the Java client v2.2.2 which doesn't have a setting for heartbetInterval. It does respond to pings from the server though. There server versions I noticed the issue with are 5.1.8 and higher.

Another possibility is that the Java client pong might be slow to reach the server if the socket is very busy sending large amount of packets (or a slow connection). This might be even more true for slower devices (Android). It might explain why some users experience it persistently while others don't.

jaime-ez commented 3 years ago

And which server are you using: node-http or uws ?

redlock commented 3 years ago

node-http

redlock commented 3 years ago

The bug is still occurring with older versions of the server (pre 5.1.5). I am guessing this rules out it's a heartbeat issue. It could be something with the Java client. I am seeing the client disconnect in the server log but no connection change event on the client side.

redlock commented 3 years ago

I will close the issue now. I don't believe it is server related.

yasserf commented 3 years ago

hey,

Reopening this as the problem is it will happen with all older SDKS, plus development on any prior to v4 sdks is not a thing anymore. Also, heartbeats are dealt with seperately in the test-protocol parser which means it can fixed in server.

@jaime-ez any thoughts?

jaime-ez commented 3 years ago

Well, on the server side the text protocol implementation should send a PING every heartbeatInterval, and the JAVA client should respond with a PONG. That incoming PONG message to the server should update the lastMessageReceivedAt timestamp and therefore prevent the connection from being closed from the server side. I don't see what could it be, maybe as done for the JS client, the JAVA client should send the PING message to server every heartbeatInterval to prevent this?....those are my 2 cents

yasserf commented 3 years ago

yeah i don't have any recollection over the java sdk setup, it was build to be identical to the v2 js client and it wasn't exactly a deterministic client (which did allow it to perform faster). I'll see if I can get a quick idea on the logic, should in theory be the same as the v2 js client.

yasserf commented 3 years ago

Edit: Sorry made an incorrect analysis.

@jaime-ez i seem to have come to the exact conclusion as you.

The text protocol sends the ping every x seconds

The java sdk should reply with a pong: https://github.com/deepstreamIO/deepstream.io-client-java/blob/master/src/main/java/io/deepstream/Connection.java#L269

The server then ensures we recieved any message (doesn't have to be a pong)

https://github.com/deepstreamIO/deepstream.io/blob/0175edbac33d5cb6d822240d80ae3096fe1ef861/src/services/http/node/node-http.ts#L138

Generally @redlock what you can try to do is just send a 'heartbeart' event (or any other useless even noone is subscribed to) and see if that still breaks. If it does it means it's the server not running the code properly:

https://github.com/deepstreamIO/deepstream.io/blob/0175edbac33d5cb6d822240d80ae3096fe1ef861/src/services/http/node/node-http.ts#L151

Hope that makes sense.

redlock commented 3 years ago

Hi, The javascript client (older versions) disconnects from the server after two heartbeats.

jaime-ez commented 3 years ago

Hi, what do you mean by older versions ? I can not reproduce this

The javascript client (older versions) disconnects from the server after two heartbeats.

using server "@deepstream/server": "5.2.3", and client "deepstream.io-client-js": "^2.3.4"

The server sends the ping and the client responds with pong.

yasserf commented 3 years ago

@redlock is it possible you have mistmatching heartbeat config?

redlock commented 3 years ago

I was using the /deepstream-v3 endpoint.

slachtar commented 3 years ago

Hi @redlock, Sorry for this out of topic question, I'm working on a new project where I need a native android app to connect to deepstream server, I've a deepstream server with no auth running on localhost (default setup), could you please show me the minimal code required in java to connect to deepstream server. Thank you,

jaime-ez commented 3 years ago

Hi everyone, I'm going to close this since it's not a server issue and there are no active maintainers for the JAVA sdk. Best of luck figuring it out!