Closed bridger closed 4 years ago
Any updates on this one?
If we send a ping, I think we should ensure we get a pong back specifically. Right now the logic only checks to see if something has been read since the last ping.
Approach A: Always send a ping, only read "pongs" as resetting the timeout timer
This approach would make sure the channel is still sending and receiving sane data I think. We'd make sure that the other side is still responding to our messages, not just sending data. I don't know of any specific cases in production where a channel was still receiving data but was somehow broken for writes or something.
Would it make sense to reschedule the next ping (cancelling the currently scheduled one) whenever we update lastReadTime?
Approach B: Only send pings when no other data has been received. Any data received resets the timeout timer.
This is to make sure we don't send any unnecessary data. If anything has been received from the other side then we don't need to send a ping at all.
A ping/pong is quite small, so this might not be saving much bandwidth. Maybe the CPU cost of unscheduling and rescheduling would outweigh the bandwidth costs.
It might also make sense to have this take writes into consideration. Maybe we call this lastActivityTime or something and update it in the send method?
Approach C: Just make sure any data is read or written
I think we shouldn't count "we sent data" as resetting the timeout. We don't know if it is making it to the other side. This approach would fix issues with load balancers shutting down connections prematurely, but I don't think it would help with zombie connections.
I'm not sure whether to update this PR towards approach A or approach B.
Those are good points. What do you think about this logic:
This makes things pretty simple which I think is a plus.
That logic is very straightforward. I've updated the PR with that suggestion. Thank you for the review!
These changes are now available in 2.1.0
Adds a
pingInterval
property on WebSocket which will trigger automatic pings to the peer to ensure that the connection is still active (#68).