srcbookdev / srcbook

TypeScript-centric app development platform: notebook and AI app builder
https://srcbook.com
Apache License 2.0
2.92k stars, 171 forks

Re-establish websocket connection on disconnect #280

Open benjreinhart opened 2 months ago

benjreinhart commented 2 months ago

The client-side WebSocket code has some minimal retry logic. However, I don't believe it properly re-establishes the connection when the server closes it or when a network issue severs it.

We should implement behavior such that a connection is re-established when:

  1. The server explicitly closes the connection.
  2. The connection appears alive but the other end has disappeared (usually due to network issues).

The first should be straightforward. The second may require a heartbeat pattern: the client sends a message every N seconds (30?) and, if it doesn't receive a reply, closes and reopens the connection. As a potential reference, I know the Phoenix JS client implements the heartbeat pattern.
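
For illustration, here is a rough sketch of what the heartbeat could look like on the client. It is not Srcbook's actual client code: the `Heartbeat` class, the `SocketLike` interface, the `{ type: "ping" }` message shape, and the 5-second reply timeout are all assumptions made up for this example. The 30-second interval is the tentative value from above.

```typescript
// Minimal socket shape (hypothetical) so the sketch does not depend on a specific WebSocket type.
interface SocketLike {
  send(data: string): void;
  close(): void;
}

class Heartbeat {
  private socket: SocketLike;
  private onDead: () => void;
  private intervalMs: number;
  private timeoutMs: number;
  private intervalTimer: ReturnType<typeof setInterval> | null = null;
  private replyTimer: ReturnType<typeof setTimeout> | null = null;

  constructor(socket: SocketLike, onDead: () => void, intervalMs = 30_000, timeoutMs = 5_000) {
    this.socket = socket;
    this.onDead = onDead; // called when the peer is presumed gone, e.g. to trigger a reconnect
    this.intervalMs = intervalMs;
    this.timeoutMs = timeoutMs;
  }

  // Begin sending a ping every intervalMs.
  start(): void {
    this.stop();
    this.intervalTimer = setInterval(() => this.ping(), this.intervalMs);
  }

  // Send one ping and arm the reply deadline. Does nothing if a ping is already outstanding.
  ping(): void {
    if (this.replyTimer !== null) return;
    this.socket.send(JSON.stringify({ type: "ping" })); // assumed message shape
    this.replyTimer = setTimeout(() => {
      // No reply in time: clean up, close the socket, and let the caller reopen it.
      this.stop();
      this.socket.close();
      this.onDead();
    }, this.timeoutMs);
  }

  // Call this when the server's reply arrives; it cancels the reply deadline.
  pong(): void {
    if (this.replyTimer !== null) {
      clearTimeout(this.replyTimer);
      this.replyTimer = null;
    }
  }

  // Cancel all timers. Call on close and before reconnecting.
  stop(): void {
    if (this.intervalTimer !== null) {
      clearInterval(this.intervalTimer);
      this.intervalTimer = null;
    }
    this.pong();
  }
}
```

The idea would be to call `start()` when the socket opens, `pong()` when a reply message arrives, and `stop()` when the socket closes. Since browsers don't expose protocol-level ping frames to JavaScript, the ping here is an ordinary application-level message that the server would need to answer.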

We should be able to test this by killing and restarting the API server with an open browser tab (something that may happen regularly in dev once https://github.com/srcbookdev/srcbook/pull/277 is in). Chrome DevTools probably has something that can help with this as well.

BeRecursive22 commented 2 months ago

@benjreinhart Thank you for your insights. I agree with your suggestions and would like to propose the following implementation strategy:

  1. Exponential Backoff for Reconnection:

    • Implement an exponential backoff strategy for reconnection attempts.
    • Add a configurable maxRetry limit to prevent indefinite retries.
    • Include a user-facing "Reconnect" button that becomes active once we've exhausted the automatic retry attempts.
  2. Heartbeat Mechanism:

    • Implement a "ping-pong" heartbeat system:
      • Client sends a 'ping' every X seconds (configurable).
      • Server should respond with a 'pong' within a specified timeout (accounting for typical latency).
    • If no 'pong' is received within the timeout, initiate the reconnection process, making sure to clean up everything from the old connection first.
  3. Action/Data Buffer:

    • Implement a buffer to store actions/data that couldn't be sent due to connection issues.
    • Upon successful reconnection, attempt to resync this buffered data with the server.
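
To make points 1 and 3 concrete, here is a rough sketch of the backoff schedule and the buffer. All names (`backoffDelay`, `nextRetryDelay`, `OutboundBuffer`) and all default values (500 ms base, 30 s cap, 5 retries, 100 buffered messages) are placeholders I made up for illustration, not existing Srcbook code or settings.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Returns the next delay, or null once automatic retries are exhausted
// (the point at which a manual "Reconnect" button could become active).
function nextRetryDelay(attempt: number, maxRetries = 5): number | null {
  return attempt < maxRetries ? backoffDelay(attempt) : null;
}

// Holds messages that could not be sent while the connection was down.
class OutboundBuffer {
  private queue: string[] = [];
  private maxSize: number;

  constructor(maxSize = 100) {
    this.maxSize = maxSize;
  }

  // Send immediately when connected; otherwise queue, dropping the oldest entry when full.
  sendOrQueue(message: string, isOpen: boolean, send: (m: string) => void): void {
    if (isOpen) {
      send(message);
      return;
    }
    if (this.queue.length >= this.maxSize) this.queue.shift();
    this.queue.push(message);
  }

  // Replay queued messages in the order they were queued; returns how many were sent.
  flush(send: (m: string) => void): number {
    const pending = this.queue;
    this.queue = [];
    for (const m of pending) send(m);
    return pending.length;
  }

  get size(): number {
    return this.queue.length;
  }
}
```

With these defaults the automatic retries would wait 500 ms, 1 s, 2 s, 4 s, and 8 s before giving up. Adding random jitter to each delay is a common refinement so that many clients don't reconnect at the same instant after a server restart. Whether replaying buffered messages is safe depends on the messages being valid against the server's state after the reconnect, which this sketch does not address.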

I'd be happy to take on implementing these improvements. Let me know if you'd like me to start with any specific part.