pubkey / rxdb

A fast, local first, reactive Database for JavaScript Applications https://rxdb.info/
https://rxdb.info/
Apache License 2.0
21.61k stars 1.06k forks source link

Websocket hangs after connection is closed by the server #6515

Open brunovinicius opened 3 weeks ago

brunovinicius commented 3 weeks ago

GraphQL Websocket Plugin hangs and does not retry connection when server closes connection right after a successful 101 Upgrade. On my scenario this manifests itself as the client try to resume conection after refreshing auth token. WS connection is restablished way before the token is refreshed so it tries to connect using the old, expired token. After this, it gets stuck in this limbo as the server closes connection as soon as it validates the token.

Sequence of events looks similar to this:

=> GET wss://... <= 101 Switching Protocols => WS Message: connection_init (with the expired token) <= Server Closes connection

It seems like the client its not expecting this behavior and stays hanging in wait for the server to respond "connection_ack" which will never come. This scenario might arise due to a unfortunately timed network issue, but any directions on how my server could deal with this appropreately would be very much appreciated.

RxDB Version: 15.17

Thanks in advance and sorry for not providing a PR with a test case right away.

stale[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed soon. Please update it or it may be closed to keep our repository organized. The best way is to add some more information or make a pull request with a test case. Also you might get help in fixing it at the RxDB Community Chat If you know you will continue working on this, just write any message to the issue (like "ping") to remove the stale tag.

brunovinicius commented 2 weeks ago

Quick update: we made a patch to RxDB to listen to the close event on the websocket connection level and publish it to the pullStrem$ so we can react to it to recover the connection. I am not sure how we should deal with it for RxDB so we will be openning a PR where we can dicuss the fix and how should we go about it.

stale[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed soon. Please update it or it may be closed to keep our repository organized. The best way is to add some more information or make a pull request with a test case. Also you might get help in fixing it at the RxDB Community Chat If you know you will continue working on this, just write any message to the issue (like "ping") to remove the stale tag.

brunovinicius commented 1 week ago

up

pubkey commented 5 days ago

Do you think it is possible to build a unit test to reproduce this?

brunovinicius commented 5 days ago

Hey @pubkey hope you are feeling better!

We went live with our project a couple weeks ago and we are sorting out issues so not a lot of time available for creating this test, but I expect to put some focus on it soon.

We developed a patch using patch-package that helped a bit, but didnt fully resolve the issue. We can make a PR of this change really quickly if you want to take a look at it. Let me know.

cayres commented 4 days ago

Hi @pubkey, I'm working on the same project as @brunovinicius. Our patch is really simple and based on the ws client events we found here: https://the-guild.dev/graphql/ws/docs/modules/client#event.

diff --git a/node_modules/rxdb/dist/esm/plugins/replication-graphql/index.js b/node_modules/rxdb/dist/esm/plugins/replication-graphql/index.js
index 56d6c7b..65e8e0e 100644
--- a/node_modules/rxdb/dist/esm/plugins/replication-graphql/index.js
+++ b/node_modules/rxdb/dist/esm/plugins/replication-graphql/index.js
@@ -123,6 +123,17 @@ export function replicateGraphQL({
       wsClient.on('connected', () => {
         pullStream$.next('RESYNC');
       });
+
+      wsClient.on('closed', (error) => {
+        pullStream$.error(error);
+        graphqlReplicationState.cancel()
+      });
+
       var query = ensureNotFalsy(pull.streamQueryBuilder)(mutateableClientState.headers);
       wsClient.subscribe(query, {
         next: async streamResponse => {
@@ -138,7 +149,7 @@ export function replicateGraphQL({
         },
         complete: () => {
           pullStream$.complete();
-        }
+        },
       });
     }
     return startBefore();

And now we are subscribing to the replication errors and any close event sent we handle to renew our authorization and restart replications.

We also had to stop the graphql replication because only the web socket is closing at this point, but pull and push are still working. So each time we restart the replication, we increase the number of requests per each operation.