Closed banks closed 7 years ago
As we already discussed in gitter chat error handling can really be better.
I am preparing v1.3.0 release at moment and for now commited small change that must help to prevent permanent disconnect in case of internal server error.
Now in v1.3.2 errors can contain optional advice
(fix
or retry
). fix
means that developer using Centrifugo did something wrong, i.e. using unknown namespace in channels or channels too long - so developer should fix this in order things to work properly. retry
signals temporary error that could theoretically be successful if client try executing command again. These advices do not handled by javascript client automatically but included in error context.
disconnect
command with reconnect
boolean flag used to control client's connection - disconnect forever or disconnect with reconnect. ErrInternalServerError
results in sending disconnect
command with reconnect.
I think this is not actual anymore - many improvements here have been done, some will be released with Centrifugo v1.6.0, let's close this.
Since my comment a few days ago, centrifugo now disconnects with
reconnect: false
for almost any error that occurs while processing a message. The vast majority of these are not fatal errors and should not prevent the client from retrying.For example, now I my client connects and the call to
subscribe
fails due to a temporary redis networking glitch, the client is permanently disconnected.When trying to figure out how different errors should behave I realise that there are 3 different error scopes:
In case of 3, the retry option actually applies to API call not whole connection in most cases. I put * next to connection OK in last case because it's arguable that you should disconnect them here. The benefit of disconnect is that a network partition where one centrifugo host is unable to toal to backend but others are will recover itself. But in general it means disconnect/reconnect overhead for no reason if the connection failure was just temporary. Could have hybrid that keeps track of how many successive API calls failed on a connection and disconnect after 2 or 3 consecutive ones? That's out of scope anyway for now current behaviour is OK as long as we have defined ways to handle each type and tell the difference.
Finally, one option would be to return distinct error codes for each type of error and let client (ours or end-user) choose to retry or not.
That might make some sense, however the downside is that there will be long-lived clients out there and you risk having them retry indefinitely when you introduce a new fatal error into server later (or vice-versa) because they don't know.
So I like the current model of returning advice on whether client should retry requests and/or reconnect in each case. As that is future proof.