Closed pvanderlinden closed 6 years ago
I'm looking into fixing this, as it blocks us from using NATS. My main question is: what is the logic supposed to do? I currently see several strange things, for instance max_reconnect_attempts will not be used for the whole lifetime of the client.
I'm mainly asking because I assume the logic is supposed to be similar across the different clients.
Thanks for your patience on this issue. I revised the reconnection logic and found a few places where it was not working the same way as in the Go client. I now have a branch where the behavior should be improved, if you want to take a look: https://github.com/nats-io/asyncio-nats/pull/67
More on how the reconnection logic ought to work, trying to cover some of the points above:
Servers to which the client attempts to connect more than max_reconnect_attempts times should be dropped from the server pool. Whenever the client connects to a server successfully, the attempt count for that server should be reset to 0.
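The per-server counting described above can be sketched roughly as follows. This is a minimal illustration, not the client's actual internals: the Server and ServerPool classes and their method names are hypothetical, and only max_reconnect_attempts comes from the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    url: str
    reconnects: int = 0  # failed attempts since the last successful connect


class ServerPool:
    """Hypothetical pool; not the client's real internal structure."""

    def __init__(self, urls: List[str], max_reconnect_attempts: int) -> None:
        self.servers = [Server(url) for url in urls]
        self.max_reconnect_attempts = max_reconnect_attempts

    def next_server(self) -> Optional[Server]:
        """Return the next candidate, dropping servers that exceeded the limit."""
        while self.servers:
            srv = self.servers.pop(0)
            if srv.reconnects > self.max_reconnect_attempts:
                continue  # more than the allowed attempts: drop from the pool
            self.servers.append(srv)  # rotate to the back for round-robin
            return srv
        return None  # pool exhausted

    def mark_failure(self, srv: Server) -> None:
        srv.reconnects += 1

    def mark_success(self, srv: Server) -> None:
        srv.reconnects = 0  # a successful connect resets the counter
```

With this shape, a server that keeps failing eventually disappears from the rotation, while a server that recovers starts again with a full retry budget.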
During the initial connect, if the client is not able to connect to any server, an ErrNoServers exception is thrown and no further attempts are made (the reconnect logic does not run), so the implementation is slightly different from reconnect.
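To make the difference from reconnect concrete, here is a small sketch of the initial-connect behaviour: each server is tried once, and if none accepts the connection the call fails immediately. The ErrNoServers class below is a local stand-in that only borrows the name from the discussion, and initial_connect and try_connect are hypothetical helpers, not part of the client API.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class ErrNoServers(Exception):
    """Local stand-in for the client's exception of the same name."""


def initial_connect(servers: List[str], try_connect: Callable[[str], T]) -> T:
    """Try each server once; raise if none can be reached (no retry loop)."""
    for url in servers:
        try:
            return try_connect(url)
        except OSError:
            continue  # this server is unreachable, move on to the next one
    # Nothing worked on the first pass: fail right away instead of retrying.
    raise ErrNoServers("no servers available for connection")
```

A caller that wants to survive a server being down at startup would therefore need its own retry around the first connect, assuming the behaviour described above.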
Once the client has successfully connected to NATS at least once, then whenever it is disconnected the reconnection logic will kick in and retry a number of times. If it cannot reconnect after max_reconnect_attempts, it will enter the closed state and the closed_cb will be called. Once the closed_cb has been called, the client will not be usable anymore, so this can be considered the 'permanent failure', more or less.
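The reconnect-then-close flow described above could look roughly like this. It is a simplified sketch under several assumptions: a single server instead of a pool, a hypothetical SketchClient class, and a closed_cb that is a coroutine function. Only the option names max_reconnect_attempts, reconnect_time_wait and closed_cb are taken from the thread.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class SketchClient:
    """Hypothetical client showing only the reconnect state transitions."""

    def __init__(
        self,
        try_connect: Callable[[], Awaitable[None]],
        max_reconnect_attempts: int = 10,
        reconnect_time_wait: float = 2.0,
        closed_cb: Optional[Callable[[], Awaitable[None]]] = None,
    ) -> None:
        self.try_connect = try_connect
        self.max_reconnect_attempts = max_reconnect_attempts
        self.reconnect_time_wait = reconnect_time_wait
        self.closed_cb = closed_cb
        self.is_closed = False

    async def handle_disconnect(self) -> bool:
        """Retry up to the limit; on exhaustion, close and fire closed_cb."""
        for _ in range(self.max_reconnect_attempts):
            await asyncio.sleep(self.reconnect_time_wait)
            try:
                await self.try_connect()
            except OSError:
                continue  # still down, use up another attempt
            return True  # reconnected, client stays usable
        # Retry budget exhausted: this is the permanent failure state.
        self.is_closed = True
        if self.closed_cb is not None:
            await self.closed_cb()
        return False
```

An application that must keep running could use closed_cb as the signal to build a fresh client or to exit and let a supervisor restart it.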
Thanks again for pinging on this, I will merge the branch soon and make a release containing the fixes.
@pvanderlinden behavior in master should be improved now, I will make a release this week including the fix.
Thanks, this seems to work fine now (with the last commit on the master branch)!
@pvanderlinden Thanks for checking, have just released this in the v0.7.2 version of the client.
Originally posted here: https://github.com/nats-io/asyncio-nats-streaming/issues/7
I cannot get reconnects working. I tested it with a program that publishes and one that consumes: if I shut down the server for a few seconds, both stop working (the publisher times out on publishing, and the consumer simply does not receive any messages anymore).
Connection code:
Output:
From what I understand of the defaults, it should try to reconnect 10 times with a delay of 2 seconds, which means that if the server is down for less than 20 seconds it should at least reconnect and resume operation. Unfortunately it does not resume operation: the consumer just does not receive anything anymore, and the producer times out on a publish call, even when the server is back up again.
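The 20-second estimate above is just the retry count multiplied by the wait between attempts. A tiny sketch of that arithmetic, using the default values as stated in this post (they are not verified against the library here) and a hypothetical helper name:

```python
def reconnect_window(max_reconnect_attempts: int = 10,
                     reconnect_time_wait: float = 2.0) -> float:
    """Approximate seconds of server downtime covered by the retry budget.

    Ignores the time each connection attempt itself takes, so the real
    window would be somewhat longer.
    """
    return max_reconnect_attempts * reconnect_time_wait


print(reconnect_window())  # 10 attempts * 2 s each
```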