nats-io / nats-replicator

Bridge to replicate NATS Subjects or Channels to NATS Subject or Channels
Apache License 2.0

Stan connection may not get closed when nats goes down #2

Open sasbury opened 5 years ago

sasbury commented 5 years ago

Aug 21 00:26:27 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:27.634026 [INF] error restarting streaming connection, will retry in 5000 milliseconds%!(EXTRA string=stan: clientID already registered)
Aug 21 00:26:32 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:32.634170 [INF] trying to reconnect to nats streaming
Aug 21 00:26:32 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:32.634198 [INF] connecting to NATS streaming with configuration stan-eu-p1, cluster id is rg-cluster-eu-central-1
Aug 21 00:26:32 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:32.635578 [INF] error restarting streaming connection, will retry in 5000 milliseconds%!(EXTRA string=stan: clientID already registered)
Aug 21 00:26:37 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:37.635728 [INF] trying to reconnect to nats streaming
Aug 21 00:26:37 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:37.635755 [INF] connecting to NATS streaming with configuration stan-eu-p1, cluster id is rg-cluster-eu-central-1
Aug 21 00:26:37 ip-10-188-2-114 nats-replicator[18223]: 2019/08/21 00:26:37.636792 [INF] error restarting streaming connection, will retry in 5000 milliseconds%!(EXTRA string=stan: clientID already registered)

Response from Ivan:

It means that the server still has this client ID registered, and the old client still has its inbox running and can reply to the server's check. It could be that the app got disconnected from NATS and incorrectly tried to recreate the STAN connection without closing the old one?
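
The check Ivan describes can be pictured with a small model. This is only a sketch under stated assumptions, not the streaming server's code: the `server` and `client` types, the `register` method, and the `alive` callback are all hypothetical names. The only thing taken from the thread is the error text. It shows why a new connection is rejected for as long as the old one with the same client ID can still answer.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyRegistered reuses the message seen in the log above.
var errAlreadyRegistered = errors.New("stan: clientID already registered")

// client is a hypothetical stand-in for a registered streaming client.
// alive reports whether its inbox still answers the server's check.
type client struct {
	alive func() bool
}

// server is a hypothetical, simplified model of the duplicate-ID check.
type server struct {
	clients map[string]*client
}

// register admits a client unless the ID is held by a client that still responds.
func (s *server) register(id string, c *client) error {
	if old, ok := s.clients[id]; ok && old.alive() {
		return errAlreadyRegistered
	}
	s.clients[id] = c // no previous holder, or the previous one stopped responding
	return nil
}

func main() {
	s := &server{clients: map[string]*client{}}
	open := true // whether the first connection is still open

	first := &client{alive: func() bool { return open }}
	second := &client{alive: func() bool { return true }}

	fmt.Println("first connect:", s.register("replicator", first))
	fmt.Println("reconnect while old is open:", s.register("replicator", second))

	open = false // the old connection has now been closed
	fmt.Println("reconnect after close:", s.register("replicator", second))
}
```

In this model the second attempt keeps failing until the first connection stops responding, which matches the repeating error in the log if the old STAN connection was never closed.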

Todo:

Look at the logic for when the NATS connection needs a restart; any streaming connections that use it should be shut down first.

sasbury commented 5 years ago

It could be that we are too aggressive in closing the STAN connection, or at least that we "forget" about it somehow, so we try to recreate it when we shouldn't.

sasbury commented 5 years ago

Set the connection-lost handler and only try to reconnect when you get a notification from that callback, not from the low-level NATS disconnect handler.
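
A minimal sketch of that approach, assuming a simplified bridge. The `bridge`, `streamConn`, and `fakeConn` types, the `connect` function, and `errLost` are hypothetical and are not the replicator's actual code. In the stan.go client the callback would be registered with the `stan.SetConnectionLostHandler` option; here the two handlers are plain methods so the example runs without a server.

```go
package main

import (
	"errors"
	"fmt"
)

// errLost is a hypothetical reason passed to the connection-lost callback.
var errLost = errors.New("streaming connection lost")

// streamConn is a hypothetical stand-in for a streaming connection.
type streamConn interface {
	Close() error
}

// fakeConn records whether Close was called, for demonstration only.
type fakeConn struct{ closed bool }

func (f *fakeConn) Close() error { f.closed = true; return nil }

// bridge is a hypothetical holder for the current streaming connection.
type bridge struct {
	conn    streamConn
	connect func() (streamConn, error) // hypothetical dial function
}

// onNATSDisconnect is the low-level handler. It deliberately does nothing to
// the streaming connection, since the NATS client may reconnect on its own.
func (b *bridge) onNATSDisconnect() {}

// onConnectionLost closes the old streaming connection before dialing a new
// one, so the same client ID is never held by two live connections.
func (b *bridge) onConnectionLost(reason error) error {
	if b.conn != nil {
		_ = b.conn.Close() // best effort; the connection was already reported lost
		b.conn = nil
	}
	c, err := b.connect()
	if err != nil {
		return fmt.Errorf("reconnect after %v: %w", reason, err)
	}
	b.conn = c
	return nil
}

func main() {
	dials := 0
	first := &fakeConn{}
	b := &bridge{conn: first, connect: func() (streamConn, error) {
		dials++
		return &fakeConn{}, nil
	}}

	b.onNATSDisconnect()
	fmt.Println("dials after NATS disconnect:", dials)

	err := b.onConnectionLost(errLost)
	fmt.Println("dials after connection lost:", dials, "old closed:", first.closed, "err:", err)
}
```

A real bridge would also need to guard `conn` with a mutex, since these callbacks can fire from different goroutines.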

typecampo commented 4 years ago

Will this get fixed?