openfaas / nats-queue-worker

Queue-worker for OpenFaaS with NATS Streaming
https://docs.openfaas.com/reference/async/
MIT License

Update publisher to re-connect #33

Closed: alexellis closed this 5 years ago

alexellis commented 6 years ago

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>

Description

The publisher will now re-connect when NATS Streaming becomes unavailable. Tested within the gateway code on Docker Swarm.

Thanks to @vosmith for initiating this work.

Motivation and Context

#17

How Has This Been Tested?

In the gateway, by scaling NATS Streaming to zero replicas and back again.

Types of changes

@vosmith @kozlovic @stefanprodan how is this looking?

I want to move the work started by @vosmith forward gradually. I realize that an in-memory restart is not ideal, so we're also looking to provide a configuration with some form of "ft" or PV.

alexellis commented 6 years ago

Thanks for the feedback. If we moved to the 0.12.0 version, how would you feel about spending a half hour or so to write the patch to fix up the re-connect logic?

kozlovic commented 5 years ago

@alexellis I have not had time to take a look, but I see that you are now using the 0.4.0 client lib, which has the connection lost handler. If you upgrade to the 0.11.2 server, you could make use of that. I can try to submit a PR when I get the chance. However, I am not sure how I would test it, so I may ask you to try it out once the PR is up.

daikeren commented 5 years ago

@alexellis Is there anything I could do to help fix this issue? We're currently facing the same issue, and the only way to resolve it is to restart the queue-worker and the faas gateway. Thanks

alexellis commented 5 years ago

Derek close: implemented via other PRs

vosmith commented 5 years ago

:tada::tada::tada:

alexellis commented 5 years ago

Thanks @vosmith. This has been in both the queue worker and the gateway for several releases now, so I wanted to close the old PR.