Closed by doodle-tnw 5 years ago
Details matter a lot here, and we don't have server logs or a traffic capture, so I have to guess. The two most common scenarios are that the node that went down hosted the queue master, or that it was the node this client was connected to. In the latter case your app must be ready to recover, as demonstrated in the examples.
If a queue master fails and you were consuming from a different node, modern RabbitMQ versions will re-register the consumer after electing a new master. This may or may not happen, depending on the queue settings documented in the mirroring guide. Consumer cancellation notifications are also relevant here.
We have a 5-node RabbitMQ cluster and a number of Go apps using it for pub/sub.
The pub/sub code closely follows the pub/sub example, and if there is an error or consuming stops, it tries to consume on another session. If I drop the RabbitMQ broker in the cluster that serves the queue, it just sits there and hangs at `channel.Consume` (previously it was hanging at `channel.QueueDeclare` and `channel.QueueBind`, but they have now been moved to run on the first connection only).
In the logs I see the "before consume" message but neither the "subscribed" message nor the error message. It never seems to get past that call.
This was happening earlier on calls to `channel.QueueDeclare` and `channel.QueueBind`, but as I said, I moved them to test.
The only thing I can think of is that, because the queue was on the broker that was taken down, the hang has something to do with the broker not being back up yet when `channel.Consume` is called.
Any help greatly appreciated.