You'd have to start recording each member's assigned "UId" in the queue record and base the recovery of the member on whether the current UId for the given cluster name matches or not.
Even so, you could reproduce a similar issue by partitioning a node, deleting and re-creating the queue on the majority side, then re-joining the partitioned node.
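For illustration, here is a minimal sketch of the recovery check suggested above, with the queue record modelled as a map that stores the UId assigned to each member node. The module, field, and function names are hypothetical, not the actual RabbitMQ code:

```erlang
-module(qq_recovery_sketch).
-export([maybe_recover_member/2]).

%% QueueRecord is modelled as a map holding the UId assigned to each
%% member node when the queue was (re)created, e.g.
%%   #{member_uids => #{Node => UId}}.
-spec maybe_recover_member(map(), binary()) -> recovered | data_deleted.
maybe_recover_member(QueueRecord, LocalUId) ->
    ExpectedUId = maps:get(node(), maps:get(member_uids, QueueRecord)),
    case ExpectedUId =:= LocalUId of
        true ->
            %% The on-disk member belongs to the current incarnation of
            %% the queue, so it is safe to restart it.
            restart_member(QueueRecord),
            recovered;
        false ->
            %% The queue was deleted and re-created while this node was
            %% down; the local member data is stale and must not be
            %% started, or it could later rejoin and replay an old log.
            delete_member_data(LocalUId),
            data_deleted
    end.

%% Placeholders for whatever the real system would do at these points.
restart_member(_QueueRecord) -> ok.
delete_member_data(_UId) -> ok.
```

On a UId mismatch the local data would be treated as belonging to a previous incarnation of the queue and removed rather than recovered.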
I see, in the case you are proposing it is more RabbitMQ's responsibility to recover or not recover the member if the UId changed. I think it would be a bit more resilient if this were included in ra, i.e. pre_vote would check membership.
Thinking about it more, though, probably both sides are needed, and more: one part for RabbitMQ to clean up / not start removed members on startup, an implementation in ra to not allow the partitioned node to become a leader again, and another where RabbitMQ gets notified when a member with an out-of-date uid shows up, so it can do the cleanup.
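A rough sketch of what the ra-side part of this could look like, purely to illustrate the proposal; this is not ra's actual pre-vote handling, and all names below are invented:

```erlang
-module(pre_vote_check_sketch).
-export([handle_pre_vote/2]).

%% PreVote and State are modelled as plain maps; in a real implementation
%% these would be the library's internal records.
handle_pre_vote(#{candidate := Candidate}, #{cluster := Members}) ->
    case lists:member(Candidate, Members) of
        true ->
            %% Known member: continue with the usual term and
            %% log-freshness checks.
            continue;
        false ->
            %% Not part of the current membership (for example a member of
            %% a deleted and re-created queue): never grant the vote, and
            %% hand the application a notification so it can clean up.
            {reject, {stale_member, Candidate}}
    end.
```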
The uids aren't exchanged in the Raft commands, so changing ra would not be easy to do. It is better to put the responsibility for ensuring the right members are running on the system running them, if at all possible.
Describe the bug
Hi,
The issue below involves deleting and recreating a queue while a node is down, which means that most users will not be affected by this.
We've identified an issue with Quorum Queues where an out-of-date replica can come back as the leader again, resend past log entries, and cause the now-follower to reapply local effects, so a new consumer receives messages that were already processed.
This leads to duplicate message delivery even though the messages were acknowledged properly and the queue processed the acks. Essentially the log is replayed in its entirety, so messages processed days ago can reappear.
The effect of it is similar to https://github.com/rabbitmq/ra/issues/387.
In some scenarios this issue also leaves the queue actually broken, but that is expected given the bad internal state.
We know that the proper solution is to not delete the queue, but ra should probably also have some built-in protection to not allow out-of-date members to rejoin the cluster, or at least not to become leaders.
I think a potential solution would be to include a cluster id in the pre_vote and request_vote_rpc messages. According to my review, there is no shared cluster ID for ra clusters today; there is a uid, but that is per server, not per cluster.
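To make the proposal concrete, here is a simplified sketch of carrying a cluster id in the vote messages and rejecting mismatches. The record definitions below are illustrative stand-ins, not ra's real pre_vote_rpc / request_vote_rpc records:

```erlang
-module(cluster_id_vote_sketch).
-export([check_vote/2]).

%% Simplified stand-ins for the vote messages; the real messages carry
%% more fields and, today, no shared cluster id.
-record(pre_vote_rpc, {term, candidate, cluster_id}).
-record(request_vote_rpc, {term, candidate, cluster_id}).

%% Accept the request only when it carries the same cluster id as the
%% local member.
check_vote(#pre_vote_rpc{cluster_id = Id}, LocalClusterId) ->
    compare(Id, LocalClusterId);
check_vote(#request_vote_rpc{cluster_id = Id}, LocalClusterId) ->
    compare(Id, LocalClusterId).

compare(Id, Id) -> ok;
compare(_, _) -> {reject, cluster_id_mismatch}.
```

A member left over from a deleted and re-created queue would still hold the old id, so its election attempts would be rejected before it could win leadership.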
Reproduction steps

… append_entries for these log items …

Expected behavior
One or all of the following: :-)
Additional context
I can share some traces or debug output, but I'm not sure they make sense without context.
Attached is the "restart sequence"; nothing special.
restart.sh.txt