rabbitmq / rabbitmq-server

Open source RabbitMQ: core server and tier 1 (built-in) plugins
https://www.rabbitmq.com/
Other
11.84k stars 3.9k forks source link

Add some validation to peer discovery #11586

Open lukebakken opened 4 days ago

lukebakken commented 4 days ago

Is your feature request related to a problem? Please describe.

In this issue, I thought I had uncovered a bug in inter-node TLS and peer discovery, when in fact I had a simple configuration error in my environment:

https://github.com/lukebakken/docker-rabbitmq-cluster/blob/69791a36b664ed448b970ca0f2ead1742e4822b4/rmq1/rabbitmq.conf#L12-L14

Note that there is a duplicated node, and that node rmq0.local is not in that list. This error was only discovered after @dumbbell carefully looked at debug logs.

Describe the solution you'd like

RabbitMQ should log warnings, or perhaps, even fail to start when an invalid list of nodes is returned by a peer discovery backend:

https://github.com/rabbitmq/rabbitmq-server/blob/447fac9feb756275f94fe372d8f91885437d4e96/deps/rabbit/src/rabbit_peer_discovery.erl#L293

In the case of the classic peer discovery backend, @dumbbell suggested that the following are invalid:

@michaelklishin should be able to verify if the above should apply to all backends (k8s, consul, AWS, etc).

michaelklishin commented 4 days ago

But what is an "invalid list of nodes"? Peer discovery by definition seeks to return a list of nodes that are not yet known to the current node.

lukebakken commented 4 days ago

Right, but, as I saw in #11534, if the list of nodes returned from a backend does not include the current node, it causes issues. Likewise, a list that contains duplicates is suspect as well.