rabbitmq / rabbitmq-server

Open source RabbitMQ: core server and tier 1 (built-in) plugins
https://www.rabbitmq.com/
Other

rabbit_peer_discovery: Retry RPC calls #12801

Closed dumbbell closed 3 days ago

dumbbell commented 4 days ago

Why

In CI, we observe timeouts in the Erlang distribution connections between the temporary hidden node and the nodes it queries. When a query times out, peer discovery works from incomplete information.

How

We introduce some query retries to reduce the risk of an incomplete query.
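As a rough illustration of the retry idea, here is a minimal sketch. It is not the code from this branch: the module name, the helper `call_with_retries/5`, the use of `erpc:call/5`, the 5-second timeout and the 1-second pause are all assumptions made for the example.

```erlang
-module(rpc_retry_sketch).
-export([call_with_retries/5]).

%% Hypothetical helper (not part of RabbitMQ): run Module:Function(Args)
%% on Node and retry up to Retries more times if the RPC itself fails.
call_with_retries(Node, Module, Function, Args, Retries)
  when is_integer(Retries), Retries >= 0 ->
    try
        %% Assumed timeout of 5000 ms per attempt.
        {ok, erpc:call(Node, Module, Function, Args, 5000)}
    catch
        error:{erpc, _Reason} when Retries > 0 ->
            %% RPC-level failure such as timeout or noconnection:
            %% pause briefly (assumed 1000 ms), then try again.
            timer:sleep(1000),
            call_with_retries(Node, Module, Function, Args, Retries - 1);
        error:{erpc, Reason} ->
            %% No retries left: report the last failure to the caller.
            {error, Reason}
    end.
```

Only `{erpc, Reason}` errors are retried in this sketch. An exception raised by the remote function itself is left to propagate, since retrying would not help there.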

While here, we move the sorting of queried nodes out of the last clause of query_node_props2/3, which is executed in the temporary hidden node, and into the function that sets up the temporary hidden node and requests these queries. This way, the debug messages emitted by that sorting are logged by RabbitMQ out of the box.

The branch contains two additional related commits:

  1. Remove the use of the group leader proxy. It became unnecessary once we started using the standard_io connection option to start the temporary hidden node.
  2. Fix non-tail-recursive query_node_props2/3.
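For readers less familiar with the distinction in the second commit, the sketch below contrasts a body-recursive and a tail-recursive traversal of a node list. It is a generic illustration only: the module and function names are hypothetical, and it does not reproduce the actual query_node_props2/3 code or its fix.

```erlang
-module(tail_recursion_sketch).
-export([collect_body_recursive/2, collect_tail_recursive/2]).

%% Body-recursive: the list cell is built after the recursive call
%% returns, so every element keeps a stack frame alive until the end.
collect_body_recursive(_Fun, []) ->
    [];
collect_body_recursive(Fun, [Node | Rest]) ->
    [Fun(Node) | collect_body_recursive(Fun, Rest)].

%% Tail-recursive: results are carried in an accumulator, so the
%% recursive call is the last operation and the stack does not grow.
collect_tail_recursive(Fun, Nodes) ->
    collect_tail_recursive(Fun, Nodes, []).

collect_tail_recursive(_Fun, [], Acc) ->
    %% The accumulator was built in reverse order; restore it.
    lists:reverse(Acc);
collect_tail_recursive(Fun, [Node | Rest], Acc) ->
    collect_tail_recursive(Fun, Rest, [Fun(Node) | Acc]).
```

Both functions return the same list for the same input. The difference is only in how much stack they use while running.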