codership / galera

Synchronous multi-master replication library
GNU General Public License v2.0

Galera arbitrator 4 strange random IP substitutions #637

Closed: jekka-ua closed this issue 1 year ago

jekka-ua commented 1 year ago

Our Galera cluster runs in Docker on different hosts, so traffic to 10.100.XX.YY (a dockerized cluster node) is routed via 10.100.100.XX (the host).

In garbd.log the usual situation is:

2023-02-21 12:54:13.197  INFO: gcomm: connecting to group 'bayern', peer '10.100.22.101:,10.100.35.101:,10.100.26.152:'
2023-02-21 12:54:13.200  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to b04c17f7-878b tcp://10.100.26.152:4567
2023-02-21 12:54:13.201  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.100.100.35:4567
2023-02-21 12:54:13.224  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to 081ca62c-8db3 tcp://10.100.35.101:4567
2023-02-21 12:54:13.239  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to 15806b84-bedb tcp://10.100.22.101:4567
2023-02-21 12:54:13.737  INFO: EVS version upgrade 0 -> 1
2023-02-21 12:54:13.737  INFO: declaring 081ca62c-8db3 at tcp://10.100.100.35:4567 stable
2023-02-21 12:54:13.737  INFO: declaring 15806b84-bedb at tcp://10.100.22.101:4567 stable
2023-02-21 12:54:13.737  INFO: declaring b04c17f7-878b at tcp://10.100.100.26:4567 stable

So the IPs are correct when the connection is established, but some of them are randomly wrong in the 'declaring' lines.

Port 4567 on the 10.100.100.XX hosts is always closed, yet which "gcomm" peers show a different IP between 'connection established' and 'declaring' varies randomly from time to time.

I'm not sure whether this behavior is normal.
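
To spot this kind of mismatch without reading the log by eye, a small script can pair each node ID's 'connection established' address with its 'declaring' address and report the ones that differ. The following is only a minimal sketch: the function name `find_address_mismatches` and the regular expressions are hypothetical, and they assume the two line shapes seen in the excerpt above rather than any documented garbd log format.

```python
import re

# Assumed line shapes, taken only from the log excerpt in this issue.
ESTABLISHED = re.compile(r"connection established to (\S+) tcp://([\d.]+):(\d+)")
DECLARED = re.compile(r"declaring (\S+) at tcp://([\d.]+):(\d+) stable")


def find_address_mismatches(log_text):
    """Return {node_id: (established_ip, declared_ip)} for nodes whose IPs differ."""
    established = {}
    declared = {}
    for line in log_text.splitlines():
        match = ESTABLISHED.search(line)
        if match:
            established[match.group(1)] = match.group(2)
            continue
        match = DECLARED.search(line)
        if match:
            declared[match.group(1)] = match.group(2)
    # Keep only nodes seen in both kinds of line with different addresses.
    return {
        node: (established[node], ip)
        for node, ip in declared.items()
        if node in established and established[node] != ip
    }


# Sample input: lines copied from the garbd.log excerpt above.
SAMPLE = """\
2023-02-21 12:54:13.200  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to b04c17f7-878b tcp://10.100.26.152:4567
2023-02-21 12:54:13.224  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to 081ca62c-8db3 tcp://10.100.35.101:4567
2023-02-21 12:54:13.239  INFO: (d75a6929-b55e, 'tcp://0.0.0.0:4567') connection established to 15806b84-bedb tcp://10.100.22.101:4567
2023-02-21 12:54:13.737  INFO: declaring 081ca62c-8db3 at tcp://10.100.100.35:4567 stable
2023-02-21 12:54:13.737  INFO: declaring 15806b84-bedb at tcp://10.100.22.101:4567 stable
2023-02-21 12:54:13.737  INFO: declaring b04c17f7-878b at tcp://10.100.100.26:4567 stable
"""

mismatches = find_address_mismatches(SAMPLE)
for node, (est_ip, decl_ip) in sorted(mismatches.items()):
    print(f"{node}: established {est_ip}, declared {decl_ip}")
```

On the excerpt above this flags 081ca62c-8db3 and b04c17f7-878b, while 15806b84-bedb keeps the same address in both lines.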

jekka-ua commented 1 year ago

iptables did that =(