Closed LSChyi closed 5 years ago
You state:
"As this case is using the same priority, keepalived should follow the VRRP protocol and assign the MASTER role to the host with the higher IP address."
This is an incorrect interpretation of the VRRP RFCs. If a virtual router in backup state receives an advert with a priority equal to its own, it treats it as a valid advert and processes it exactly as if the priority in the advert were higher than its own. Only if the priority in the advert is less than the receiver's priority (and preempt mode is enabled) is the advert discarded.
See RFC5798 section 6.4.2 (backup state):
(420) - If an ADVERTISEMENT is received, then:
(425) + If the Priority in the ADVERTISEMENT is zero, then:
(430) * Set the Master_Down_Timer to Skew_Time
(440) + else // priority non-zero
(445) * If Preempt_Mode is False, or if the Priority in the ADVERTISEMENT is greater than or equal to the local Priority, then:
(450) @ Set Master_Adver_Interval to Adver Interval contained in the ADVERTISEMENT
(455) @ Recompute the Master_Down_Interval
(460) @ Reset the Master_Down_Timer to Master_Down_Interval
(465) * else // preempt was true or priority was less
(470) @ Discard the ADVERTISEMENT
(475) *endif // preempt test
(480) +endif // was priority zero?
(485) -endif // was advertisement recv?
(490) endwhile // Backup state
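To make the backup-state logic above easier to follow, here is a rough Python sketch of it. This is only an illustration of the RFC pseudocode, not keepalived's actual code; the class name `BackupRouter`, its fields and the returned strings are all made up for this example. Intervals are in centiseconds, as in RFC 5798.

```python
from dataclasses import dataclass


@dataclass
class BackupRouter:
    """Hypothetical model of a virtual router in Backup state (not keepalived code)."""

    priority: int                     # local Priority
    preempt_mode: bool = True         # Preempt_Mode
    master_adver_interval: int = 100  # Master_Adver_Interval, centiseconds
    master_down_timer: float = 0.0    # Master_Down_Timer, centiseconds

    def skew_time(self) -> float:
        # RFC 5798: Skew_Time = ((256 - Priority) * Master_Adver_Interval) / 256
        return ((256 - self.priority) * self.master_adver_interval) / 256

    def master_down_interval(self) -> float:
        # RFC 5798: Master_Down_Interval = (3 * Master_Adver_Interval) + Skew_Time
        return 3 * self.master_adver_interval + self.skew_time()

    def on_advertisement(self, adv_priority: int, adv_interval: int) -> str:
        """Apply steps (420)-(485); the returned label is illustrative only."""
        if adv_priority == 0:
            # (430) the master is resigning, so time out quickly
            self.master_down_timer = self.skew_time()
            return "master-resigning"
        if not self.preempt_mode or adv_priority >= self.priority:
            # (450)-(460) equal priority is handled exactly like higher priority
            self.master_adver_interval = adv_interval
            self.master_down_timer = self.master_down_interval()
            return "accepted"
        # (470) preempt mode is on and the advert has lower priority
        return "discarded"
```

With `BackupRouter(priority=100)`, an advert of priority 100 is accepted and simply resets the timer, so the backup stays a backup regardless of the sender's IP address.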
The only time IP address comparison is used to determine which virtual router should be master is when a master receives an advert with a priority equal to its own; in that case the one with the lower IP address drops back to backup. See RFC5798 section 6.4.3:
(725) -* If the Priority in the ADVERTISEMENT is greater than the local Priority,
(730) -* or
(735) -* If the Priority in the ADVERTISEMENT is equal to
the local Priority and the primary IPvX Address of the
sender is greater than the local primary IPvX Address, then:
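The master-state condition quoted above can be sketched in Python as follows. Again, this is just an illustration under my own naming; `master_on_advertisement` and the returned strings are hypothetical, and the 192.0.2.x addresses in the usage note are documentation addresses, not the reporter's. The addresses are compared numerically via the standard `ipaddress` module rather than as strings.

```python
import ipaddress


def master_on_advertisement(local_priority: int, local_ip: str,
                            adv_priority: int, sender_ip: str) -> str:
    """Decide what a router in Master state does with a received advert.

    Returns an illustrative label: "send-advert", "become-backup" or "discard".
    """
    if adv_priority == 0:
        # The other master is resigning: send our own advert straight away
        return "send-advert"

    local = ipaddress.ip_address(local_ip)
    sender = ipaddress.ip_address(sender_ip)

    higher_priority = adv_priority > local_priority
    # The IP address only matters when the priorities are equal
    wins_tie_break = adv_priority == local_priority and sender > local

    if higher_priority or wins_tie_break:
        return "become-backup"
    return "discard"
```

For example, `master_on_advertisement(100, "192.0.2.10", 100, "192.0.2.20")` returns `"become-backup"`, while swapping the two addresses returns `"discard"`. Note that this comparison only happens on a router that is already master and actually receives the other master's advert.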
There is no known issue of keepalived not receiving multicast adverts when the environment is correctly set up, and indeed you state "Generally, VRRP messages are processed on both hosts", which indicates that keepalived can receive them. Whenever keepalived is not receiving adverts, especially in virtualised or containerised environments, we always find that the issue comes down to the setup of the environment.
If you wish to take this further, please ensure you use keepalived v2.0.19 rather than v1.2.24, since that is the version against which we would diagnose the problem, and keepalived works quite differently in the two versions. It will also be necessary to provide far more information than just "keepalived is not receiving multicast messages", since we need some information to work with, and there are thousands of keepalived installations that are working as expected. For example, one piece of information that could be provided is whether the output of netstat -anp shows a large Recv-Q for the keepalived receive socket, but that is just one small example.
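As a rough illustration of the Recv-Q check, the Python sketch below scans netstat -anp style output for keepalived sockets with a large receive queue. It assumes the usual net-tools column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name); the function name `large_recv_queues`, the default threshold of 1000 bytes and the sample text are all made up for this example and are not output from any real system.

```python
def large_recv_queues(netstat_output: str, threshold: int = 1000,
                      program: str = "keepalived"):
    """Return (local_address, recv_q) for `program` sockets with Recv-Q > threshold.

    The threshold is an arbitrary illustrative value, not a keepalived limit.
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything without a numeric Recv-Q column
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        # The last column is assumed to be PID/Program name
        if not fields[-1].endswith("/" + program):
            continue
        recv_q = int(fields[1])
        if recv_q > threshold:
            hits.append((fields[3], recv_q))
    return hits


# Made-up sample in the assumed netstat -anp layout (NOT real output)
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
raw   213312      0 0.0.0.0:112             0.0.0.0:*               7           1234/keepalived
raw        0      0 0.0.0.0:112             0.0.0.0:*               7           1234/keepalived
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      801/sshd
"""

for address, queued in large_recv_queues(SAMPLE):
    print(f"keepalived socket {address} has {queued} bytes queued")
```

A Recv-Q that stays large would suggest that packets reach the socket but are not being read, which is a different situation from packets never arriving at all.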
In order to see more about what is happening inside keepalived, you could build keepalived with the configure options --enable-epoll-debug --enable-epoll-thread-dump --enable-log-file and then run keepalived with the options -D -g/tmp/keepalived.log -G --flush-log-file --debug=EvDv to capture its behaviour. This will write log output to files /tmp/keepalived*.log rather than syslog, and produce lots of debugging output showing what epoll events are received, what threads are queued and what functions are being called. From this it is possible to see whether keepalived is receiving adverts and how it is processing them.
Since this does not appear to be a keepalived issue I am closing it for now, but if you provide more information that demonstrates that it is keepalived that is not working properly, we can reopen the issue.
Describe the bug In ESXi 6.7, I set up two VMs with keepalived (see the configuration below). As this case is using the same priority, keepalived should follow the VRRP protocol and assign the MASTER role to the host with the higher IP address. However, sometimes both hosts become MASTER. I used tcpdump -Q in on both hosts to verify that the VRRP multicast packets arrive on both hosts. Although the packets arrive at each host (and on the right interface), the keepalived process on one host is not processing them, whereas on the other host I get the expected log entry "Received lower prio advert 100, forcing new election". Generally, VRRP messages are processed on both hosts.
To Reproduce An easy way to reproduce the problem is to swap the IPs of the interfaces used by VRRP on an already running keepalived group. To do so, first change them in the Ubuntu configuration (/etc/network/interfaces) on both hosts and then reboot them.
Expected behavior Both keepalived processes on each host should receive the VRRP multicast message and elect a new MASTER.
Keepalived version Reproducible on both 1.2.24 and 2.0.19.
Distro (please complete the following information):
Details of any containerisation or hosted service (e.g. AWS) No
Configuration file: Both hosts use the same configuration
Notify and track scripts No
System Log entries The following logs are generated by keepalived 1.2.24. The logs for the host that is not receiving the missing VRRP multicast message:
And the logs from the other host that receives the VRRP multicast message:
Did keepalived coredump? No
Additional context No