acassen / keepalived

Keepalived
https://www.keepalived.org
GNU General Public License v2.0

Keepalived nopreempt option does not function as expected and fails over to main master after master comes up #1515

Closed rsparulek closed 4 years ago

rsparulek commented 4 years ago

Describe the bug The keepalived nopreempt option does not function as expected: the VIP fails back to the main master after that master comes back up. I have a 3-master setup running keepalived. All 3 masters have state BACKUP; the main master has priority 101 and the other 2 masters have priority 100. I also use the nopreempt option. When the VIP is on the main master and the main master goes down, the VIP moves to master 2. But when main master 1 comes up again, the VIP moves back to main master 1, even though the nopreempt option is configured in the VRRP config pasted below.

To Reproduce Any steps necessary to reproduce the behaviour:

Expected behavior We should not switch over to the main master 1 when it comes back up.

Keepalived version

[root@mynode tmp]# keepalived -v
Keepalived v2.0.7 (08/23,2018)

Copyright(C) 2001-2018 Alexandre Cassen, acassen@gmail.com

Distro (please complete the following information):

[root@mynode ~]# cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@mynode ~]# uname -r
3.10.0-1062.12.1.el7.x86_64

Details of any containerisation or hosted service (e.g. AWS) Keepalived runs as a systemd service on the host, which has Kubernetes 1.16.3 installed.

Configuration file:

[root@mynode tmp]# cat /etc/keepalived/keepalived.conf
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface eth0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 101
        virtual_ipaddress {
            x.x.x.x/x dev eth0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

Notify and track scripts If any notify or track scripts are in use, please provide copies of them

System Log entries

/var/log/messages when the main master comes up and preempts the existing master and the VIP switches to the main master

Mar  3 15:07:53 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Receive advertisement timeout
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Entering MASTER STATE
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) setting VIPs.
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Received advert from 10.9.82.139 with lower priority 101, ours 101, forcing new election
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Received advert from 10.9.82.140 with lower priority 101, ours 101, forcing new election
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101

rsparulek commented 4 years ago

@pqarmitage Any pointers on this issue would be great. We hit this issue consistently on our keepalived setup with 3 masters, and it causes a brief outage after the second failover (in spite of us setting the nopreempt option in our VRRP config).

pqarmitage commented 4 years ago

I have tested this with the latest version of keepalived (v2.0.20 + all subsequent commits) and I do not observe the problem you have reported.

There is one problem that I observe, and that is that the two "backup" systems both have the same VRRP priority. This means that when the high priority instance goes down, both "backup" systems take over as master, and then one of them (the one with the lower IP address) has to fall back to backup. So the first thing I would suggest is reducing the priority of one of the systems to be less than 100, so that all three have different VRRP priorities.
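
To illustrate that suggestion, here is a minimal sketch of the vrrp_instance block on each of the three systems, based on the configuration posted above. The priorities 101, 100 and 99 are example values only (any three distinct values would do), the node labels in the comments are hypothetical, and the track_script block is left out for brevity:

```
# Node 1: preferred master
vrrp_instance VI_1 {
        interface eth0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 101
        virtual_ipaddress {
            x.x.x.x/x dev eth0
        }
}

# Node 2: same block, lower priority
vrrp_instance VI_1 {
        interface eth0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 100
        virtual_ipaddress {
            x.x.x.x/x dev eth0
        }
}

# Node 3: same block, lowest priority
vrrp_instance VI_1 {
        interface eth0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 99
        virtual_ipaddress {
            x.x.x.x/x dev eth0
        }
}
```

With distinct priorities, only the highest-priority system still running should take over when the master goes away, so the IP address tie break should not come into play.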

Since the version of keepalived you are using is rather old, I cannot remember if I have fixed any issue with preempt in the last 18 months. It would be worth you upgrading to v2.0.20 to see if that resolves your problem. However, I have also tested this with v2.0.7 and cannot reproduce your problem.

On each of your 3 systems, while keepalived is running, could you please send SIGUSR1 to the parent keepalived process and post the resulting /tmp/keepalived.data from each system here.

rsparulek commented 4 years ago

@pqarmitage Thanks for your input. Could you share the VRRP configuration you are using? Also, is there an RPM available for keepalived 2.0.20? I could only find RPMs for 2.0.7 online; could you point me to the 2.0.20 RPM?

My current rpm:

[root@mynode ~]# rpm -qa | grep keepalived
keepalived-2.0.7-1.el7.x86_64

pqarmitage commented 4 years ago

The VRRP configuration I was using was the one you provided above, with two of the 3 systems having the VRRP priority changed to 100.

I don't have an RPM available; you will need to build keepalived from the source. You could download the Fedora source rpm for v2.0.20 and use that to build your own rpm.

rsparulek commented 4 years ago

@pqarmitage When you tried to reproduce this on 2.0.7 with a 3-node setup, did your main master node with the highest priority also have the highest IP address of the 3 nodes? In my setup, the main master with the highest priority also has the highest IP address. I am assuming that in my case keepalived breaks the tie using the highest IP address, and hence switches the VIP back to the highest-priority master when it comes back up.

pqarmitage commented 4 years ago

My understanding is that you have one vrrp instance with priority 101 and the other two instances with priority 100. If that is the situation, then the IP address of the instance with priority 101 is not important.

The only time when which IP address is higher is relevant is when two (or more) instances have become master and they all have the same priority. The IP address is then used as a tie break to determine which of the instances should back off and revert to backup; any instance which sees an advert with a higher source IP address reverts to backup.

rsparulek commented 4 years ago

@pqarmitage Makes sense. Did you get a chance to look at the VRRP logs I posted above? I always see a "forcing new election" line in the logs of the highest-priority master, which snatches the VIP once it comes back up even though we have the nopreempt option set. Is that log line expected behaviour with the nopreempt option enabled?

Mar  3 15:07:53 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Receive advertisement timeout
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Entering MASTER STATE
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) setting VIPs.
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Received advert from 10.9.82.139 with lower priority 101, ours 101, forcing new election
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Received advert from 10.9.82.140 with lower priority 101, ours 101, forcing new election
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  3 15:07:54 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101

Ignore the priority numbers, as these logs are older, from when I was using equal priorities for all masters. But I see this log line even after changing the priorities as you suggested. Why are we forcing a new election every time this node comes up? New logs with different priorities are posted below:

Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) Received advert from 10.9.82.140 with lower priority 100, ours 101, forcing new election
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:41 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:42 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  4 14:47:42 cscale-82-163 systemd: Started Session c2612 of user root.
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:43 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  4 14:47:44 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  4 14:47:44 cscale-82-163 systemd: Started Session c2613 of user root.
Mar  4 14:47:45 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  4 14:47:46 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101
Mar  4 14:47:46 cscale-82-163 systemd: Started Session c2614 of user root.
Mar  4 14:47:47 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) Received advert from 10.9.82.140 with lower priority 100, ours 101, forcing new election
Mar  4 14:47:47 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101
Mar  4 14:47:47 cscale-82-163 Keepalived_vrrp[2404]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.93.101
Mar  4 14:47:47 cscale-82-163 Keepalived_vrrp[2404]: Sending gratuitous ARP on br0 for 10.9.93.101

pqarmitage commented 4 years ago

The interesting log line is: Mar 3 15:07:53 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101

This means that your configurations do not match on your three systems, in particular you do not have 10.9.93.101 listed in the virtual_ipaddress block of the lower priority system that becomes master. As a consequence, when the system with priority 101 receives the adverts from a lower priority instance, it discards the advert as invalid. It then times out and becomes master.

If you change the virtual_ipaddress so that they are identical in all the configurations, then it should work as you want.

rsparulek commented 4 years ago

@pqarmitage I have exactly the same VIPs on all 3 nodes:

[root@cscale-82-163 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 101
        virtual_ipaddress {
            10.9.93.101/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

[root@cscale-82-139 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 95
        virtual_ipaddress {
            10.9.93.101/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

[root@cscale-82-140 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 97
        priority 95
        virtual_ipaddress {
            10.9.93.101/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

pqarmitage commented 4 years ago

@rsparulek As I wrote above, the problem is indicated by the log entries which say: Mar 3 15:07:53 cscale-82-163 Keepalived_vrrp[2370]: (VI_1) ip address associated with VRID 97 not present in MASTER advert : 10.9.93.101

This means that the adverts received on cscale-82-163 do not contain the address 10.9.93.101. As a consequence of this the adverts are discarded by keepalived on cscale-82-163 and since it sees no adverts other than the ones it discards, it times out and becomes master.

In order to progress this any further, I need to see the following:

  1. Send SIGUSR1 to the parent keepalived process on each system. This will create the file /tmp/keepalived.data. Please attach each of the /tmp/keepalived.data files to this issue.

  2. On cscale-82-163, when you are getting the "ip address associated with VRID 97 not present in MASTER advert" messages, run tcpdump -n -nn -v -i br0 proto 112 and post the output of that here too.

Once we have these, we should be able to see what is happening.

rsparulek commented 4 years ago

capture.txt keepalived.data.139.txt keepalived.data.140.txt keepalived.data.163.txt

@pqarmitage I have uploaded the logs and data you requested from my cluster nodes. Let me know if anything else is needed from my end.

Many Thanks! Your help is greatly appreciated.

pqarmitage commented 4 years ago

@rsparulek You STILL have cscale-82-139 and cscale-82-140 having the same VRRP priority, i.e. 95. This will not work properly since when keepalived on cscale-82-163 stops, both cscale-82-139 and cscale-82-140 will become master at the same time, and then one will subsequently have to revert to backup. There is not much point in continuing with this issue until the VRRP instances on all 3 systems have DIFFERENT priorities.

You seem to have a large amount of VRRP traffic, as follows:

VRID            Source IP       Priority         Virtual IP addresses

vrid 106        10.9.20.13      prio 100         10.9.108.106
vrid 107        10.9.82.70      prio 101         10.9.108.107
vrid 11         10.9.82.64      prio 101         10.9.111.124
vrid 113        10.9.140.110    prio 100         10.9.86.90
vrid 12         10.9.82.61      prio 101         10.9.111.125
vrid 150        10.9.60.239     prio 101         10.9.119.150
vrid 151        10.9.50.103     prio 101         10.9.119.151
vrid 152        10.9.60.208     prio 101         10.9.119.152
vrid 159        10.9.40.215     prio 101         10.9.119.159
vrid 164        10.9.82.150     prio 101         10.9.108.175
vrid 166        10.9.109.74     prio 101         10.9.109.75
vrid 169        10.9.60.226     prio 101         10.9.114.34 
vrid 171        10.9.82.49      prio 101         10.9.97.100 
vrid 200        10.9.60.72      prio 101         10.9.99.210 
vrid 21         10.9.82.67      prio 100         10.9.98.52
vrid 250        10.9.100.206    prio 101         10.9.100.250
vrid 253        10.9.140.101    prio 101         10.9.82.253
vrid 40         10.9.109.71     prio 101         10.9.109.72
vrid 41         10.9.109.81     prio 101         10.9.109.84
vrid 43         10.9.100.103    prio 101         10.9.117.2
vrid 54         10.9.60.244     prio 100         10.9.114.34
vrid 59         10.9.60.71      prio 101         10.9.99.200 
vrid 64         10.9.121.21     prio 101         10.9.121.24 
vrid 77         10.9.40.105     prio 101         10.9.232.182
vrid 88         10.9.82.143     prio 101         10.9.82.250
vrid 93         10.9.120.108    prio 101         10.9.94.3
vrid 94         10.9.60.31      prio 101         10.9.118.254

vrid 97         10.9.82.139     prio 95          10.9.93.101 
vrid 97         10.9.82.140     prio 95          10.9.93.101 
vrid 97         10.9.82.163     prio 101         10.9.93.101 
vrid 97         10.9.82.81      prio 101         10.9.115.1
vrid 97         10.9.82.82      prio 100         10.9.115.1
vrid 97         10.9.82.83      prio 100         10.9.115.1

and this identifies the main cause of your problem, which is that you have another set of systems (with IP addresses 10.9.82.81, 10.9.82.82, and 10.9.82.83) also using VRID 97, but with virtual ipaddress 10.9.115.1. For some reason all three of these systems are in MASTER state, and whether that is caused by 10.9.82.140/139/163 also transmitting adverts with VRID 97 but with a different virtual ipaddress I cannot say.

Once you sort out the duplicate VRID problem, and also change it so that for each VRID each system has a different priority, then I expect your problem will be resolved.
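
As a sketch of the first fix, the vrrp_instance on 10.9.82.139, 10.9.82.140 and 10.9.82.163 could be moved to a VRID that no other group on the segment uses. The value 98 below is only an assumption, chosen because it does not appear in the capture summarised above; check it against your own network before using it. The same virtual_router_id must be set on all three systems, while the priority should differ on each (101 is the value already used on cscale-82-163):

```
vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        # Assumed to be unused on this segment; the previous value 97
        # clashed with the group on 10.9.82.81, 10.9.82.82 and 10.9.82.83.
        virtual_router_id 98
        # Use a different priority on each of the three systems.
        priority 101
        virtual_ipaddress {
            10.9.93.101/16 dev br0
        }
        track_script {
            chk_haproxy
        }
}
```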

rsparulek commented 4 years ago

@pqarmitage Many thanks for your inputs and debugging! I don't see this issue when I have unique priorities across 3 masters. I am still using keepalived 2.0.7 version. I will update this bug if I see any issue again.

rsparulek commented 4 years ago

@pqarmitage This issue is resolved. Closing for now. Thanks for all your help! I will re-open if I hit any other issues.

rsparulek commented 4 years ago

@pqarmitage I am hitting the VIP switchover issue again, even with 3 different priorities on my 3 masters, as seen below:

Main master:

# cat /etc/keepalived/keepalived.conf
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 43
        priority 101
        virtual_ipaddress {
            10.9.117.2/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

Master 2:

global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 43
        priority 95
        virtual_ipaddress {
            10.9.117.2/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

Master 3:

global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
}

vrrp_script chk_haproxy {
        script "sudo /usr/bin/killall -0 haproxy"
        interval 2
        fall 2
        rise 2
}

vrrp_script chk_etcd {
        script "sudo /usr/local/bin/retcd get-leader"
        interval 2
        init_fail
}

vrrp_instance VI_1 {
        interface br0
        state BACKUP
        advert_int 1
        nopreempt
        virtual_router_id 43
        priority 90
        virtual_ipaddress {
            10.9.117.2/16 dev br0
        }
        track_script {
            chk_haproxy
            #chk_etcd
        }
}

From the keepalived logs on the secondary master, I see that when master 1 comes up, the secondary master gives up its VIP, as seen in the logs below from the secondary master:

Mar 27 15:25:04 eqx04-flash06 Keepalived_vrrp[25180]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.117.2
Mar 27 15:25:05 eqx04-flash06 systemd: Started Session c39586 of user root.
Mar 27 15:25:07 eqx04-flash06 systemd: Started Session c39587 of user root.
Mar 27 15:25:08 eqx04-flash06 Keepalived_vrrp[25180]: (VI_1) Master received advert from 10.9.100.103 with higher priority 101, ours 95
Mar 27 15:25:08 eqx04-flash06 Keepalived_vrrp[25180]: (VI_1) Entering BACKUP STATE
Mar 27 15:25:08 eqx04-flash06 Keepalived_vrrp[25180]: (VI_1) removing VIPs.

On the main master, I see that the forceful VIP re-acquisition happens:

Mar 27 15:25:05 eqx01-flash03 systemd: Starting DNS caching server....
Mar 27 15:25:05 eqx01-flash03 Keepalived_vrrp[2158]: (VI_1) Entering BACKUP STATE
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: (VI_1) Receive advertisement timeout
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: (VI_1) Entering MASTER STATE
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: (VI_1) setting VIPs.
Mar 27 15:25:09 eqx01-flash03 systemd: Starting Wait for Plymouth Boot Screen to Quit...
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: Sending gratuitous ARP on br0 for 10.9.117.2
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: (VI_1) Sending/queueing gratuitous ARPs on br0 for 10.9.117.2
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: Sending gratuitous ARP on br0 for 10.9.117.2
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: Sending gratuitous ARP on br0 for 10.9.117.2
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: Sending gratuitous ARP on br0 for 10.9.117.2
Mar 27 15:25:09 eqx01-flash03 Keepalived_vrrp[2158]: Sending gratuitous ARP on br0 for 10.9.117.2

I see from the above logs that the main master did enter BACKUP state to begin with but later re-entered MASTER state. Could you help me figure out this issue?

rsparulek commented 4 years ago

I am also attaching the keepalived.data files collected from all 3 masters: m1.txt m2.txt m3.txt

m1.txt is from the main master; the other 2 files are from the secondary masters.

rsparulek commented 4 years ago

@pqarmitage Any pointers on this issue?

pqarmitage commented 4 years ago

We have seen this issue before: when networking starts up, packets start being passed, then no packets are passed for a while, and then they start being passed again. This is what appears to be happening here.

You need to delay keepalived starting up until the network is fully up and settled.

pqarmitage commented 4 years ago

There is a keepalived option vrrp_startup_delay for precisely this problem. It delays the startup of the vrrp process to allow time for the networking to settle down.
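
A minimal sketch of how that might look in the global_defs block posted earlier in this issue. The value of 10 seconds is an arbitrary example, not a recommendation, and the option may not be available in older releases such as the v2.0.7 used here, so check the keepalived.conf man page for your installed version:

```
global_defs {
        script_user root root
        enable_script_security
        vrrp_garp_master_delay 1
        vrrp_garp_master_refresh 60
        # Delay the start of the vrrp process so the network can settle
        # (value in seconds; 10 is only an example).
        vrrp_startup_delay 10
}
```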

I am closing this problem now since there has been no response for over a week.