acassen / keepalived

Keepalived
https://www.keepalived.org
GNU General Public License v2.0

Crash with duplicated vmac #1799

Closed louis-6wind closed 3 years ago

louis-6wind commented 3 years ago

Describe the bug

The process crashes when loading a configuration with a duplicate VMAC. The same behaviour occurs on reload and on restart.

To Reproduce

rm /var/lib/systemd/coredump/*
systemctl daemon-reload
ifconfig ens4 192.168.15.41/28 up
cat >/etc/keepalived/keepalived.conf <<\EOF
global_defs {
    router_id router
    enable_script_security
    script_user root
    dynamic_interfaces
}

vrrp_sync_group group15 {
    group {
        vrrp
        vrrp2
    }
}

vrrp_track_file fp_tracker {
    file /var/run/keepalived/fp-tracker
    weight 0
}

vrrp_instance vrrp {
    version 2
    state BACKUP
    interface ens4

    use_vmac vrrp

    track_file {}

    garp_master_delay 5

    virtual_router_id 15

    priority 200
    advert_int 1.0

    virtual_ipaddress {
        192.168.15.38/28
    }

    preempt_delay 0
}

vrrp_instance vrrp2 {
    version 2
    state BACKUP
    interface ens4

    use_vmac vrrp

    track_file {}

    garp_master_delay 5

    virtual_router_id 15

    priority 200
    advert_int 1.0

    virtual_ipaddress {
        192.168.15.39/28
    }

    preempt_delay 0
}
EOF
systemctl restart keepalived

Expected behavior

No crash. The second interface is discarded.

Keepalived version

Keepalived v2.1.5 (07/13,2020)

Copyright(C) 2001-2020 Alexandre Cassen, acassen@gmail.com

Built with kernel headers for Linux 4.15.18
Running on Linux 4.15.0-123-generic #126-Ubuntu SMP Wed Oct 21 09:40:11 UTC 2020

configure options: --prefix=/usr --sysconfdir=/etc --with-extra-cflags=-I/usr/include/libnl3 --with-extra-ldflags= --with-extra-libs=-lnl-genl-3 --disable-lvs --with-init=systemd --host=x86_64-linux-gnu host_alias=x86_64-linux-gnu

Config options: VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING

System options: PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV4_DEVCONF IPV6_ADVANCED_API RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTEXT_FILTER_SKIP_STATS FRA_L3MDEV FRA_UID_RANGE RTAX_FASTOPEN_NO_COOKIE RTA_VIA FRA_OIFNAME RTA_TTL_PROPAGATE IFA_FLAGS IP_MULTICAST_ALL LWTUNNEL_ENCAP_MPLS LWTUNNEL_ENCAP_ILA NET_LINUX_IF_H_COLLISION LIBIPTC_LINUX_NET_IF_H_COLLISION VRRP_VMAC VRRP_IPVLAN IFLA_LINK_NETNSID CN_PROC SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE VRF SO_MARK SCHED_RESET_ON_FORK

Distro (please complete the following information):

Details of any containerisation or hosted service (e.g. AWS)

No container.

Configuration file:

See "To Reproduce" above.

Notify and track scripts

None.

System Log entries

Dec 03 13:28:39 ha1 Keepalived[14866]: Starting VRRP child process, pid=14954
Dec 03 13:28:39 ha1 Keepalived_vrrp[14954]: Registering Kernel netlink reflector
Dec 03 13:28:39 ha1 systemd[1]: Started Process Core Dump (PID 14955/UID 0).
Dec 03 13:28:39 ha1 Keepalived_vrrp[14954]: Registering Kernel netlink command channel
Dec 03 13:28:39 ha1 Keepalived_vrrp[14954]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 03 13:28:39 ha1 Keepalived_vrrp[14954]: "vrrp_track_file" is deprecated, please use "track_file"
Dec 03 13:28:39 ha1 Keepalived_vrrp[14954]: vrrp and vrrp2 both use VRID 15 with IPv4 on interface ens4
Dec 03 13:28:39 ha1 Keepalived[14866]: pid 14954 exited due to signal 6 (Aborted)
Dec 03 13:28:39 ha1 Keepalived[14866]: VRRP child process(14954) died: Respawning
Dec 03 13:28:39 ha1 Keepalived[14866]: Restart of VRRP process delayed 60 seconds to limit respawn rate
Dec 03 13:28:39 ha1 systemd-coredump[14956]: Process 14954 (keepalived) of user 0 dumped core.

                                         Stack trace of thread 14954:
                                         #0  0x00007fb4c112bf47 __GI_raise (libc.so.6)
                                         #1  0x00007fb4c112d98b __GI_abort (libc.so.6)
                                         #2  0x00007fb4c1176907 __libc_message (libc.so.6)
                                         #3  0x00007fb4c117d97a malloc_printerr (libc.so.6)
                                         #4  0x00007fb4c1184f3c munmap_chunk (libc.so.6)
                                         #5  0x000055fa64ad927a free_parent_mallocs_exit (keepalived)
                                         #6  0x000055fa64ad7a8d vrrp_terminate_phase2 (keepalived)
                                         #7  0x000055fa64ae63af stop_vrrp (keepalived)
                                         #8  0x000055fa64ae66dc start_vrrp_child (keepalived)
                                         #9  0x000055fa64b0f7e8 thread_call (keepalived)
                                         #10 0x000055fa64ada371 keepalived_main (keepalived)
                                         #11 0x00007fb4c110eb97 __libc_start_main (libc.so.6)
                                         #12 0x000055fa64ad82aa _start (keepalived)
pqarmitage commented 3 years ago

keepalived should not be segfaulting, and so I will have a look at that. However, the configuration is invalid, since you cannot have two vrrp instances using the same VMAC interface, so fixing the configuration should stop the segfault for you.

pqarmitage commented 3 years ago

I have just started looking at this. Not only can you not use the same VMAC interface name more than once, you cannot have a duplicate virtual_router_id on the same physical interface with the same address family (in this case IPv4).

Again you are getting a segfault in free_parent_mallocs_exit(), which is very strange.
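
For readers who want to check a configuration before loading it, the two uniqueness rules described above (one VMAC interface name per instance, and one VRID per physical interface and address family) can be sketched as a small standalone check. This is only an illustration in Python, not keepalived's own validation code: the `find_conflicts` helper and the dictionary layout are hypothetical, and the instances are assumed to have been extracted from the configuration file by some other means.

```python
from collections import defaultdict


def find_conflicts(instances):
    """Return a list of conflict messages for a list of instance dicts.

    Each dict is a hypothetical, simplified view of a vrrp_instance with the
    keys: name, interface, family, vrid and (optionally) vmac.
    """
    by_vrid = defaultdict(list)   # (interface, family, vrid) -> instance names
    by_vmac = defaultdict(list)   # VMAC interface name -> instance names

    for inst in instances:
        key = (inst["interface"], inst["family"], inst["vrid"])
        by_vrid[key].append(inst["name"])
        if inst.get("vmac"):
            by_vmac[inst["vmac"]].append(inst["name"])

    conflicts = []
    for (ifname, family, vrid), names in by_vrid.items():
        if len(names) > 1:
            conflicts.append(
                f"{' and '.join(names)} both use VRID {vrid} "
                f"with {family} on interface {ifname}"
            )
    for vmac, names in by_vmac.items():
        if len(names) > 1:
            conflicts.append(
                f"{' and '.join(names)} both use VMAC interface name {vmac}"
            )
    return conflicts


# The two instances from the configuration in this report.
instances = [
    {"name": "vrrp", "interface": "ens4", "family": "IPv4", "vrid": 15, "vmac": "vrrp"},
    {"name": "vrrp2", "interface": "ens4", "family": "IPv4", "vrid": 15, "vmac": "vrrp"},
]

for message in find_conflicts(instances):
    print(message)
```

With the reported configuration, both rules are violated, so the check reports two conflicts; giving vrrp2 its own `virtual_router_id` and its own `use_vmac` name clears both.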

louis-6wind commented 3 years ago

The segfault is not caused by the duplicated VMAC; the behaviour for that case is fine.

I will open a separate issue for the segfault.

pqarmitage commented 3 years ago

If the virtual_router_id of one of the instances is changed, then when keepalived sets up vrrp2 it detects the existing vrrp interface, finds that it has the wrong MAC address, deletes the interface, and creates a new vrrp interface using the VRID of vrrp2. This behaviour is wrong; it should instead choose another name for the second VMAC.
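
To make the "wrong MAC address" check concrete: the VRRP virtual MAC is derived from the VRID (00:00:5e:00:01:XX for IPv4 and 00:00:5e:00:02:XX for IPv6, where XX is the VRID in hex), so two instances with different VRIDs can never share one VMAC interface. The sketch below, in Python, shows that derivation and one possible way of choosing a different name instead of deleting the existing interface. The `choose_vmac_name` helper and its `_N` suffix scheme are assumptions made for illustration; they are not taken from the commits that fixed this issue.

```python
IFNAME_MAX = 15  # Linux interface names hold at most 15 characters plus the NUL


def vrrp_virtual_mac(vrid, family="IPv4"):
    """Return the VRRP virtual MAC address for a VRID and address family."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    fifth_octet = 0x01 if family == "IPv4" else 0x02
    return f"00:00:5e:00:{fifth_octet:02x}:{vrid:02x}"


def choose_vmac_name(requested, expected_mac, existing):
    """Pick an interface name for a VMAC without clobbering another one.

    `existing` maps names of interfaces already in use to their MAC address.
    The requested name is kept if it is free or already has the expected MAC;
    otherwise a numeric suffix is appended (hypothetical naming scheme).
    """
    if requested not in existing or existing[requested] == expected_mac:
        return requested
    n = 1
    while True:
        suffix = f"_{n}"
        candidate = requested[:IFNAME_MAX - len(suffix)] + suffix
        if candidate not in existing:
            return candidate
        n += 1


# Instance "vrrp" (VRID 15) has already created the VMAC interface "vrrp".
existing = {"vrrp": vrrp_virtual_mac(15)}

# A second instance with VRID 16 also asks for the name "vrrp".
name = choose_vmac_name("vrrp", vrrp_virtual_mac(16), existing)
print(name, vrrp_virtual_mac(16))
```

In this example the second instance ends up on a separate interface with its own MAC address, and the interface created for the first instance is left untouched.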

pqarmitage commented 3 years ago

Commits 54f146f, 396b436 and 8bc1e72 resolve problems related to duplicate VMAC names.