Closed dpajin closed 4 years ago
The log above shows that the keepalived vrrp process segfaulted. Are you able to provide a stack backtrace using gdb so that we can see where the problem occurred?
Unfortunately, I don't know how to do that. Could you suggest how? I tried something like this, but I either don't get a stack trace or don't know where to look for it:
$ sudo gdb -q -batch -ex run -ex backtrace -ex 'thread apply all backtrace' --args /usr/local/sbin/keepalived --log-detail --vrrp --snmp
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Inferior 1 (process 28050) exited normally]
No stack.
$ sudo gdb -q -batch -ex run -ex 'thread apply all backtrace' --args /usr/local/sbin/keepalived --log-detail --vrrp --snmp
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Inferior 1 (process 28193) exited normally]
After the segfault occurs, a coredump should be produced. The configuration files `/etc/systemd/coredump.conf`, `/etc/systemd/coredump.conf.d/*.conf`, `/run/systemd/coredump.conf.d/*.conf` and `/usr/lib/systemd/coredump.conf.d/*.conf` should help identify the location of the core file.
Once you have located the coredump file, run `gdb <PATH_TO_KEEPALIVED> <PATH_TO_COREDUMP>` and then, at the gdb prompt, type `bt`. The output of that should be the stack backtrace.
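The same steps can be run non-interactively. The sketch below assumes that `systemd-coredump` is handling core files and that `gdb` and `coredumpctl` are installed; the function name `backtrace_from_core` and the paths in the comments are illustrative assumptions, not part of keepalived.

```shell
# Print the backtrace of the crashing thread and of all threads from an
# existing core file, without entering the interactive gdb prompt.
# Usage: backtrace_from_core <PATH_TO_KEEPALIVED> <PATH_TO_COREDUMP>
backtrace_from_core() {
    binary=$1
    core=$2
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    gdb -q -batch -ex bt -ex 'thread apply all bt' "$binary" "$core"
}

# Example (paths are assumptions; adjust them to your system):
#   coredumpctl list keepalived                           # find the crash entry
#   coredumpctl -o /tmp/keepalived.core dump keepalived   # save the newest matching core
#   backtrace_from_core /usr/local/sbin/keepalived /tmp/keepalived.core
```

Working from the core file avoids having to run keepalived under gdb directly, which in the attempts above only reported that the process exited normally.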
In the meantime I noticed a few more things worth mentioning:
The issue described in the first comment happens when the SNMP daemon is restarted. For the first two state changes the FIFO buffer receives some "junk", but after that it looks normal. These "junk" characters are actually part of the string that is expected, but for some reason some random characters are omitted (missing).
When the SNMP daemon is stopped for longer than 15 seconds, the situation is even worse. Keepalived logs a message that it cannot connect to the SNMP agent, and after that it does not work as it should; for example, it does not detect changes in tracked processes. Even if SNMP is started again, keepalived remains in the same state.
May 14 16:32:07 RMIMH05S Keepalived_vrrp[5566]: AgentX master disconnected us, reconnecting in 15
May 14 16:32:07 RMIMH05S Keepalived_vrrp[5566]: scheduler: There is already read event 0x55f1251d8df0 (read 0x55f1251734c0) registered on fd [16]
May 14 16:32:22 RMIMH05S Keepalived_vrrp[5566]: Warning: Failed to connect to the agentx master agent ([NIL]):
nothing happens after this...
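To check from a saved log whether keepalived ever got its AgentX connection back, something like the following sketch could be used. It only matches the message texts quoted in this thread; the function name `agentx_reconnected` and the log file path are assumptions.

```shell
# Exit 0 if the last AgentX-related message in the log file is a successful
# connect, and 1 if the last one is a disconnect or a failed reconnect
# (or if there is no AgentX message at all).
agentx_reconnected() {
    awk '
        /AgentX master disconnected us/               { state = "down" }
        /Failed to connect to the agentx master agent/ { state = "down" }
        /AgentX subagent connected/                   { state = "up" }
        END { if (state == "up") exit 0; else exit 1 }
    ' "$1"
}

# Example (the log path is an assumption):
#   agentx_reconnected /tmp/keepalived.log && echo "AgentX is connected" \
#       || echo "AgentX connection was lost and not re-established"
```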
The other thing I found, which is a big issue, is that I am not able to disable SNMP. Normally, I was starting keepalived as a process with the `--snmp` argument:
May 12 17:05:06 RMIMH05S Keepalived[25684]: Command line: '/usr/local/sbin/keepalived' '--log-detail' '--vrrp' '--snmp'
Now I want to run keepalived without SNMP support enabled, but when I run it without the `--snmp` argument, the connection to the SNMP agent is still made and I have the same issues:
May 14 17:51:58 RMIMH05S Keepalived[21652]: Starting Keepalived v2.0.20 (01/22,2020)
May 14 17:51:58 RMIMH05S Keepalived[21652]: Running on Linux 5.3.0-46-generic #38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020 (built for Linux 4.15.18)
May 14 17:51:58 RMIMH05S Keepalived[21652]: Command line: '/usr/local/sbin/keepalived' '--log-detail' '--vrrp'
May 14 17:51:58 RMIMH05S Keepalived[21652]: Opening file '/etc/keepalived/keepalived.conf'.
May 14 17:51:58 RMIMH05S Keepalived[21661]: Starting VRRP child process, pid=21662
May 14 17:51:58 RMIMH05S Keepalived_vrrp[21662]: Registering Kernel netlink reflector
May 14 17:51:58 RMIMH05S Keepalived_vrrp[21662]: Registering Kernel netlink command channel
May 14 17:51:58 RMIMH05S Keepalived_vrrp[21662]: Opening file '/etc/keepalived/keepalived.conf'.
May 14 17:51:58 RMIMH05S Keepalived_vrrp[21662]: Starting SNMP subagent
May 14 17:51:58 RMIMH05S Keepalived_vrrp[21662]: NET-SNMP version 5.7.3 AgentX subagent connected
...
Is this behavior expected? Keepalived is compiled with `--enable-snmp` to have SNMP support, but I would expect to be able to not use it.
Version:
$ keepalived --version
Keepalived v2.0.20 (01/22,2020)
Copyright(C) 2001-2020 Alexandre Cassen, <acassen@gmail.com>
Built with kernel headers for Linux 4.15.18
Running on Linux 5.3.0-46-generic #38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020
configure options: --enable-snmp
Config options: LIBIPTC LIBIPSET_DYNAMIC NFTABLES LVS VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING SNMP_VRRP SNMP_CHECKER
System options: PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV4_DEVCONF IPV6_ADVANCED_API LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTEXT_FILTER_SKIP_STATS FRA_L3MDEV FRA_UID_RANGE RTAX_FASTOPEN_NO_COOKIE RTA_VIA FRA_OIFNAME RTA_TTL_PROPAGATE IFA_FLAGS IP_MULTICAST_ALL LWTUNNEL_ENCAP_MPLS LWTUNNEL_ENCAP_ILA LIBIPTC LIBIPSET_PRE_V7 NET_LINUX_IF_H_COLLISION LIBIPVS_NETLINK IPVS_DEST_ATTR_ADDR_FAMILY IPVS_SYNCD_ATTRIBUTES IPVS_64BIT_STATS VRRP_VMAC VRRP_IPVLAN IFLA_LINK_NETNSID CN_PROC SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE VRF SO_MARK SCHED_RESET_ON_FORK
@dpajin You have SNMP enabled; if you don't need it, just remove the lines `enable_snmp_vrrp` and `enable_snmp_checker`.
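A quick way to see which SNMP keywords are active in a configuration file is sketched below. The function name `list_snmp_keywords` is an assumption, and the check is deliberately simple: it does not follow `include` directives and only looks for lines starting with `enable_snmp_`.

```shell
# Print every line (with its line number) that enables an SNMP feature.
# Returns non-zero when no such line is found.
list_snmp_keywords() {
    grep -nE '^[[:space:]]*enable_snmp_[a-z0-9_]+' "$1"
}

# Example:
#   list_snmp_keywords /etc/keepalived/keepalived.conf \
#       || echo "no enable_snmp_* keywords found"
```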
@pqarmitage I'm having the same issue, with just `enable_snmp_vrrp`. I need SNMP, however.
Fresh compilation of keepalived v2.0.20 on CentOS 8 (built the latest source in an RPM and compiled with the same options as the EPEL package, as follows):
Keepalived v2.0.20 (01/22,2020)
Copyright(C) 2001-2020 Alexandre Cassen, <acassen@gmail.com>
Built with kernel headers for Linux 4.18.0
Running on Linux 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020
configure options: --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-snmp --enable-snmp-rfc --enable-sha1 --with-init=systemd build_alias=x86_64-redhat-linux-gnu host_alias=x86_64-redhat-linux-gnu PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
Config options: LIBIPTC LIBIPSET_DYNAMIC LVS VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING SNMP_V3_FOR_V2 SNMP_VRRP SNMP_CHECKER SNMP_RFCV2 SNMP_RFCV3
System options: PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV4_DEVCONF IPV6_ADVANCED_API LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTEXT_FILTER_SKIP_STATS FRA_L3MDEV FRA_UID_RANGE RTAX_FASTOPEN_NO_COOKIE RTA_VIA FRA_OIFNAME FRA_PROTOCOL FRA_IP_PROTO FRA_SPORT_RANGE FRA_DPORT_RANGE RTA_TTL_PROPAGATE IFA_FLAGS IP_MULTICAST_ALL LWTUNNEL_ENCAP_MPLS LWTUNNEL_ENCAP_ILA LIBIPTC NET_LINUX_IF_H_COLLISION LIBIPVS_NETLINK IPVS_DEST_ATTR_ADDR_FAMILY IPVS_SYNCD_ATTRIBUTES IPVS_64BIT_STATS VRRP_VMAC VRRP_IPVLAN IFLA_LINK_NETNSID CN_PROC SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE VRF SO_MARK SCHED_RESET_ON_FORK
This also happens with the EPEL release, which is too old (2.0.10) for us to care about right now, I guess; I was hoping the latest version would have fixed it, but no luck.
I'm going to try to generate the core dump you asked for and send the info.
OK, I was able to run it with `coredump gdb`. I also installed the debuginfo packages, so this should make it easier (although gdb still complains about some being missing, they are installed). For now I got this:
Core was generated by `/usr/sbin/keepalived -D'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __rb_erase_augmented (augment=<optimized out>, leftmost=0x55a419686308, root=0x55a419686300, node=0x55a419713258) at rbtree_augmented.h:205
205 tmp = child->rb_left;
Missing separate debuginfos, use: dnf debuginfo-install audit-libs-3.0-0.13.20190507gitf58ec40.el8.x86_64 openssl-libs-1.1.1c-2.el8_1.1.x86_64 rpm-libs-4.14.2-26.el8_1.x86_64 sssd-client-2.2.0-19.el8_1.1.x86_64
(gdb) bt
#0 __rb_erase_augmented (augment=<optimized out>, leftmost=0x55a419686308, root=0x55a419686300, node=0x55a419713258) at rbtree_augmented.h:205
#1 rb_erase_cached (node=node@entry=0x55a419713258, root=root@entry=0x55a419686300) at rbtree.c:479
#2 0x000055a418ca3c95 in thread_move_ready (type=16, thread=0x55a419713210, root=0x55a419686300, m=0x55a419686300) at scheduler.c:1718
#3 thread_fetch_next_queue (m=0x55a419686300) at scheduler.c:1718
#4 process_threads (m=0x55a419686300) at scheduler.c:1790
#5 0x000055a418ca42a5 in launch_thread_scheduler (m=<optimized out>) at scheduler.c:1942
#6 0x000055a418c6ebeb in start_vrrp_child () at vrrp_daemon.c:1047
#7 start_vrrp_child () at vrrp_daemon.c:917
#8 0x000055a418c6ec36 in vrrp_respawn_thread (thread=<optimized out>) at vrrp_daemon.c:859
#9 0x000055a418ca3996 in thread_call (thread=0x55a41968d1c0) at scheduler.c:1834
#10 process_threads (m=0x55a419686f20) at scheduler.c:1834
#11 0x000055a418ca42a5 in launch_thread_scheduler (m=<optimized out>) at scheduler.c:1942
#12 0x000055a418c4b7c4 in keepalived_main (argc=2, argv=<optimized out>) at main.c:2220
#13 0x00007efe2b65c873 in __libc_start_main (main=0x55a418c498b0 <main>, argc=2, argv=0x7fff19ca7d58, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fff19ca7d48) at ../csu/libc-start.c:308
#14 0x000055a418c498ee in _start ()
Is there anything else I can provide to help fix this?
Thanks in advance.
@araujorm Do you do a reload of the keepalived configuration before the segfault? (@dpajin indicates that he has done a reload.)
The stack backtrace, and the symptoms described, look very much like issue #1561. Can you please try using keepalived at commit 1344729 and see if that resolves the problem?
No, it happens when I start keepalived with the `enable_snmp_vrrp` option. In version 2.0.10 it happened every time as long as that option was active; now it only happens if keepalived is master (and it goes on in a loop, crashing and restarting, and all hell breaks loose since the notify scripts are launched multiple times in parallel).
Will try the commit you mentioned tomorrow.
> @dpajin You have snmp enabled, if you don't need it just remove the lines enable_snmp_vrrp and enable_snmp_checker.
@araujorm, good catch, my bad! Thanks!
Hello.
Updated to commit 134472979b602128302112c69a5be0be98c36f58 but the issue still persists, exactly as before; according to gdb, it crashes in the exact same spot:
#0 __rb_erase_augmented (augment=<optimized out>, leftmost=0x55cbcda172d8, root=0x55cbcda172d0, node=0x55cbcdaa4968) at rbtree_augmented.h:205
205 tmp = child->rb_left;
(gdb) print child
$1 = (struct rb_node *) 0x322e38392e353831
(gdb) bt
#0 __rb_erase_augmented (augment=<optimized out>, leftmost=0x55cbcda172d8, root=0x55cbcda172d0, node=0x55cbcdaa4968) at rbtree_augmented.h:205
#1 rb_erase_cached (node=node@entry=0x55cbcdaa4968, root=root@entry=0x55cbcda172d0) at rbtree.c:479
#2 0x000055cbcc8afa14 in thread_move_ready (type=16, thread=0x55cbcdaa4920, root=0x55cbcda172d0, m=0x55cbcda172d0) at scheduler.c:1762
#3 thread_fetch_next_queue (m=0x55cbcda172d0) at scheduler.c:1762
#4 process_threads (m=0x55cbcda172d0) at scheduler.c:1834
#5 0x000055cbcc8b0035 in launch_thread_scheduler (m=<optimized out>) at scheduler.c:1989
#6 0x000055cbcc879c91 in start_vrrp_child () at vrrp_daemon.c:1120
#7 start_vrrp_child () at vrrp_daemon.c:990
#8 0x000055cbcc850c92 in start_keepalived (thread=<optimized out>) at main.c:530
#9 0x000055cbcc8af6ee in thread_call (thread=0x55cbcda1ce50) at scheduler.c:1882
#10 process_threads (m=0x55cbcda17ed0) at scheduler.c:1882
#11 0x000055cbcc8b0035 in launch_thread_scheduler (m=<optimized out>) at scheduler.c:1989
#12 0x000055cbcc85304f in keepalived_main (argc=2, argv=<optimized out>) at main.c:2392
#13 0x00007f3e59911873 in __libc_start_main (main=0x55cbcc850b10 <main>, argc=2, argv=0x7ffce2a36568, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffce2a36558) at ../csu/libc-start.c:308
#14 0x000055cbcc850b4e in _start ()
It happens every time I use `enable_snmp_vrrp`.
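One detail in the gdb output above may be worth a closer look: the `child` pointer value `0x322e38392e353831` consists entirely of printable ASCII bytes. The sketch below (the helper name `decode_le_ascii` is an assumption) decodes such a value as it would be laid out in memory on a little-endian machine such as x86_64.

```shell
# Decode a hexadecimal value as little-endian bytes and print them as text.
decode_le_ascii() {
    hex=${1#0x}
    # Pad to an even number of hex digits so the loop always consumes two.
    if [ $(( ${#hex} % 2 )) -eq 1 ]; then
        hex=0$hex
    fi
    out=
    while [ -n "$hex" ]; do
        byte=${hex#"${hex%??}"}   # last two hex digits = lowest remaining byte
        hex=${hex%??}
        out="$out$(printf "\\$(printf '%03o' "0x$byte")")"
    done
    printf '%s\n' "$out"
}

decode_le_ascii 0x322e38392e353831   # prints: 185.98.2
```

The result reads like a fragment of text rather than a valid address, which would be consistent with the tree node's memory having been overwritten or reused by string data before `rb_erase_cached` ran. This is only an interpretation of the dump, not a confirmed diagnosis.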
I think I know when this is happening: we have a notify script that restarts some services when the state changes. One of those is the SNMP service (to ensure Cacti and the like don't start misbehaving when the OpenVPN tunnels are also restarted). This used to work fine in older versions of keepalived such as 1.3.5, where keepalived recovered the AgentX master connection without problems; in recent versions, however, the keepalived vrrp process just segfaults.
Excerpt from the log when the crash occurs (the notify script sends its output to syslog with the tag `keepalived-change`):
May 16 19:51:12 machine Keepalived_vrrp[6107]: Sending gratuitous ARP on <secure_cut_iface> for <secure_cut_ip>
May 16 19:51:12 machine keepalived-change[6416]: Redirecting to /bin/systemctl restart snmpd.service
May 16 19:51:12 machine systemd[1]: Stopping Simple Network Management Protocol (SNMP) Daemon....
May 16 19:51:12 machine snmpd[4393]: Received TERM or STOP signal... shutting down...
May 16 19:51:12 machine Keepalived_vrrp[6107]: AgentX master disconnected us, reconnecting in 15
May 16 19:51:12 machine Keepalived_vrrp[6107]: scheduler: There is already read event 0x55cbcdaa4380 (read 0x55cbcdaa4280) registered on fd [16]
May 16 19:51:12 machine kernel: traps: keepalived[6107] general protection fault ip:55cbcc8b63e1 sp:7ffce2a358d8 error:0 in keepalived[55cbcc846000+9b000]
May 16 19:51:12 machine systemd[1]: Stopped Simple Network Management Protocol (SNMP) Daemon..
May 16 19:51:12 machine systemd[1]: Starting Simple Network Management Protocol (SNMP) Daemon....
May 16 19:51:12 machine systemd[1]: Started Process Core Dump (PID 6427/UID 0).
May 16 19:51:12 machine snmpd[6430]: Turning on AgentX master support.
May 16 19:51:12 machine snmpd[6430]: Turning on AgentX master support.
May 16 19:51:12 machine snmpd[6430]: NET-SNMP version 5.8
May 16 19:51:12 machine systemd[1]: Started Simple Network Management Protocol (SNMP) Daemon..
(...)
May 16 19:51:13 machine Keepalived[6106]: pid 6107 exited due to segmentation fault (SIGSEGV).
May 16 19:51:13 machine Keepalived[6106]: Please report a bug at https://github.com/acassen/keepalived/issues
May 16 19:51:13 machine Keepalived[6106]: and include this log from when keepalived started, a description
May 16 19:51:13 machine Keepalived[6106]: of what happened before the crash, your configuration file and the details below.
May 16 19:51:13 machine Keepalived[6106]: Also provide the output of keepalived -v, what Linux distro and version
May 16 19:51:13 machine Keepalived[6106]: you are running on, and whether keepalived is being run in a container or VM.
May 16 19:51:13 machine Keepalived[6106]: A failure to provide all this information may mean the crash cannot be investigated.
May 16 19:51:13 machine Keepalived[6106]: If you are able to provide a stack backtrace with gdb that would really help.
May 16 19:51:13 machine Keepalived[6106]: Source version 2.0.20
May 16 19:51:13 machine Keepalived[6106]: Built with kernel headers for Linux 4.18.0
May 16 19:51:13 machine Keepalived[6106]: Running on Linux 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020
May 16 19:51:13 machine Keepalived[6106]: Command line: '/usr/sbin/keepalived' '-D'
May 16 19:51:13 machine Keepalived[6106]: configure options: --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix=
May 16 19:51:13 machine Keepalived[6106]: --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin
May 16 19:51:13 machine Keepalived[6106]: --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share
May 16 19:51:13 machine Keepalived[6106]: --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec
May 16 19:51:13 machine Keepalived[6106]: --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man
May 16 19:51:13 machine Keepalived[6106]: --infodir=/usr/share/info --enable-snmp --enable-snmp-rfc --enable-sha1
May 16 19:51:13 machine Keepalived[6106]: --with-init=systemd build_alias=x86_64-redhat-linux-gnu
May 16 19:51:13 machine Keepalived[6106]: host_alias=x86_64-redhat-linux-gnu
May 16 19:51:13 machine Keepalived[6106]: PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig CFLAGS=-O2 -g -pipe
May 16 19:51:13 machine Keepalived[6106]: -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
May 16 19:51:13 machine Keepalived[6106]: -fexceptions -fstack-protector-strong -grecord-gcc-switches
May 16 19:51:13 machine Keepalived[6106]: -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
May 16 19:51:13 machine Keepalived[6106]: -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
May 16 19:51:13 machine Keepalived[6106]: -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection
May 16 19:51:13 machine Keepalived[6106]: LDFLAGS=-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
May 16 19:51:13 machine Keepalived[6106]: Config options: LIBIPSET_DYNAMIC LVS VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING SNMP_V3_FOR_V2
May 16 19:51:13 machine Keepalived[6106]: SNMP_VRRP SNMP_CHECKER SNMP_RFCV2 SNMP_RFCV3
May 16 19:51:13 machine Keepalived[6106]: System options: PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV4_DEVCONF IPV6_ADVANCED_API
May 16 19:51:13 machine Keepalived[6106]: LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF FRA_SUPPRESS_PREFIXLEN
May 16 19:51:13 machine Keepalived[6106]: FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTEXT_FILTER_SKIP_STATS
May 16 19:51:13 machine Keepalived[6106]: FRA_L3MDEV FRA_UID_RANGE RTAX_FASTOPEN_NO_COOKIE RTA_VIA FRA_OIFNAME FRA_PROTOCOL
May 16 19:51:13 machine Keepalived[6106]: FRA_IP_PROTO FRA_SPORT_RANGE FRA_DPORT_RANGE RTA_TTL_PROPAGATE IFA_FLAGS
May 16 19:51:13 machine Keepalived[6106]: IP_MULTICAST_ALL LWTUNNEL_ENCAP_MPLS LWTUNNEL_ENCAP_ILA IPTABLES
May 16 19:51:13 machine Keepalived[6106]: NET_LINUX_IF_H_COLLISION LIBIPVS_NETLINK IPVS_DEST_ATTR_ADDR_FAMILY
May 16 19:51:13 machine Keepalived[6106]: IPVS_SYNCD_ATTRIBUTES IPVS_64BIT_STATS VRRP_VMAC VRRP_IPVLAN IFLA_LINK_NETNSID CN_PROC
May 16 19:51:13 machine Keepalived[6106]: SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE VRF SO_MARK
May 16 19:51:13 machine Keepalived[6106]: SCHED_RESET_ON_FORK
May 16 19:51:13 machine Keepalived[6106]: VRRP child process(6107) died: Respawning
So note the part with `There is already read event` followed by the segfault; I think that's where the problem lies.
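To find every occurrence of that scheduler warning in a saved log, and which file descriptor it concerns, a small filter along these lines could help. The function name `duplicate_read_fds` and the log path are assumptions; the pattern is taken from the message text quoted above.

```shell
# Print the distinct file descriptor numbers for which the scheduler
# reported an already-registered read event.
duplicate_read_fds() {
    sed -n 's/.*There is already read event .* registered on fd \[\([0-9][0-9]*\)].*/\1/p' "$1" | sort -nu
}

# Example (the log path is an assumption):
#   duplicate_read_fds /tmp/keepalived.log
```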
Hello.
I've created a simple way to always reproduce this issue, with the following config files.
* `/etc/keepalived/keepalived.conf`:
global_defs {
    router_id SNMPCRASH
    script_user root
    enable_snmp_vrrp
}

vrrp_instance vrrp1 {
    state BACKUP
    interface enp1s0 # <-- CHANGE HERE TO MEET YOUR INTERFACE
    virtual_router_id 20
    advert_int 1
    authentication {
        auth_type AH
        auth_pass somthing
    }
    notify /etc/keepalived/keepalived-change.sh
}
* `/etc/keepalived/keepalived-change.sh`:
systemctl restart snmpd
exit 0
(don't forget to `chmod +x /etc/keepalived/keepalived-change.sh`)
Next, ensure you have snmpd installed (the net-snmp package on RHEL-based OSes), add `master agentx` to `/etc/snmp/snmpd.conf` (or the equivalent on your distro), and start snmpd (e.g. `systemctl start snmpd`). Then start keepalived in the foreground with:
keepalived -D -n -l
Results of the crash loop, as posted before, should be almost instantaneous.
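Before running the reproduction, it can help to confirm the snmpd prerequisite and, afterwards, to count how often the VRRP child was respawned. The helpers below are a sketch; the names `snmpd_has_master_agentx` and `count_respawns` and the log path are assumptions, and the respawn pattern is copied from the log excerpt earlier in this thread.

```shell
# Exit 0 if the given snmpd configuration has an uncommented "master agentx" line.
snmpd_has_master_agentx() {
    grep -Eq '^[[:space:]]*master[[:space:]]+agentx' "$1"
}

# Print how many times the parent process reported a respawn of the VRRP child.
count_respawns() {
    grep -c 'VRRP child process([0-9]*) died: Respawning' "$1"
}

# Example (the log path is an assumption):
#   snmpd_has_master_agentx /etc/snmp/snmpd.conf || echo "add 'master agentx' first"
#   keepalived -D -n -l 2>&1 | tee /tmp/keepalived.log
#   count_respawns /tmp/keepalived.log
```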
Reproducible on all versions of keepalived (including master) at least since version 2.0.10. Confirmed on fresh CentOS 8 installation, fully updated, with keepalived that comes with epel, also one built from commit 134472979b602128302112c69a5be0be98c36f58 and also one built from master HEAD (currently ab568a70c5d36c8cfe7b23b24a1891540ed479fa), all same result.
@araujorm Many thanks for the info. I will have a look at this in the next few days.
Commit 616ad32 resolves this issue.
Describe the bug: I am using SNMP with keepalived, VRRP and a notify FIFO script. When snmpd is restarted, keepalived_vrrp is restarted, but after the restart the VRRP notify FIFO script does not receive the usual "INSTANCE " messages; instead it receives some "junk", like
fifo received: NTNE"pnt"BCU 4
Here is the log:
Keepalived version (output of `keepalived -v`): 2.0.20