acassen / keepalived

Keepalived
https://www.keepalived.org
GNU General Public License v2.0

A compiled and installed keepalived can be started from the command line but cannot be managed by systemctl. #2463

Closed liyingxiao94 closed 3 months ago

liyingxiao94 commented 3 months ago

Describe the bug A clear and concise description of what the bug is.

To Reproduce Any steps necessary to reproduce the behaviour:

Expected behavior A clear and concise description of what you expected to happen. systemctl can manage keepalived normally

Keepalived version

Output of `keepalived -v`

Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+

Distro (please complete the following information):

Details of any containerisation or hosted service (e.g. AWS) If keepalived is being run in a container or on a hosted service, provide full details

Configuration file:

A full copy of the configuration file, obfuscated if necessary to protect passwords and IP addresses

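cat >> /usr/local/keepalived/etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id tidb2                  # virtual router name; can be replaced with the local IP
    script_user root
    enable_script_security
    max_auto_priority 1
}

# HAProxy health check configuration
vrrp_script chk_haproxy {
    script "/etc/keepalived/chk_haproxy.sh"    # uses killall -0 to check whether the haproxy instance exists; faster than the ps command
    interval 5                       # script run interval, in seconds
}

# Virtual router configuration
vrrp_instance VI_cubc_his {
    state BACKUP                     # state of this instance, MASTER/BACKUP; write BACKUP in the backup node's configuration file
    interface bond0                  # name of the local network interface; check with ifconfig
    virtual_router_id 218            # virtual router ID; must be the same on master and backup
    priority 100                     # initial priority of this node; on the backup use a value lower than the master's (e.g. 99)
    nopreempt
    advert_int 1                     # interval for contending for the virtual address, in seconds
    authentication {
        auth_type PASS
        auth_pass 4gAtFde9           # authentication type and password must match on master and backup, otherwise they cannot authenticate each other
    }
    virtual_ipaddress {
        xx.xxx.xxx.xxx               # virtual IP address; must be the same on master and backup
    }
    track_script {
        chk_haproxy                  # the corresponding health check configuration
    }
}
EOF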

Notify and track scripts

If any notify or track scripts are in use, please provide copies of them

System Log entries

Full keepalived system log entries from when keepalived started

● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Fri 2024-08-16 15:49:45 CST; 2s ago
     Docs: man:keepalived(8)
           man:keepalived.conf(5)
           man:genhash(1)
           https://keepalived.org
  Process: 82750 ExecStart=/usr/local/keepalived/sbin/keepalived --dont-fork $KEEPALIVED_OPTIONS (code=killed, signal=KILL)
 Main PID: 82750 (code=killed, signal=KILL)

Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: Registering Kernel netlink reflector
Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: Registering Kernel netlink command channel
Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: Assigned address xx.xxx.xx.xx for interface bond0
Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: Registering gratuitous ARP shared channel
Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: (VI_cubc_his) removing VIPs.
Aug 16 15:49:42 tidb3 Keepalived[82750]: Startup complete
Aug 16 15:49:42 tidb3 systemd[1]: Started LVS and VRRP High Availability Monitor.
Aug 16 15:49:42 tidb3 Keepalived_vrrp[82751]: VRRP sockpool: [ifindex( 6), family(IPv4), proto(112), fd(14,15) multicast, address(224.0.0.18)]
Aug 16 15:49:45 tidb3 systemd[1]: keepalived.service: Main process exited, code=killed, status=9/KILL
Aug 16 15:49:45 tidb3 systemd[1]: keepalived.service: Failed with result 'signal'.

Did keepalived coredump?

If so, can you please provide a stacktrace from the coredump, using gdb.

Additional context Add any other context about the problem here.

liyingxiao94 commented 3 months ago

Starting it directly from the command line works:
/usr/local/keepalived/sbin/keepalived -f /etc/keepalived/keepalived.conf

Starting it through systemctl fails:
systemctl start keepalived.service

pqarmitage commented 3 months ago

You have not answered most of the questions above (in bold), or have answered them only partially, and without the relevant answers we cannot determine what is happening.

Please provide full answers to the questions (e.g. what distro are you running on, and the output of keepalived -v should be about 14 lines long rather than the single line you have provided). Please also provide a copy of your keepalived.service file.

liyingxiao94 commented 3 months ago

keepalived -v

Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+

Copyright(C) 2001-2023 Alexandre Cassen, acassen@gmail.com

Built with kernel headers for Linux 3.10.0
Running on Linux tidb1 4.19.90-23.48.v2101.ky10.x86_64 #1 SMP Tue Jun 4 20:02:42 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
Distro: CentOS Linux 7 (Core)

configure options: --prefix=/usr/local/keepalived

Config options: LVS VRRP VRRP_AUTH VRRP_VMAC OLD_CHKSUM_COMPAT INIT=systemd

System options: VSYSLOG LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTA_VIA IFA_FLAGS NET_LINUX_IF_H_COLLISION LIBIPTC_LINUX_NET_IF_H_COLLISION LIBIPVS_NETLINK IFLA_LINK_NETNSID GLOB_BRACE GLOB_ALTDIRFUNC INET6_ADDR_GEN_MODE SO_MARK

pqarmitage commented 3 months ago

We really cannot investigate this unless you provide the information requested.

liyingxiao94 commented 3 months ago

The important information has been uploaded. What information is still needed?

pqarmitage commented 3 months ago

On a Red Hat based system (e.g. CentOS) the standard keepalived service file (via /etc/sysconfig/keepalived) does not specify --dont-fork, so something would appear to have been modified on your system. Also, the mainstream CentOS 7 kernel is 3.10 (hence keepalived has been built against 3.10 header files), but you appear to be running a 4.19 kernel.

Your system would appear to be a somewhat modified CentOS 7 system, so it is going to be extremely hard for us to identify the cause of your problems.

liyingxiao94 commented 3 months ago

Yes, the operating system is Kylin v10. I tried to find some key information in the log but couldn't find any. Keepalived seems to have been killed by something. Is there any way to identify what killed it?

pqarmitage commented 3 months ago

Can you please post your /lib/systemd/system/keepalived.service file, and also /etc/sysconfig/keepalived? That might give us a clue regarding what is killing keepalived.

Since you are running with a 4.19 kernel, you really should have the matching kernel headers installed, rather than the CentOS 3.10 kernel headers. Your build of keepalived would then at least match the running kernel. This isn't the cause of keepalived being killed, but you would get a more functional keepalived.
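As a rough way to spot this kind of mismatch, the following shell sketch pulls the header version out of the `keepalived -v` text and compares its major.minor part with a kernel release string. The function names (`headers_version`, `major_minor`, `headers_match_kernel`) are hypothetical helpers, not part of keepalived, and comparing only major.minor is an assumed heuristic rather than an official rule.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from keepalived itself.

# Print the kernel-headers version found in `keepalived -v` text read from stdin.
headers_version() {
    sed -n 's/^Built with kernel headers for Linux \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Reduce a release such as 4.19.90-23.48.v2101.ky10.x86_64 to major.minor (4.19).
major_minor() {
    printf '%s\n' "$1" | sed 's/^\([0-9]*\.[0-9]*\).*/\1/'
}

# Succeed when the headers version ($1) and the running release ($2)
# share the same major.minor. This is a heuristic, not an exact check.
headers_match_kernel() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}
```

A possible use would be `headers_match_kernel "$(keepalived -v 2>&1 | headers_version)" "$(uname -r)" || echo "headers differ from running kernel"`. With the values in this thread (3.10.0 headers, 4.19.90 kernel) the check would fail.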

liyingxiao94 commented 3 months ago

There is no file /etc/sysconfig/keepalived, but /usr/local/keepalived/etc/sysconfig/keepalived exists.

cat /usr/local/keepalived/etc/sysconfig/keepalived

# Options for keepalived. See `keepalived --help' output and keepalived(8) and
# keepalived.conf(5) man pages for a list of all options. Here are the most
# common ones :
#
# --vrrp               -P    Only run with VRRP subsystem.
# --check              -C    Only run with Health-checker subsystem.
# --dont-release-vrrp  -V    Dont remove VRRP VIPs & VROUTEs on daemon stop.
# --dont-release-ipvs  -I    Dont remove IPVS topology on daemon stop.
# --dump-conf          -d    Dump the configuration data.
# --log-detail         -D    Detailed log messages.
# --log-facility       -S    0-7 Set local syslog facility (default=LOG_DAEMON)
#
KEEPALIVED_OPTIONS="-D"

cat /usr/lib/systemd/system/keepalived.service

[Unit]
Description=LVS and VRRP High Availability Monitor
After=network-online.target syslog.target 
Wants=network-online.target 
Documentation=man:keepalived(8)
Documentation=man:keepalived.conf(5)
Documentation=man:genhash(1)
Documentation=https://keepalived.org

[Service]
Type=forking
PIDFile=/run/keepalived.pid
KillMode=process
EnvironmentFile=-/usr/local/keepalived/etc/sysconfig/keepalived
ExecStart=/usr/local/keepalived/sbin/keepalived  $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Yes. The same keepalived version, compiled and installed in the same way, can be managed by systemctl on kernel 3.10.0-1160.53.1.el7.x86_64 but not on 4.19.90-23.48.v2101.ky10.x86_64.

pqarmitage commented 3 months ago

Although the keepalived.service file (and the associated sysconfig/keepalived file) don't specify --dont-fork, for some reason the log entries above show that keepalived is being run with the --dont-fork option. The keepalived.service file, however, specifies Type=forking, which conflicts with the --dont-fork option. This is why systemd is killing keepalived after it starts up.

You will need to sort out why your system is doing this.
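One way to look for the mismatch is to inspect the unit definition systemd has actually loaded, which `systemctl cat keepalived.service` prints together with any drop-in files. The sketch below scans such text for the combination described above. `unit_fork_conflict` is a hypothetical helper name, and it only inspects the literal ExecStart line, so it would not see a --dont-fork hidden inside $KEEPALIVED_OPTIONS.

```shell
#!/bin/sh
# Hypothetical helper, not a systemd or keepalived tool.
# Reads unit file text on stdin and reports whether Type=forking is
# combined with --dont-fork on the ExecStart line.
unit_fork_conflict() {
    unit_text=$(cat)
    # Take the last Type= and ExecStart= lines, since later settings override earlier ones.
    unit_type=$(printf '%s\n' "$unit_text" | sed -n 's/^Type=//p' | tail -n 1)
    exec_line=$(printf '%s\n' "$unit_text" | grep '^ExecStart=' | tail -n 1)
    case "$exec_line" in
        *--dont-fork*) dont_fork=yes ;;
        *)             dont_fork=no ;;
    esac
    if [ "$unit_type" = "forking" ] && [ "$dont_fork" = "yes" ]; then
        echo "conflict: Type=forking with --dont-fork"
        return 1
    fi
    echo "ok: Type=${unit_type:-unset} dont_fork=$dont_fork"
    return 0
}
```

It could be run as `systemctl cat keepalived.service | unit_fork_conflict`. If the file on disk was edited after systemd loaded it, `systemctl daemon-reload` makes systemd re-read it. Whether that applies to this system is an assumption, not something the thread confirms.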