acassen / keepalived

Keepalived
https://www.keepalived.org
GNU General Public License v2.0

can't work in LXC - "Unable to load xt_set module" #1852

Closed rnz closed 3 years ago

rnz commented 3 years ago

Describe the bug
keepalived does not work inside an LXC container:

Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Unable to load xt_set module

To Reproduce
1. Create an LXC container (Debian 9 on a Debian 9 host).
2. Install keepalived inside the LXC container.
3. Check the logs.

Expected behavior
keepalived starts and works.

Keepalived version

# keepalived -v
Keepalived v1.3.2 (12/03,2016)

Copyright(C) 2001-2016 Alexandre Cassen, <acassen@gmail.com>

Build options:  PIPE2 LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF RTA_VIA FRA_OIFNAME FRA_SUPPRESS_PREFIXLEN FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK IPV4_DEVCONF LIBIPTC LIBIPSET LVS LIBIPVS_NETLINK IPVS_DEST_ATTR_ADDR_FAMILY IPVS_SYNCD_ATTRIBUTES IPVS_64BIT_STATS VRRP VRRP_AUTH VRRP_VMAC SOCK_NONBLOCK SOCK_CLOEXEC FIB_ROUTING INET6_ADDR_GEN_MODE SNMP_V3_FOR_V2 SNMP SNMP_KEEPALIVED SNMP_CHECKER SNMP_RFC SNMP_RFCV2 SNMP_RFCV3 SO_MARK

# apt-cache policy keepalived 
keepalived:
  Installed: 1:1.3.2-1
  Candidate: 1:1.3.2-1
  Version table:
 *** 1:1.3.2-1 500
        500 http://ftp.debian.org/debian stretch/main amd64 Packages
        100 /var/lib/dpkg/status

Distro (please complete the following information):

Details of any containerisation or hosted service (e.g. AWS): Proxmox 5.4

# pveversion 
pve-manager/5.4-15/d0ec33c6 (running kernel: 4.15.18-30-pve)

# cat /etc/pve/lxc/204.conf
arch: amd64
cores: 64
hostname: rhost1
memory: 8192
net0: name=eth2,bridge=vmbr1,hwaddr=<skip>,ip=<skip>,type=veth
net1: name=eth1,bridge=vmbr2,firewall=1,gw=<skip>,hwaddr=<skip>,ip=<skip>,type=veth
net2: name=eth0,bridge=vmbr3,hwaddr=<skip>,ip=<skip>,type=veth
onboot: 1
ostype: debian
rootfs: zs5-zfs-pool1:subvol-204-disk-1,size=20G
swap: 0

# cat /var/lib/lxc/204/config
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/debian.common.conf
lxc.apparmor.profile = generated
lxc.apparmor.raw = deny mount -> /proc/,
lxc.apparmor.raw = deny mount -> /sys/,
lxc.monitor.unshare = 1
lxc.tty.max = 2
lxc.environment = TERM=linux
lxc.uts.name = rhost1
lxc.cgroup.memory.limit_in_bytes = 8589934592
lxc.cgroup.memory.memsw.limit_in_bytes = 8589934592
lxc.cgroup.cpu.shares = 1024
lxc.rootfs.path = /var/lib/lxc/204/rootfs
lxc.net.0.type = veth
lxc.net.0.veth.pair = veth204i0
lxc.net.0.hwaddr = <skip>
lxc.net.0.name = eth2
lxc.net.1.type = veth
lxc.net.1.veth.pair = veth204i1
lxc.net.1.hwaddr = <skip>
lxc.net.1.name = eth1
lxc.net.2.type = veth
lxc.net.2.veth.pair = veth204i2
lxc.net.2.hwaddr = <skip>
lxc.net.2.name = eth0
lxc.cgroup.cpuset.cpus = 3,17,19,22,24,28-31,33,41,44-45,47,54,58-59,61-62,64,69,77-78,80,84-85,96-97,101,111-112,115,119,125,127-128,131-132,135,139,145-146,151,153,165,168,173-174,180,184,200,203,207,213,217-218,224,229-230,232,235,238,241,249

Configuration file:

global_defs {
  lvs_id LVS_1
! UNIQUE:
   router_id rhost1
}

vrrp_script haproxy {
  #script "/usr/bin/pkill -0 haproxy" 
  script "/usr/bin/pkill -0 nginx" 
  interval 2
  weight 2
}

vrrp_instance ext_v1 {
  virtual_router_id 1
  advert_int 2
  priority 100
  state MASTER
  interface eth1
  authentication {
        auth_type PASS
        auth_pass 8282
  }
  virtual_ipaddress {
       <skip> dev eth1
  }
  virtual_routes {
        0.0.0.0/0 via <skip>
  }
}

vrrp_instance ext_v2 {
  virtual_router_id 2
  advert_int 2
  priority 99
  state BACKUP
  interface eth1
  authentication {
        auth_type PASS
        auth_pass 8282  
  }
  virtual_ipaddress {
        <skip> dev eth1
  }
}

vrrp_instance ext_v3 {
  virtual_router_id 3
  advert_int 2
  priority 98
  state BACKUP
  interface eth1
  authentication {
        auth_type PASS
        auth_pass 8282  
  }
  virtual_ipaddress {
        <skip> dev eth1
  }
}

include /etc/keepalived/conf.t/*.conf

System Log entries

Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Trying to enqueue job keepalived.service/start/replace
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Installed new job keepalived.service/start as 786
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Enqueued job keepalived.service/start as 786
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: ConditionFileNotEmpty=/etc/keepalived/keepalived.conf succeeded.
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Failed to reset devices.list: Operation not permitted
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Passing 0 fds to service
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: About to execute: /usr/sbin/keepalived $DAEMON_ARGS
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Forked /usr/sbin/keepalived as 3034
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Changed dead -> start
Feb 17 10:11:38 rhost1 systemd[1]: Starting Keepalive Daemon (LVS and VRRP)...
Feb 17 10:11:38 rhost1 systemd[3034]: keepalived.service: Executing: /usr/sbin/keepalived
Feb 17 10:11:38 rhost1 Keepalived[3034]: Starting Keepalived v1.3.2 (12/03,2016)
Feb 17 10:11:38 rhost1 Keepalived[3034]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Feb 17 10:11:38 rhost1 Keepalived[3034]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Child 3034 belongs to keepalived.service
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Control process exited, code=exited status=0
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Got final SIGCHLD for state start.
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Main PID guessed: 3035
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Changed start -> running
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Job keepalived.service/start finished, result=done
Feb 17 10:11:38 rhost1 systemd[1]: Started Keepalive Daemon (LVS and VRRP).
Feb 17 10:11:38 rhost1 Keepalived[3035]: Starting Healthcheck child process, pid=3036
Feb 17 10:11:38 rhost1 systemd[1]: keepalived.service: Failed to send unit change signal for keepalived.service: Connection reset by peer
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Initializing ipvs
Feb 17 10:11:38 rhost1 Keepalived[3035]: Starting VRRP child process, pid=3037
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Registering Kernel netlink reflector
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Registering Kernel netlink reflector
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Registering Kernel netlink command channel
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Registering Kernel netlink command channel
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Registering gratuitous ARP shared channel
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Opening file '/etc/keepalived/conf.t/7073.conf'.
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Opening file '/etc/keepalived/conf.t/7073.conf'.
Feb 17 10:11:38 rhost1 Keepalived_healthcheckers[3036]: Using LinkWatch kernel netlink reflector...
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Unable to load xt_set module
Feb 17 10:11:38 rhost1 Keepalived_vrrp[3037]: Using LinkWatch kernel netlink reflector...
pqarmitage commented 3 years ago

Due to the nature of containers, and their designed separation from the host system, it is simply not possible to load a kernel module from within a container. The solution is for you to load the kernel module(s) required by keepalived prior to starting keepalived in the container. Probably the simplest way to do this if you want it done permanently is to add a file such as keepalived.conf in /etc/modprobe.d and specify the required modules.
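As a sketch of that approach, the host-side module list could live in a file like the following (the file name is arbitrary; on systemd-based hosts `/etc/modules-load.d/*.conf` is read at boot by systemd-modules-load, which is the boot-time counterpart of the `/etc/modprobe.d` suggestion above):

```text
# /etc/modules-load.d/keepalived.conf  (sketch; file name is an assumption)
# Modules keepalived would otherwise try to load from inside the container:
xt_set
ip_vs
```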

The modules that keepalived currently attempts to load if they are not currently loaded are xt_set and ip_vs (although I suspect it should also load the ip_tables module too if it is not loaded, and possibly the nf_tables module - I will need to check these).
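A quick way to check on the host which of these modules are actually loaded is to read `/proc/modules` directly (a sketch; `ip_tables` and `nf_tables` are included only speculatively, per the comment above):

```shell
# Run on the HOST: report which of the modules keepalived may need are loaded.
# (ip_tables and nf_tables are speculative additions, per the comment above.)
for m in xt_set ip_vs ip_tables nf_tables; do
  if grep -qw "^$m" /proc/modules 2>/dev/null; then
    echo "$m: loaded"
  else
    echo "$m: NOT loaded"
  fi
done
```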

BTW the version of keepalived you are using is extremely old, and there have been thousands of improvements since v1.3.2. The current version is v2.2.1.

rnz commented 3 years ago

@pqarmitage the modules are already preloaded on the host:

# lsmod | grep xt_set
xt_set                 16384  94
ip_set                 40960  2 xt_set,ip_set_hash_net
x_tables               40960  21 ebtables,ip6table_filter,xt_conntrack,iptable_filter,xt_multiport,xt_tcpudp,ipt_MASQUERADE,xt_addrtype,xt_CHECKSUM,xt_physdev,xt_nat,xt_ipvs,xt_comment,xt_set,ip6_tables,ipt_REJECT,ip_tables,ip6t_REJECT,iptable_mangle,xt_REDIRECT,xt_mark

On all hosts in proxmox cluster:

# egrep -v '^#|^$' /etc/modules
bonding
overlay
ip_vs
ip_vs_dh
ip_vs_ftp
ip_vs_lblc
ip_vs_lblcr
ip_vs_lc
ip_vs_nq
ip_vs_rr
ip_vs_sed
ip_vs_sh
ip_vs_wlc
ip_vs_wrr
xfrm_user
nf_nat
br_netfilter
xt_conntrack
xt_set

The same problem is present in Docker and Kubernetes containers.

The problem is not solved and needs more attention.

pqarmitage commented 3 years ago

I hadn't remembered, since v1.3.2 is so old, but I have changed the code around loading the xt_set module more than once since v1.3.2. So far as I can see, you need keepalived v2.0.20 or later; since you will presumably have to build the code yourself, you are best off using the latest version, v2.2.1.
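For reference, a source build on Debian might look roughly like this (a sketch only: the tarball URL follows the keepalived.org download layout, and the dependency list and configure flags are assumptions that vary with the build options you want):

```text
# apt-get install -y build-essential libssl-dev libnl-3-dev libnl-genl-3-dev
# wget https://www.keepalived.org/software/keepalived-2.2.1.tar.gz
# tar xf keepalived-2.2.1.tar.gz && cd keepalived-2.2.1
# ./configure --prefix=/usr
# make && make install
```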