polym / issues


nf_conntrack_max setting does not take effect #2

Open polym opened 5 years ago

polym commented 5 years ago

Background

After upgrading the CentOS 7 kernel from 4.10.1 to 4.20, kubelet could no longer deploy Pods and reported this error:

Error while adding to cni network: failed to set bridge addr: could not add IP address to "kube-bridge": file exists

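I could not find the cause, so I rebooted the machine and noticed that nf_conntrack_max no longer matched the value in sysctl.conf (this has happened several times recently, so I was alert to it). Running sysctl -p by hand applied the value successfully. Something similar had happened in production before, and my first guess was that a conntrack-related module had not been loaded. So I compared the loaded modules: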

# lsmod | grep conntrack
nf_conntrack_ipv4      16384  7
nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
xt_conntrack           16384  1
nf_conntrack          135168  7 ip_vs,nf_conntrack_ipv4,ipt_MASQUERADE,nf_nat_masquerade_ipv4,xt_conntrack,nf_nat_ipv4,nf_nat
# lsmod | grep conntrack
nf_conntrack_netlink    40960  0
nfnetlink              16384  2 ip_set,nf_conntrack_netlink
nf_conntrack_ipv4      16384  7
nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
xt_conntrack           16384  1
nf_conntrack          135168  9 ip_vs,xt_nat,nf_conntrack_ipv4,ipt_MASQUERADE,nf_conntrack_netlink,nf_nat_masquerade_ipv4,xt_conntrack,nf_nat_ipv4,nf_nat
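
The mismatch described above (live value differs from sysctl.conf, or the key is absent) can be checked mechanically. The following is a minimal sketch, not the method used in this issue; the function name check_sysctl_key and the PROC_ROOT / CONF_FILE override variables are hypothetical and exist only so the check can be tried against sample files.

```shell
#!/bin/sh
# Sketch: compare a key's value in sysctl.conf with the live value under /proc/sys.
# PROC_ROOT and CONF_FILE are hypothetical overrides for trying this on sample files.
PROC_ROOT="${PROC_ROOT:-/proc/sys}"
CONF_FILE="${CONF_FILE:-/etc/sysctl.conf}"

check_sysctl_key() {
    key="$1"
    # net.netfilter.nf_conntrack_max -> $PROC_ROOT/net/netfilter/nf_conntrack_max
    path="$PROC_ROOT/$(printf '%s' "$key" | tr . /)"
    # Escape dots so the key is matched literally by sed
    pat=$(printf '%s' "$key" | sed 's/\./\\./g')
    # The last uncommented assignment wins; strip all whitespace
    want=$(sed -n "s/^[[:space:]]*$pat[[:space:]]*=[[:space:]]*//p" "$CONF_FILE" \
        | tail -n 1 | tr -d '[:space:]')
    if [ -z "$want" ]; then
        echo "$key: not configured in $CONF_FILE"
        return 3
    fi
    if [ ! -e "$path" ]; then
        # Same situation as "cannot stat ...: No such file or directory"
        echo "$key: missing (module not loaded?)"
        return 2
    fi
    have=$(tr -d '[:space:]' < "$path")
    if [ "$have" = "$want" ]; then
        echo "$key: ok ($have)"
        return 0
    fi
    echo "$key: mismatch (configured=$want, live=$have)"
    return 1
}
```

Calling check_sysctl_key net.netfilter.nf_conntrack_max right after boot distinguishes "key missing" (return code 2) from "key present but different" (return code 1), which are two different failure modes.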
polym commented 5 years ago

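Investigation 1

As noted above, running sysctl -p by hand does set nf_conntrack_max. I therefore suspected that sysctl was applied at boot but then overwritten when some module started later. As a last resort, I forced a sysctl -p in /etc/rc.local and rebooted.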

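After the reboot the value was still not applied, so I changed the line to sysctl -p > /tmp/sysctl.log 2>&1 and rebooted again. The log confirmed that something was wrong: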

/tmp/sysctl.log

sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_buckets: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_fin_wait: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_time_wait: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_udp_timeout: No such file or directory
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream: No such file or directory

/var/log/messages gives more detail:

messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_buckets: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_fin_wait: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_time_wait: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_udp_timeout: No such file or directory
messages:Feb 19 15:11:44 K8S-ZJ-FUD-59 rc.local: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream: No such file or directory
messages:Feb 19 15:11:45 K8S-ZJ-FUD-59 kernel: nf_conntrack version 0.5.0 (65536 buckets, 262144 max)

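This shows that nf_conntrack had still not been loaded when /etc/rc.local ran: the sysctl errors are logged at 15:11:44, and the kernel reports the module only at 15:11:45.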
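
The ordering read off the log by eye can also be checked with a small script. This is only a sketch, assuming the log file covers a single boot and uses the message formats shown above; the function name sysctl_failed_before_load is made up for illustration.

```shell
#!/bin/sh
# Sketch: did "sysctl: cannot stat ...nf_conntrack..." appear in the log
# before the kernel announced the nf_conntrack module?
# Assumes the log covers one boot and matches the formats quoted in this issue.
sysctl_failed_before_load() {
    log="$1"
    # grep -n prefixes each match with "LINE:", so cut keeps the line number
    fail_line=$(grep -n 'sysctl: cannot stat .*nf_conntrack' "$log" \
        | head -n 1 | cut -d: -f1)
    load_line=$(grep -n 'kernel: .*nf_conntrack version' "$log" \
        | head -n 1 | cut -d: -f1)
    if [ -z "$fail_line" ] || [ -z "$load_line" ]; then
        echo "inconclusive: one of the two events is missing from $log"
        return 2
    fi
    if [ "$fail_line" -lt "$load_line" ]; then
        echo "sysctl failed at line $fail_line, module loaded at line $load_line"
        return 0
    fi
    echo "module loaded at line $load_line, before the sysctl failure at line $fail_line"
    return 1
}
```

Running sysctl_failed_before_load /var/log/messages returns 0 when the failure precedes the module load, which is the situation seen here.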

polym commented 5 years ago

Further investigation

/var/log/messages on the affected machine:

# grep -3 conntrack /var/log/messages
Feb 19 17:21:41 K8S-ZJ-FUD-59 systemd: kdump.service failed.
Feb 19 17:21:41 K8S-ZJ-FUD-59 dockerd: time="2019-02-19T17:21:41.717474971+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Feb 19 17:21:41 K8S-ZJ-FUD-59 dockerd: time="2019-02-19T17:21:41.718420537+08:00" level=info msg="Loading containers: start."
Feb 19 17:21:41 K8S-ZJ-FUD-59 kernel: nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
Feb 19 17:21:41 K8S-ZJ-FUD-59 dockerd: time="2019-02-19T17:21:41.752767154+08:00" level=info msg="Firewalld running: false"
Feb 19 17:21:41 K8S-ZJ-FUD-59 kernel: IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Feb 19 17:21:41 K8S-ZJ-FUD-59 dockerd: time="2019-02-19T17:21:41.859042685+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"

/var/log/messages on the healthy machine:

# grep -3 conntrack /var/log/messages-*
/var/log/messages-20190215-Feb 14 16:37:32 DCK-ZJ-FUD-134 kernel: [    8.771257] XFS (sde1): nobarrier option is deprecated, ignoring.
/var/log/messages-20190215-Feb 14 16:37:32 DCK-ZJ-FUD-134 kernel: [    8.772489] XFS (sde1): Mounting V4 Filesystem
/var/log/messages-20190215-Feb 14 16:37:32 DCK-ZJ-FUD-134 kernel: [    8.793512] XFS (sde1): Ending clean mount
/var/log/messages-20190215:Feb 14 16:37:33 DCK-ZJ-FUD-134 kernel: [    9.925572] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
/var/log/messages-20190215-Feb 14 16:37:33 DCK-ZJ-FUD-134 kernel: [   10.028952] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
/var/log/messages-20190215-Feb 14 16:37:33 DCK-ZJ-FUD-134 kernel: [   10.141497] overlayfs: upper fs needs to support d_type.
/var/log/messages-20190215-Feb 14 16:37:35 DCK-ZJ-FUD-134 kernel: [   11.605013] overlayfs: upper fs needs to support d_type.

Conclusion

On the healthy machine the nf_conntrack module is loaded automatically at boot, whereas on the affected machine it is loaded when dockerd calls into the kernel.

polym commented 5 years ago

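A different approach

I disabled dockerd/kubelet on the healthy machine and rebooted. The conntrack modules were not loaded there either. After re-enabling the services and rebooting, the same problem appeared: the nf_conntrack parameters failed to apply.

So the earlier conclusion was wrong. On both machines the module is actually loaded by dockerd.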
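
The check after each reboot comes down to asking whether a module is currently loaded. A minimal sketch of that check, reading /proc/modules directly, is below. The function name module_loaded and the MODULES_FILE override are hypothetical, and the unit names in the comment are assumptions about how the services are named on a given host.

```shell
#!/bin/sh
# Sketch: test whether a kernel module is currently loaded.
# MODULES_FILE is a hypothetical override so the check can run on a sample file.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

module_loaded() {
    # The first field of each /proc/modules line is the module name
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "$MODULES_FILE"
}

# Possible use after a reboot with the services disabled, e.g. after
# "systemctl disable docker kubelet" (unit names are an assumption):
#   module_loaded nf_conntrack || echo "nf_conntrack is not loaded"
```

Matching on the exact first field avoids the false positives that a plain grep conntrack would give for modules such as xt_conntrack.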

polym commented 5 years ago

Solution

Add /etc/modules-load.d/nf_conntrack.conf with the following content:

nf_conntrack
nf_conntrack_ipv4

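Once this is in place, systemd-modules-load.service, a dependency of systemd-sysctl.service, loads the modules listed in this file automatically.

The exact dependency relationship can be seen in the following two configuration files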
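
For reference, the fix can be scripted roughly as follows. This is a sketch under assumptions: the function names and the CONF_PATH override are hypothetical, writing under /etc requires root, and the systemctl call only displays the ordering without changing it.

```shell
#!/bin/sh
# Sketch: write the module list and inspect the unit ordering.
# CONF_PATH is a hypothetical override for a dry run outside /etc.
CONF_PATH="${CONF_PATH:-/etc/modules-load.d/nf_conntrack.conf}"

write_modules_conf() {
    # One module name per line, as read by systemd-modules-load.service
    printf '%s\n' nf_conntrack nf_conntrack_ipv4 > "$CONF_PATH"
}

show_sysctl_ordering() {
    # Prints the units that systemd-sysctl.service is ordered after
    systemctl show -p After systemd-sysctl.service
}
```

After running write_modules_conf as root and rebooting, the nf_conntrack keys under /proc/sys/net/netfilter should exist before the sysctl settings are applied, which is what the earlier "cannot stat" errors were missing.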