aledbf / kube-keepalived-vip

Kubernetes Virtual IP address/es using keepalived
Apache License 2.0

Disable net/ipv4/vs/conn_reuse_mode to improve IPVS performance #109

Closed. panpan0000 closed this 4 years ago.

panpan0000 commented 5 years ago

This addresses a performance issue: when stressing the VIP:port, the response latency jumps to 1 second after a while, and the pattern repeats. The JMeter response-time diagram below shows the latency rising to 1 s and staying there for ~30 s at a time (stressing with ab shows the same behavior). [JMeter response-time diagram]

This is a known issue in the Kubernetes community; the resolution is to disable net/ipv4/vs/conn_reuse_mode, which is what kube-proxy already does in IPVS mode.

References:
https://github.com/kubernetes/kubernetes/issues/70747
https://github.com/cloudnativelabs/kube-router/issues/544
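For context, a minimal sketch (not the actual change in this PR) of how the sysctl could be disabled from Go by writing to procfs; the value 0 matches what kube-proxy sets in IPVS mode:

```go
// Sketch: disable net/ipv4/vs/conn_reuse_mode by writing to procfs.
package main

import (
	"fmt"
	"os"
)

const connReuseModePath = "/proc/sys/net/ipv4/vs/conn_reuse_mode"

func disableConnReuseMode() error {
	// Writing "0" disables connection reuse, so IPVS does not delay new
	// connections that reuse an existing conntrack entry.
	if err := os.WriteFile(connReuseModePath, []byte("0"), 0644); err != nil {
		return fmt.Errorf("failed to set %s: %w", connReuseModePath, err)
	}
	return nil
}

func main() {
	if err := disableConnReuseMode(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```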

CLAassistant commented 5 years ago

CLA assistant check
All committers have signed the CLA.

coveralls commented 5 years ago

Coverage Status

Coverage remained the same at 15.014% when pulling 25d344cb598311cbff7a80afbfe6cf900b5aa99c on panpan0000:conn_reuse_mode_0 into 1c766b1dbfb0fba3b599aee19a0cd4cd827bdcbd on aledbf:master.

yyx commented 4 years ago

Hello everyone: we are pleased to report that we have fixed this bug and verified that the fix works well. The patch (ipvs: avoid drop first packet by reusing conntrack) is being submitted to the Linux kernel community. You can also apply this patch to your own kernel and then simply keep the defaults net.ipv4.vs.conn_reuse_mode=1 and net.ipv4.vs.conn_reuse_old_conntrack=1. Since net.ipv4.vs.conn_reuse_old_conntrack is a newly added sysctl, kube-proxy can be adapted by checking whether it exists; if it does, the running kernel is a version that contains the fix (a sketch of such a check follows the list below). This resolves the following problems:

  1. Rolling update, IPVS keeps scheduling traffic to the destroyed Pod
  2. Unbalanced IPVS traffic scheduling after scaled up or rolling update
  3. fix IPVS low throughput issue #71114 https://github.com/kubernetes/kubernetes/pull/71114
  4. One second connection delay in masque https://marc.info/?t=151683118100004&r=1&w=2
  5. IPVS low throughput #70747 https://github.com/kubernetes/kubernetes/issues/70747
  6. Apache Bench can fill up ipvs service proxy in seconds #544 https://github.com/cloudnativelabs/kube-router/issues/544
  7. Additional 1s latency in host -> service IP -> pod when upgrading from 1.15.3 -> 1.18.1 on RHEL 8.1 #90854 https://github.com/kubernetes/kubernetes/issues/90854
  8. kube-proxy ipvs conn_reuse_mode setting causes errors with high load from single client #81775 https://github.com/kubernetes/kubernetes/issues/81775

Thank you. Yang Yuxi (TencentCloudContainerTeam)
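A hedged sketch of the detection approach suggested above: probe for the proposed net.ipv4.vs.conn_reuse_old_conntrack sysctl and choose settings accordingly. The procfs paths are assumptions derived from the sysctl names, not code from this PR or from kube-proxy:

```go
// Sketch: pick IPVS conn-reuse settings based on whether the kernel
// exposes the proposed conn_reuse_old_conntrack sysctl (assumed path).
package main

import (
	"fmt"
	"os"
)

const (
	connReuseModePath         = "/proc/sys/net/ipv4/vs/conn_reuse_mode"
	connReuseOldConntrackPath = "/proc/sys/net/ipv4/vs/conn_reuse_old_conntrack" // assumption
)

func configureIPVSConnReuse() error {
	if _, err := os.Stat(connReuseOldConntrackPath); err == nil {
		// Patched kernel: keep the (default) values suggested in the comment above.
		if err := os.WriteFile(connReuseModePath, []byte("1"), 0644); err != nil {
			return err
		}
		return os.WriteFile(connReuseOldConntrackPath, []byte("1"), 0644)
	}
	// Unpatched kernel: fall back to disabling conn_reuse_mode entirely.
	return os.WriteFile(connReuseModePath, []byte("0"), 0644)
}

func main() {
	if err := configureIPVSConnReuse(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```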

andrewsykim commented 4 years ago

Following-up on @yyx's comment above for posterity.

The patch mentioned in https://github.com/aledbf/kube-keepalived-vip/pull/109#issuecomment-642705904 didn't make it into the kernel, but there are two recently merged patches worth highlighting. One fixes the 1-second delay when a conntrack entry is reused, and the other fixes packets being dropped when stale connection entries in the IPVS table are used:

1. http://patchwork.ozlabs.org/project/netfilter-devel/patch/20200701151719.4751-1-ja@ssi.bg/
2. http://patchwork.ozlabs.org/project/netfilter-devel/patch/20200708161638.13584-1-kim.andrewsy@gmail.com/