flannel-io / flannel

flannel is a network fabric for containers, designed for Kubernetes
Apache License 2.0

troubleshooting.md: add `ethtool -K flannel.1 tx-checksum-ip-generic off` for NAT #1929

Closed AkihiroSuda closed 2 months ago

AkihiroSuda commented 2 months ago

Description

When the public IP is behind NAT, the UDP checksum field of VXLAN packets can be corrupted. In that case, try disabling TX checksum offload on the flannel interface with the following command:

/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
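To check whether the setting took effect, the feature list printed by `ethtool -k` (lowercase `k`) can be filtered for the relevant line. The helper below is only a sketch: the function name `offload_state` is hypothetical, and it assumes the usual `name: state` layout of `ethtool -k` output.

```shell
# Hypothetical helper: reads `ethtool -k` output on stdin and prints the
# state ("on" or "off") of the tx-checksum-ip-generic feature.
offload_state() {
  awk '$1 == "tx-checksum-ip-generic:" { print $2 }'
}

# Usage (assumes root privileges and an existing flannel.1 interface):
#   /usr/sbin/ethtool -k flannel.1 | offload_state
```

If the workaround is active, the helper should print `off`.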

To apply this setting automatically via udev, create /etc/udev/rules.d/90-flannel.rules with the following content:

SUBSYSTEM=="net", ACTION=="add|change|move", ENV{INTERFACE}=="flannel.1", RUN+="/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off"
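One possible way to install the rule is sketched below. The function name `flannel_udev_rule` is hypothetical, and the commented `udevadm` invocations assume a systemd-based host where rules are reloaded manually. Adjust the interface name if the VXLAN device is not `flannel.1`.

```shell
# Hypothetical helper: prints the udev rule for the given interface
# (defaults to flannel.1, the name used in this description).
flannel_udev_rule() {
  iface="${1:-flannel.1}"
  printf 'SUBSYSTEM=="net", ACTION=="add|change|move", ENV{INTERFACE}=="%s", RUN+="/usr/sbin/ethtool -K %s tx-checksum-ip-generic off"\n' "$iface" "$iface"
}

# Usage (assumes root privileges; not run automatically here):
#   flannel_udev_rule | sudo tee /etc/udev/rules.d/90-flannel.rules
#   sudo udevadm control --reload
#   sudo udevadm trigger --subsystem-match=net --action=change
```

The rule only fires on `add`, `change`, or `move` events, so an interface that already exists keeps its current setting until such an event occurs or the `ethtool -K` command is run by hand.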

ref:

Todos

(None)

Release Note

None required

manuelbuil commented 2 months ago

Hey, thanks for the PR. Is this a workaround for a bug in some kernels? If we do this, would we be imposing a performance penalty on kernels where this is already fixed?

AkihiroSuda commented 2 months ago

> Hey, thanks for the PR. Is this a workaround for a bug in some kernels? If we do this, would we be imposing a performance penalty on kernels where this is already fixed?

I'm not sure whether this is a bug or intended behavior on the kernel side, but the behavior may change in a future kernel version, perhaps behind a sysctl.

So I added the command only to troubleshooting.md.

manuelbuil commented 2 months ago

>> Hey, thanks for the PR. Is this a workaround for a bug in some kernels? If we do this, would we be imposing a performance penalty on kernels where this is already fixed?

> I'm not sure whether this is a bug or intended behavior on the kernel side, but the behavior may change in a future kernel version, perhaps behind a sysctl.

> So I added the command only to troubleshooting.md.

Yes, it is supposed to be a kernel problem: https://github.com/kubernetes/kubernetes/issues/88986#issuecomment-635367089. TBH, though, I haven't seen a real fix in the kernel versions listed in that comment.

thomasferrandiz commented 2 months ago

@AkihiroSuda can you please fix the merge conflict?

AkihiroSuda commented 2 months ago

> @AkihiroSuda can you please fix the merge conflict?

Done.