vernimmen-textkernel opened 4 years ago
The problem reoccurred this time after 3 days. I've now enabled debug logging to get more information when it happens again.
And it happened again this morning. The debug log for the broken pod (and some other details) is here: https://gist.github.com/vernimmen-textkernel/7b99aa7c076b4458684669dea4092c3f
And another one: https://gist.github.com/vernimmen-textkernel/a8e3959f2c856ca9519c05640eba7ab0 I have now applied the automatic weave pod restart when there are too many sleave connections as mentioned in issue 3773, so we probably won't notice more of this problem. I hope the above logs and debug logs are enough to find the cause of the problem.
Hi @vernimmen-textkernel
When you get a message like this:
```
INFO: 2020/06/25 22:51:21.442962 ->[10.30.12.2:6783|fa:b4:e9:63:60:33(kubem-02.p.nl01)]: connection shutting down due to error: read tcp4 10.30.12.6:46468->10.30.12.2:6783: read: connection reset by peer
```
we need the logs of the other side, to see why it dropped the connection.
Could you please run `weave status connections` at the time of the outage, to show what is and isn't working.
There are no errors or connection drops in the other two gists. (Generally we can see what happened from INFO logs and don't need DEBUG)
I thought my network problems were related to this issue.
In my case, Weave never used sleeve mode. After investigating a lot of issues on GitHub, I found the solution to my problem:
the connections between nodes were dropped and broken because the iptables rules were incorrect. It took me a lot of time to understand and solve this.
My kube-proxy.yaml did not contain the xtables.lock file mount. So weave-net used the /run/xtables.lock file but kube-proxy did not, and the two applications had a race condition while manipulating the iptables rules.
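The failure mode can be illustrated in isolation. This is only a sketch of how a shared lock file serializes writers (iptables takes /run/xtables.lock when invoked with `-w`); the temp file and `echo` commands are stand-ins, not actual weave or kube-proxy code:

```shell
# Two "writers" that both take the same lock file before touching shared
# state, the way iptables -w serializes on /run/xtables.lock. If one side
# never takes the lock (as kube-proxy without the mount), updates can race.
LOCK=$(mktemp)

# Writer 1 holds the lock while it "applies" its rules.
result1=$(flock "$LOCK" -c 'echo applied-rules-1')

# Writer 2 blocks until the lock is free, then runs.
result2=$(flock "$LOCK" -c 'echo applied-rules-2')

echo "$result1, $result2"
rm -f "$LOCK"
```

With the mount missing, only one of the two iptables users honors the lock, so concurrent rule changes can clobber each other.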
Just checked. In my case (a kops-managed cluster) the kube-proxy manifest has the xtables.lock file mount, same for weave. @nesc58 Can you please show the exact part you were missing?
In my case, if I were able to check connectivity (from the weave pod) to the cluster services, I could set up a liveness check. The tricky part for me right now is how to do this with weave, since it's a privileged pod with host networking, and such checks would need to go over the weave-provided network layer. Does anyone have ideas?
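For what it's worth, the stock weave-net manifest already probes the router's local status endpoint (the weave container serves HTTP on 127.0.0.1:6784). A sketch of such a probe is below; note this only detects a wedged router process, not end-to-end datapath connectivity to cluster services:

```yaml
# Sketch: probe weave's own status endpoint, as the stock daemonset does.
# A failing probe catches a broken router, not a broken overlay path.
readinessProbe:
  httpGet:
    host: 127.0.0.1
    path: /status
    port: 6784
```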
Hi, apologies for the late reply.
The weave daemonset contains the following mounts. I have removed a lot of other lines (metadata, hostNetwork and so on).
```yaml
containers:
  - name: weave
    image: weaveworks/weave-kube:2.6.5
    ...
    volumeMounts:
      ...
      - name: xtables-lock
        mountPath: /run/xtables.lock
        readOnly: false
  - name: weave-npc
    image: weaveworks/weave-npc:2.6.5
    ...
    volumeMounts:
      - name: xtables-lock
        mountPath: /run/xtables.lock
        readOnly: false
volumes:
  ...
  - name: xtables-lock
    hostPath:
      path: /run/xtables.lock
      type: FileOrCreate
```
These mounts must also be available to the kube-proxy containers. (For me the manifest is located at /etc/kubernetes/manifests/kube-proxy.yaml; I don't know where the static pod files are stored on your system.)
```yaml
apiVersion: v1
kind: Pod
...
spec:
  hostNetwork: true
  containers:
    - name: kube-proxy
      image: gcr.io/google-containers/kube-proxy-amd64:v1.18.6
      command:
        - kube-proxy
        - --config=/var/lib/kubelet/kube-proxy.config
      securityContext:
        privileged: true
      volumeMounts:
        ...
        - mountPath: /run/xtables.lock
          name: iptableslock
          readOnly: false
  volumes:
    ...
    - hostPath:
        path: /run/xtables.lock
        type: FileOrCreate
      name: iptableslock
```
The mounts for kube-proxy were missing on my system. If these mounts are already set on your system, iptables modifications should work fine.
We had a lot of configuration issues, which resulted in an unstable cluster. So I think our problems differ a lot.
Maybe kops created all parameters correctly and no manual fixes are needed.
Here are the configurations I changed:
We use the Debian Linux distribution, where cgroups are managed by systemd by default; on other distributions this is done by cgroupfs.
So I had to change the Docker daemon to use systemd as its cgroup driver (the default is cgroupfs). The kubelet configuration must then also be changed to use the systemd cgroup driver. (https://github.com/kubernetes/kubeadm/issues/1394)
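A minimal sketch of the Docker side of that change, assuming Docker reads /etc/docker/daemon.json (the kubelet side is set via its `--cgroup-driver` flag or the `cgroupDriver` field in its config file):

```json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
```

After editing the file, restart the Docker daemon so the new driver takes effect. Both Docker and the kubelet must agree on the driver, or the kubelet will refuse to start pods.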
After that I changed the systemd manager configuration. I created a drop-in file /etc/systemd/system.conf.d/accounting.conf with the following content, to ensure that CPU, memory and block-IO accounting are enabled by default:
```ini
[Manager]
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes
```
Next, there seems to be a bug in cgroup handling when a lot of containers are started or recreated. In this case we had to change some kernel parameters.
I added `cgroup_enable=memory cgroup.memory=nokmem` to GRUB_CMDLINE_LINUX (GRUB is the default bootloader on Debian systems). (https://github.com/kubernetes/kubernetes/issues/70324#issuecomment-433612120 -> referenced issues)
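A sketch of that change, assuming a stock Debian GRUB setup (any parameters already present in the variable should of course be kept):

```
# /etc/default/grub
GRUB_CMDLINE_LINUX="cgroup_enable=memory cgroup.memory=nokmem"
```

Then run `update-grub` and reboot for the new kernel parameters to take effect; you can verify them afterwards in /proc/cmdline.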
There have been no crashes on our clusters since these changes. I would say the fix works for me, but I can't say it is the universal solution for all problems; I'm unable to figure out all the side effects of these changes.
I am sorry that I cannot help you further.
Edit: Debian 10.5 was released today with a new kernel version that backports many fixes, e.g. a lot of memory fixes (mm/slub, cgroups and so on). I hope this will solve a lot of issues and memory leaks.
What you expected to happen?
We expect communication between pods on different kubernetes nodes not to break
What happened?
Symptoms: pods on 1 kubernetes worker node stop being able to communicate with pods on other worker nodes. All other worker nodes remain fine. To work around the problem, we delete the weave pod on the affected worker node. Once the new pod is up, everything returns to normal. After a while (anywhere between 24 and 96 hours) the problem happens again. Sometimes with the same worker node, sometimes with a different worker node. When looking at the connections, some or all connections are using sleeve instead of fastdp.
How to reproduce it?
It is happening about once per 48 hours for us currently. We do not yet have a way to trigger the problem. To try and trigger it, we disconnected the network on one of the worker nodes for a few seconds, but that did not do anything.
Anything else we need to know?
The cluster was created by kubespray 2.11. It runs in VMs on 3 hypervisors on-prem.
In my eyes the symptoms of this issue resemble https://github.com/weaveworks/weave/issues/3641 and https://github.com/weaveworks/weave/issues/3773
Versions:
Logs:
From this moment the communication problem started:
full logs of that worker node's weave pod are in https://gist.github.com/vernimmen-textkernel/110a8219a7ea33eeeea3997adf18bf6c