weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

[kind] existing bridge type "bridge" is different than requested "bridged_fastdp" #3634

Open neolit123 opened 5 years ago

neolit123 commented 5 years ago

in kubeadm testing we are seeing a weird crash of the weave pods after an upgrade from 1.13.4 -> 1.14.1

the same problem is not reproducible locally.

What you expected to happen?

the weave pods to restart after the kubelet is restarted, post cluster upgrade.

What happened?

{"log":"INFO: 2019/04/23 17:27:50.755375 weave  2.5.1\n","stream":"stderr","time":"2019-04-23T17:27:50.755628019Z"}
{"log":"FATA: 2019/04/23 17:27:50.789836 Existing bridge type \"bridge\" is different than requested \"bridged_fastdp\". Please do 'weave reset' and try again\n","stream":"stderr","time":"2019-04-23T17:27:50.790085922Z"}

How to reproduce it?

hard to do that as it only happens in the official k8s CI (prow).

Anything else we need to know?

kubeadm was used to create a cluster using init/join, then the cluster was upgraded using kubeadm upgrade apply.
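
for reference, a rough sketch of that flow (version numbers match this report; the exact flags used in CI may differ):

$ kubeadm init                                       # on the first control-plane node (1.13.4)
$ kubeadm join <endpoint> --token <token> --discovery-token-ca-cert-hash sha256:<hash>   # on the other nodes
$ kubeadm upgrade plan                               # after installing kubeadm 1.14.1 on the control plane
$ kubeadm upgrade apply v1.14.1
$ systemctl restart kubelet                          # the restart after which the weave pods crash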

locally this same process works for multiple people, but in CI it fails for unknown reasons.

Versions:

$ weave version

2.5.1

$ docker version
$ uname -a

https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-stable-master/1120735292630765569/artifacts/docker-info.txt

$ kubectl version

before restart 1.13.4
after 1.14.1

Logs:

$ docker logs weave

or, if using Kubernetes:

$ kubectl logs -n kube-system <weave-net-pod> weave

all the relevant logs are here: https://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-stable-master/1120735292630765569/artifacts/

Network:

$ ip route
$ ip -4 -o addr
$ sudo iptables-save

these commands were not executed in CI, however docker info suggests:

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
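
for completeness, a hedged way to check and enable those settings on the host (br_netfilter may be built into the kernel, in which case the modprobe is unnecessary):

$ sudo modprobe br_netfilter
$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
$ sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
$ sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1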

murali-reddy commented 5 years ago

I don't see any reason why the configurations (-no-fastdp, -no-bridged-fastdp) would have changed before and after upgrade. For some reason on cluster with 1.13.4, sleeve mode was selected. Is it possible to get weave-net pod logs before upgrade? I can't find them in the link shared.

bboreham commented 5 years ago

Question: I believe the context of this is "Kind", or Kubernetes in Docker. Weave Net assumes it has control of the complete host - it runs in the host network namespace and pid space, so how does that translate to Kind?

neolit123 commented 5 years ago

@murali-reddy

I don't see any reason why the configurations (-no-fastdp, -no-bridged-fastdp) would have changed before and after upgrade.

true, we do not change that in the CI job in any way.

For some reason on cluster with 1.13.4, sleeve mode was selected. Is it possible to get weave-net pod logs before upgrade? I can't find them in the link shared.

unfortunately, we don't have logs before the Pod restart (that was performed during upgrade).

@bboreham

Question: I believe the context of this is "Kind", or Kubernetes in Docker. Weave Net assumes it has control of the complete host - it runs in the host network namespace and pid space, so how does that translate to Kind?

i'm including @BenTheElder in this conversation as the maintainer of kind, since kind currently ships Weave as the default CNI.

some clarifications about kind: kind creates a cluster where the k8s nodes are Docker containers on the host, but there is also Docker inside these containers that runs the Weave Pods, control-plane Pods, etc. So technically the k8s nodes and Weave pods run on the Docker network. However, the Docker network maps to the host network, and Weave obviously uses the kernel facilities of the host.

this simple setup can be easily reproduced using: https://github.com/kubernetes-sigs/kind#installation-and-usage
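
to make the layering concrete, a rough sketch of poking at it (container names are placeholders; newer kind releases ship their own default CNI rather than Weave):

$ kind create cluster                        # each k8s node is a Docker container on the host
$ docker ps                                  # the node containers show up here
$ docker exec -it <node-container> sh        # inside a node, its own runtime runs the pods
$ kubectl -n kube-system get pods -o wide    # the CNI pods (weave-net here) run one per node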

i think the problem here is relevant only to that particular VM that is the host in the CI job. restarting the Weave Pods breaks on that particular host setup, for some reason.

murali-reddy commented 5 years ago

I tried multiple kind clusters; the weave instances in each cluster used bridged_fastdp. Tearing down and recreating the kind cluster also did not reproduce the issue where the weave bridge is set up in bridge mode.
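
A hedged way to check which mode a given node actually ended up in (the device names are the ones Weave normally creates; treat this as a sketch):

$ ip -d link show weave          # a Linux bridge in both "bridge" and "bridged_fastdp" modes
$ ip -d link show datapath       # the Open vSwitch datapath exists only when fastdp is in use
$ ip link show vethwe-bridge     # veth pair linking the bridge to the datapath in bridged_fastdp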

i think the problem here is relevant only to that particular VM that is the host in the CI job.

You mean this happens only on one particular VM ?

If not, it would be helpful to have steps to simulate the CI job using kind, or some way to reproduce this issue.

neolit123 commented 5 years ago

You mean this happens only on one particular VM ?

i also could not reproduce it locally.

How to reproduce it?

hard to do that as it only happens in the official k8s CI (prow).

the prow setup runs VMs inside GCE that host the k8s nodes. the nodes run pods with a container called kubekins that is responsible for each e2e job. the kubekins container creates a kind cluster, and the kind node containers have kubeadm and the weave plugin pods inside.

the problem only happens after the upgrade. other CNI plugins do work, so it seems specific to weave, but we couldn't figure out why.

brightzheng100 commented 4 years ago

I'm seeing exactly the same issue after some rounds of kubeadm init and kubeadm reset -f, as I was trying to tune some parameters of kubeadm in a Kubernetes-in-Docker env -- it's not kind, but I'm using a kind-based image.

It was all good the first time, but eventually I got this:

$ kubectl get pods -n kube-system
NAME                                  READY   STATUS             RESTARTS   AGE
coredns-66bff467f8-k89cc              0/1     Pending            0          16m
coredns-66bff467f8-tpl9g              0/1     Pending            0          16m
etcd-k8s-master0                      1/1     Running            0          17m
etcd-k8s-master1                      1/1     Running            0          13m
etcd-k8s-master2                      1/1     Running            0          13m
kube-apiserver-k8s-master0            1/1     Running            0          17m
kube-apiserver-k8s-master1            1/1     Running            2          13m
kube-apiserver-k8s-master2            1/1     Running            1          13m
kube-controller-manager-k8s-master0   1/1     Running            0          17m
kube-controller-manager-k8s-master1   1/1     Running            0          14m
kube-controller-manager-k8s-master2   1/1     Running            0          13m
kube-proxy-2p99n                      1/1     Running            0          16m
kube-proxy-66lt9                      1/1     Running            0          13m
kube-proxy-bknm8                      1/1     Running            0          14m
kube-proxy-ldktp                      1/1     Running            0          15m
kube-proxy-qxttb                      1/1     Running            0          13m
kube-scheduler-k8s-master0            1/1     Running            0          17m
kube-scheduler-k8s-master1            1/1     Running            0          14m
kube-scheduler-k8s-master2            1/1     Running            0          14m
weave-net-6blhr                       1/2     CrashLoopBackOff   6          8m43s
weave-net-fblhh                       1/2     CrashLoopBackOff   6          8m44s
weave-net-jpjjp                       1/2     CrashLoopBackOff   6          8m45s
weave-net-pf42c                       1/2     CrashLoopBackOff   6          8m44s
weave-net-rh8px                       1/2     CrashLoopBackOff   6          8m45s

$ kubectl -n kube-system logs --all-containers weave-net-6blhr -f
modprobe: module br_netfilter not found in modules.dep
Ignore the error if "br_netfilter" is built-in in the kernel
INFO: 2020/07/02 07:14:15.674608 Starting Weaveworks NPC 2.6.5; node name "k8s-worker0"
INFO: 2020/07/02 07:14:15.674907 Serving /metrics on :6781
Thu Jul  2 07:14:15 2020 <5> ulogd.c:408 registering plugin `NFLOG'
Thu Jul  2 07:14:15 2020 <5> ulogd.c:408 registering plugin `BASE'
modprobe: module xt_set not found in modules.dep
Ignore the error if "xt_set" is built-in in the kernel
cat: can't open '/proc/sys/net/bridge/bridge-nf-call-iptables': No such file or directory
Cannot detect bridge-nf support - network policy and iptables mode kubeproxy may not work reliably
DEBU: 2020/07/02 07:25:11.348727 [kube-peers] Checking peer "6a:85:5d:40:20:4a" against list &{[]}
Peer not in list; removing persisted data
INFO: 2020/07/02 07:25:11.446262 Command line options: map[conn-limit:200 datapath:datapath db-prefix:/weavedb/weave-net docker-api: expect-npc:true host-root:/host http-addr:127.0.0.1:6784 ipalloc-init:consensus=4 ipalloc-range:10.32.0.0/12 metrics-addr:0.0.0.0:6782 name:6a:85:5d:40:20:4a nickname:k8s-worker0 no-dns:true port:6783]
Thu Jul  2 07:14:15 2020 <5> ulogd.c:408 registering plugin `PCAP'
INFO: 2020/07/02 07:25:11.446345 weave  2.6.5
Thu Jul  2 07:14:15 2020 <5> ulogd.c:981 building new pluginstance stack: 'log1:NFLOG,base1:BASE,pcap1:PCAP'
WARNING: scheduler configuration failed: Function not implemented
DEBU: 2020/07/02 07:14:15.732691 Got list of ipsets: [weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ weave-41s)5vQ^o/xWGz6a20N:~?#|E weave-4vtqMI+kx/2]jD%_c0S%thO%V weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 weave-E1ney4o[ojNrLk.6rOHi;7MPE weave-iuZcey(5DeXbzgRFs8Szo]+@p weave-;rGqyMIl1HN^cfDki~Z$3]6!N weave-s_+ChJId4Uy_$}G;WdH|~TK)I weave-k?Z;25^M}|1s7P3|H9i;*;MhG weave-]B*(W?)t*z5O17G044[gUo#$l weave-sui%__gZ}{kX~oZgI_Ttqp=Dp weave-mF}1zpEo4W6iYroE^=:V3{S6W]
DEBU: 2020/07/02 07:14:15.732748 Flushing ipset 'weave-Rzff}h:=]JaaJl/G;(XJpGjZ['
DEBU: 2020/07/02 07:14:15.733784 Flushing ipset 'weave-41s)5vQ^o/xWGz6a20N:~?#|E'
DEBU: 2020/07/02 07:14:15.734851 Flushing ipset 'weave-4vtqMI+kx/2]jD%_c0S%thO%V'
FATA: 2020/07/02 07:25:11.458893 Existing bridge type "bridge" is different than requested "bridged_fastdp". Please do 'weave reset' and try again
...

$ curl -L git.io/weave -o /usr/local/bin/weave
$ chmod a+x /usr/local/bin/weave
$ weave reset
/usr/local/bin/weave: 181: docker: not found
ERROR: Unable to parse docker version
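
Since the weave script drives Docker, it cannot run on a node that only has containerd. A hedged manual equivalent of weave reset for that situation (paths and the pod label are assumptions based on the weave-net DaemonSet -- double-check before running):

$ sudo ip link delete weave                             # remove the stale "weave" bridge so it can be recreated
$ sudo rm -rf /var/lib/weave                            # weave-net's persisted state (peers, IPAM)
$ kubectl -n kube-system delete pod -l name=weave-net   # let the DaemonSet reinitialize the bridge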

brightzheng100 commented 4 years ago

And then I gave cilium a shot and all went well -- so I believe it's a potential robustness issue in weave, as kubeadm reset -f may not follow exactly the expected clean-up process, and weave then gets stuck because some underlying objects had been created and not yet cleaned up.

disclaimer: it was my first time using cilium, and actually I could have picked any of the CNI-compliant plugins.

$ kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.8/install/kubernetes/quick-install.yaml

$ kubectl -n kube-system get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cilium-4k7hs                          1/1     Running   0          4m58s
cilium-4lf5p                          1/1     Running   0          4m58s
cilium-k4wj9                          1/1     Running   0          4m58s
cilium-n5tm6                          1/1     Running   0          4m58s
cilium-operator-754dd76d85-d8wrf      1/1     Running   0          4m58s
cilium-wr96v                          1/1     Running   1          4m58s
coredns-66bff467f8-k89cc              1/1     Running   0          37m
coredns-66bff467f8-tpl9g              1/1     Running   0          37m
etcd-k8s-master0                      1/1     Running   0          37m
etcd-k8s-master1                      1/1     Running   0          34m
etcd-k8s-master2                      1/1     Running   0          34m
kube-apiserver-k8s-master0            1/1     Running   0          37m
kube-apiserver-k8s-master1            1/1     Running   2          34m
kube-apiserver-k8s-master2            1/1     Running   1          34m
kube-controller-manager-k8s-master0   1/1     Running   0          37m
kube-controller-manager-k8s-master1   1/1     Running   0          35m
kube-controller-manager-k8s-master2   1/1     Running   0          34m
kube-proxy-2p99n                      1/1     Running   0          37m
kube-proxy-66lt9                      1/1     Running   0          34m
kube-proxy-bknm8                      1/1     Running   0          35m
kube-proxy-ldktp                      1/1     Running   0          35m
kube-proxy-qxttb                      1/1     Running   0          34m
kube-scheduler-k8s-master0            1/1     Running   0          37m
kube-scheduler-k8s-master1            1/1     Running   0          35m
kube-scheduler-k8s-master2            1/1     Running   0          34m
neolit123 commented 4 years ago

"kubeadm reset" just cleans up the node state on disk and etcd members in case it's called on control-plane nodes.

BenTheElder commented 4 years ago

unsub. Sorry but we haven't shipped weave by default in a long time for other reasons and I don't really have the bandwidth to support all the vendor CNI implementations in kind myself. If you have a more specific question I'm reachable through our support channels.

bboreham commented 4 years ago

The thing we are missing is how it gets into one state, then somehow gets into another state. So far we only have logs saying that it fails once it is in that second state. On some earlier run there will be a message in the logs saying why it decided to initialize in bridge mode.
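
If the container restarted in place, that earlier decision may still be visible in the previous container's logs; a hedged way to look for it:

$ kubectl -n kube-system logs <weave-net-pod> -c weave --previous | grep -i -E 'bridge|fastdp|sleeve'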

Separately, there is more discussion about how kubeadm reset does not clean up the network at #2911

SJrX commented 2 years ago

I'm not sure if this helps, but I was having this issue, fixed it somehow a few months ago, and then it cropped up again; I had forgotten how I fixed it and had to spend more time debugging. For background, I'm building a bare-metal k8s cluster on 6x Raspberry Pis running Ubuntu 22.10 as a learning exercise. I didn't have Docker installed (only containerd), so I couldn't run weave reset. When I deleted the bridge with brctl it came back automagically, but I still had the same error. Anyway, I noticed that this happened when I didn't have linux-modules-extra-raspi installed, and after digging into the code a bit and figuring out exactly what fastdp is, I realized that I no longer had the openvswitch kernel module since the kernel upgrade. Installing the right package to get the kernel module back, then nuking the bridge and re-installing it, allowed weave to start up again.
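
For anyone hitting the same thing, a hedged sketch of that recovery (Ubuntu raspi kernel; the package name comes from the comment above, the pod label is an assumption -- verify for your setup):

$ sudo apt-get install -y linux-modules-extra-raspi     # restores openvswitch.ko on Ubuntu's raspi kernel
$ sudo modprobe openvswitch                             # confirm the module loads again
$ sudo ip link delete weave                             # remove the stale bridge-mode "weave" bridge
$ kubectl -n kube-system delete pod -l name=weave-net   # let the DaemonSet recreate it with fastdp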

Looking at the code a little bit, I do see that there is some code designed to prevent this from happening (or at least that mentions this kernel module).

groundhog2k commented 2 years ago

@SJrX: This hint made my day and saved my pi-cluster after starting the node upgrade to 22.x! Thank you!