charmed-kubernetes / kubernetes-docs

This repository contains the development version of docs for Charmed Kubernetes

Charmed Kubernetes calico workers perpetually blocked #794

Status: Closed (closed by flakrat 1 year ago)

flakrat commented 1 year ago

I have followed the Charmed Kubernetes "Installing to a local machine" instructions multiple times. Each time, I end up with juju status showing all 5 calico units as perpetually blocked.

I'm not sure whether this is due to something missing from the "Installing to a local machine" doc or an issue with Charmed Kubernetes itself.

Model  Controller           Cloud/Region         Version  SLA          Timestamp
k8s    localhost-localhost  localhost/localhost  3.1.5    unsupported  12:04:13Z

App                       Version  Status   Scale  Charm                     Channel  Rev  Exposed  Message
calico                             blocked      5  calico                    stable    95  no       ignore-loose-rpf config is in conflict with rp_filter value
containerd                1.6.8    active       5  containerd                stable    69  no       Container runtime available
easyrsa                   3.0.1    active       1  easyrsa                   stable    48  no       Certificate Authority connected.
etcd                      3.4.22   active       3  etcd                      stable   748  no       Healthy with 3 known peers
kubeapi-load-balancer     1.18.0   active       1  kubeapi-load-balancer     stable    84  yes      Loadbalancer ready.
kubernetes-control-plane  1.28.1   waiting      2  kubernetes-control-plane  stable   302  no       Waiting for 9 kube-system pods to start
kubernetes-worker         1.28.1   active       3  kubernetes-worker         stable   123  yes      Kubernetes worker running.

Unit                         Workload  Agent  Machine  Public address  Ports         Message
easyrsa/0*                   active    idle   0        10.3.170.184                  Certificate Authority connected.
etcd/0                       active    idle   1        10.3.170.116    2379/tcp      Healthy with 3 known peers
etcd/1                       active    idle   2        10.3.170.34     2379/tcp      Healthy with 3 known peers
etcd/2*                      active    idle   3        10.3.170.94     2379/tcp      Healthy with 3 known peers
kubeapi-load-balancer/0*     active    idle   4        10.3.170.200    443,6443/tcp  Loadbalancer ready.
kubernetes-control-plane/0   waiting   idle   5        10.3.170.7      6443/tcp      Waiting for 9 kube-system pods to start
  calico/4                   blocked   idle            10.3.170.7                    ignore-loose-rpf config is in conflict with rp_filter value
  containerd/4               active    idle            10.3.170.7                    Container runtime available
kubernetes-control-plane/1*  waiting   idle   6        10.3.170.96     6443/tcp      Waiting for 9 kube-system pods to start
  calico/3                   blocked   idle            10.3.170.96                   ignore-loose-rpf config is in conflict with rp_filter value
  containerd/3               active    idle            10.3.170.96                   Container runtime available
kubernetes-worker/0          active    idle   7        10.3.170.77     80,443/tcp    Kubernetes worker running.
  calico/1                   blocked   idle            10.3.170.77                   ignore-loose-rpf config is in conflict with rp_filter value
  containerd/1               active    idle            10.3.170.77                   Container runtime available
kubernetes-worker/1*         active    idle   8        10.3.170.48     80,443/tcp    Kubernetes worker running.
  calico/2                   blocked   idle            10.3.170.48                   ignore-loose-rpf config is in conflict with rp_filter value
  containerd/2               active    idle            10.3.170.48                   Container runtime available
kubernetes-worker/2          active    idle   9        10.3.170.213    80,443/tcp    Kubernetes worker running.
  calico/0*                  blocked   idle            10.3.170.213                  ignore-loose-rpf config is in conflict with rp_filter value
  containerd/0*              active    idle            10.3.170.213                  Container runtime available

Machine  State    Address       Inst id        Base          AZ  Message
0        started  10.3.170.184  juju-c2f4db-0  ubuntu@22.04      Running
1        started  10.3.170.116  juju-c2f4db-1  ubuntu@22.04      Running
2        started  10.3.170.34   juju-c2f4db-2  ubuntu@22.04      Running
3        started  10.3.170.94   juju-c2f4db-3  ubuntu@22.04      Running
4        started  10.3.170.200  juju-c2f4db-4  ubuntu@22.04      Running
5        started  10.3.170.7    juju-c2f4db-5  ubuntu@22.04      Running
6        started  10.3.170.96   juju-c2f4db-6  ubuntu@22.04      Running
7        started  10.3.170.77   juju-c2f4db-7  ubuntu@22.04      Running
8        started  10.3.170.48   juju-c2f4db-8  ubuntu@22.04      Running
9        started  10.3.170.213  juju-c2f4db-9  ubuntu@22.04      Running

I have tried this both on a bare-metal install of Ubuntu 22.04 LTS on a Dell R640 x86_64 server and within a VM provisioned with multipass launch -n tutorial-vm -m 32g -c 16 -d 80G jammy

evilnick commented 1 year ago

Hi. Thanks for the report, and apologies for the difficulty you are having. I'm going to take a look and follow up.

evilnick commented 1 year ago

Just on the off chance: there have been updates to Calico and some other CNI charms in the latest release. Could you try running:

juju config calico ignore-loose-rpf=true

That may solve the issue. I will investigate whether that should be the default when using LXD/VMs and update the docs if so.
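For anyone who wants to check the situation before or after changing the option, here is a minimal sketch. It assumes the blocked message refers to the kernel's net.ipv4.conf.all.rp_filter sysctl, where 2 means loose reverse-path filtering. The rp_filter_mode helper is hypothetical and only for illustration. The juju commands assume a Juju 3.x client with the k8s model selected, and are skipped if juju is not installed.

```shell
#!/bin/sh
# Sketch only: rp_filter_mode is a hypothetical helper, not part of Juju or the calico charm.

# Translate a numeric rp_filter value into its kernel meaning
# (0 = no source validation, 1 = strict, 2 = loose).
rp_filter_mode() {
  case "$1" in
    0) echo "off" ;;
    1) echo "strict" ;;
    2) echo "loose" ;;
    *) echo "unknown" ;;
  esac
}

# Report the value on the machine running this script.
local_value="$(cat /proc/sys/net/ipv4/conf/all/rp_filter 2>/dev/null)"
echo "local rp_filter: ${local_value:-unreadable} ($(rp_filter_mode "$local_value"))"

# The commands below need a Juju client with the right model selected,
# so they are skipped when juju is not installed.
if command -v juju >/dev/null 2>&1; then
  # Read the sysctl on every calico unit (juju exec is the Juju 3.x command).
  juju exec --application calico -- sysctl net.ipv4.conf.all.rp_filter
  # Show the charm option's current value, then check the units' status.
  juju config calico ignore-loose-rpf
  juju status calico
fi
```

If the units report a loose value and the option is still false, that would be consistent with the blocked message shown in the report. After setting the option to true, the calico units should leave the blocked state once the charm has processed the config change.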

evilnick commented 1 year ago

Further update: my bad. We already have a docs update for that; it was just waiting for me to merge and release (#789).

I hope this resolves your issue. If not, feel free to reopen!

flakrat commented 12 months ago

@evilnick Thanks, that worked.