MusicDin / kubitect

Kubitect provides a simple way to set up a highly available Kubernetes cluster across multiple hosts.
https://kubitect.io
Apache License 2.0

metallb permissions issue #159

Open bsdlp opened 1 year ago

bsdlp commented 1 year ago

I'm having some trouble where ingress-nginx's external IP gets stuck in pending. I did some digging and it seems like MetalLB isn't set up correctly by Kubespray. It may be related to permissions/DNS. Hoping I could get some assistance in digging further.

Edit: looking into it some more, this happens with other pods too, e.g. coredns. Seems like a systemic issue with RBAC?

> kubectl describe pods -n metallb-system controller-545fcbb979-lf78d
...
  Warning  FailedMount  116s   kubelet            MountVolume.SetUp failed for volume "kube-api-access-8ftrn" : failed to fetch token: serviceaccounts "controller" is forbidden: User "system:node:pounce-worker-1" cannot create resource "serviceaccounts/token" in API group "" in the namespace "metallb-system": no relationship found between node 'pounce-worker-1' and this object

I already deleted the logs for the MetalLB speakers, but I'm spinning up a new cluster to get new logs. The speakers said something about "no ips available", which is suspicious.

from coredns pod events:

  Warning  FailedMount  54m (x2 over 54m)  kubelet            MountVolume.SetUp failed for volume "kube-api-access-58lc6" : failed to fetch token: Post "https://192.168.200.212:6443/api/v1/namespaces/kube-system/serviceaccounts/coredns/token": EOF

if i manually curl that url i get:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "serviceaccounts \"coredns\" is forbidden: User \"system:anonymous\" cannot get resource \"serviceaccounts/token\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "coredns",
    "kind": "serviceaccounts"
  },
  "code": 403
}
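
A note on reading that response: the 403 names `system:anonymous`, which is the identity the API server assigns to a request sent without credentials, and the verb is `get` rather than the `create` the kubelet attempted. So the manual curl probably does not reproduce the kubelet failure. A minimal Python sketch that pulls those fields out of both messages (the helper name `parse_forbidden` and the regex are illustrative assumptions, not part of any Kubernetes client):

```python
import json
import re

# Response body copied from the manual curl above.
STATUS_BODY = r'''
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "serviceaccounts \"coredns\" is forbidden: User \"system:anonymous\" cannot get resource \"serviceaccounts/token\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {"name": "coredns", "kind": "serviceaccounts"},
  "code": 403
}
'''

# Hypothetical pattern for the "is forbidden" message format seen in this thread.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(message):
    """Return user, verb, resource, group and namespace, or None if no match."""
    match = FORBIDDEN_RE.search(message)
    return match.groupdict() if match else None


# Message from the curl response (unauthenticated request).
info = parse_forbidden(json.loads(STATUS_BODY)["message"])

# Message from the metallb controller pod event (request made by the kubelet).
KUBELET_MESSAGE = (
    'serviceaccounts "controller" is forbidden: User "system:node:pounce-worker-1" '
    'cannot create resource "serviceaccounts/token" in API group "" in the '
    'namespace "metallb-system": no relationship found between node '
    "'pounce-worker-1' and this object"
)
kubelet_info = parse_forbidden(KUBELET_MESSAGE)

print(info["user"], info["verb"])
print(kubelet_info["user"], kubelet_info["verb"])
```

Comparing the two results side by side makes it clear that the curl was denied as an anonymous user, while the pod event shows the node's own identity being denied.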

kubitect config:

hosts:
  - name: catnet21390176
    connection:
      ip: 192.168.51.117
      type: remote
      user: root
      ssh:
        keyfile: "~/.ssh/id_ed25519"
  - name: catnet21390118
    connection:
      ip: 192.168.51.118
      type: remote
      user: root
      ssh:
        keyfile: "~/.ssh/id_ed25519"

cluster:
  name: pounce
  network:
    mode: bridge
    cidr: 192.168.0.0/16
    bridge: br0
  nodeTemplate:
    user: jchen
    updateOnBoot: true
    os:
      distro: ubuntu22
      networkInterface: ens3
    dns:
      - 192.168.1.1
  nodes:
    loadBalancer:
      vip: 192.168.200.212
      instances:
        - id: 1
          ram: 4
          cpu: 4
          mainDiskSize: 16
          host: catnet21390118
      forwardPorts:
        - name: http
          port: 80
        - name: https
          port: 443
          target: all
    master:
      default:
        ram: 8
        cpu: 4
        mainDiskSize: 32
      instances:
        - id: 1
          host: catnet21390176
    worker:
      default:
        ram: 48
        cpu: 8
        mainDiskSize: 512
      instances:
        - id: 1
          host: catnet21390176
        - id: 2
          host: catnet21390118
          ram: 56
          mainDiskSize: 768
          cpu: 8
kubernetes:
  version: v1.26.5
  networkPlugin: calico

addons:
  kubespray:
    ingress_nginx_enabled: true
    ingress_nginx_namespace: "ingress-nginx"
    ingress_nginx_insecure_port: 80
    ingress_nginx_secure_port: 443
    metallb_enabled: true
    metallb_speaker_enabled: true
    metallb_ip_range:
      - 192.168.201.0/24
    metallb_auto_assign: true
    metallb_version: v0.13.10
    metallb_protocol: "layer2"
    helm_enabled: true
    dashboard_enabled: true
    cert_manager_enabled: true
    cert_manager_namespace: "cert-manager"
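
For anyone comparing against their own setup, here is a quick sanity check of the address plan in the config above, written as a small Python sketch using only the standard `ipaddress` module. The values are copied from the config; the function name `summarize` is just illustrative, and whether any of these relationships has anything to do with the failure is not established in this thread.

```python
import ipaddress

# Values copied from the kubitect config above.
cluster_cidr = ipaddress.ip_network("192.168.0.0/16")     # cluster.network.cidr
lb_vip = ipaddress.ip_address("192.168.200.212")          # loadBalancer.vip
metallb_range = ipaddress.ip_network("192.168.201.0/24")  # metallb_ip_range
host_ips = [
    ipaddress.ip_address("192.168.51.117"),
    ipaddress.ip_address("192.168.51.118"),
]


def summarize():
    """Report how the configured addresses relate to each other."""
    return {
        "vip_in_cluster_cidr": lb_vip in cluster_cidr,
        "metallb_range_in_cluster_cidr": metallb_range.subnet_of(cluster_cidr),
        "vip_in_metallb_range": lb_vip in metallb_range,
        "hosts_in_cluster_cidr": all(ip in cluster_cidr for ip in host_ips),
    }


for name, value in summarize().items():
    print(f"{name}: {value}")
```
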

bsdlp commented 1 year ago

After some more digging and many redeployments, it seems like this issue is specifically tied to RBAC and how Kubespray's addons are configured. If I don't include any addons in the kubitect config and instead install them after the cluster is set up, e.g. with

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/baremetal/deploy.yaml

and similarly for MetalLB etc.,

then it seems to work fine. I would love to have them all come up configured by Kubespray/Kubitect though, so any guidance would be appreciated.

MusicDin commented 1 year ago

This seems like a bug. I was able to reproduce the issue, so I'll investigate what is going on.