canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

Traceback on `microk8s.enable dns`: `subprocess.CalledProcessError` #4478

Open sed-i opened 7 months ago

sed-i commented 7 months ago

Summary

I am deploying microk8s using cloud-init on an LXD VM.

Here's the relevant section:

      microk8s.enable metrics-server
      microk8s.kubectl rollout status deployments/metrics-server -n kube-system -w --timeout=600s

      microk8s.enable dns
      microk8s.kubectl rollout status deployments/coredns -n kube-system -w --timeout=600s

      microk8s.enable rbac

And the output in /var/log/cloud-init-output.log:

Infer repository core for addon metrics-server
Enabling Metrics-Server
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrolebinding.rbac.authorization.k8s.io/microk8s-admin created
Metrics-Server is enabled
Waiting for deployment "metrics-server" rollout to finish: 0 of 1 updated replicas are available...
deployment "metrics-server" successfully rolled out
Traceback (most recent call last):
  File "/snap/microk8s/6532/scripts/wrappers/enable.py", line 41, in <module>
    enable(prog_name="microk8s enable")
  File "/snap/microk8s/6532/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/snap/microk8s/6532/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/snap/microk8s/6532/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/snap/microk8s/6532/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/snap/microk8s/6532/scripts/wrappers/enable.py", line 37, in enable
    xable("enable", addons)
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 470, in xable
    protected_xable(action, addon_args)
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 498, in protected_xable
    unprotected_xable(action, addon_args)
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 514, in unprotected_xable
    enabled_addons_info, disabled_addons_info = get_status(available_addons_info, True)
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 566, in get_status
    kube_output = kubectl_get("all,ingress")
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 248, in kubectl_get
    return run(KUBECTL, "get", cmd, "--all-namespaces", die=False)
  File "/snap/microk8s/6532/scripts/wrappers/common/utils.py", line 69, in run
    result.check_returncode()
  File "/snap/microk8s/6532/usr/lib/python3.8/subprocess.py", line 448, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '('/snap/microk8s/6532/microk8s-kubectl.wrapper', 'get', 'all,ingress', '--all-namespaces')' returned non-zero exit status 1.
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
Infer repository core for addon rbac
Enabling RBAC
Reconfiguring apiserver
Restarting apiserver
RBAC is enabled

What Should Happen Instead?

I would expect no traceback in this case, even when the connection to the API server is refused.

Reproduction Steps

I cannot reproduce this consistently; I have seen it only once so far, after ~10 successful deployments.

Run `lxc launch ubuntu:22.04 cos-lite --vm < cos-lite.yaml` with this file:

config:
  cloud-init.user-data: |
    #cloud-config
    package_update: false
    package_upgrade: false
    package_reboot_if_required: false

    packages:
    - jq

    snap:
      commands:
      - snap install juju --channel=3.1/stable
      - snap install microk8s --channel=1.28-strict/stable
      - snap alias microk8s.kubectl kubectl
      - snap alias microk8s.kubectl k
      - snap install yq
      - snap refresh

    runcmd:
    - |
      # Make sure juju directory is there
      # https://bugs.launchpad.net/juju/+bug/1995697
      mkdir -p /root/.local/share/juju

    - |
      # setup microk8s and bootstrap
      usermod -a -G snap_microk8s root
      microk8s status --wait-ready

      microk8s.enable metrics-server
      microk8s.kubectl rollout status deployments/metrics-server -n kube-system -w --timeout=600s

      microk8s.enable dns
      microk8s.kubectl rollout status deployments/coredns -n kube-system -w --timeout=600s

      microk8s.enable rbac

      # wait for storage to become available
      microk8s.enable hostpath-storage
      microk8s.kubectl rollout status deployments/hostpath-provisioner -n kube-system -w --timeout=600s

      # MetalLB
      IPADDR=$(ip -4 -j route get 2.2.2.2 | jq -r '.[] | .prefsrc')
      microk8s enable metallb:$IPADDR-$IPADDR
      microk8s.kubectl rollout status daemonset.apps/speaker -n metallb-system -w --timeout=600s

Introspection Report

inspection-report-20240402_194215.tar.gz

neoaggelos commented 7 months ago

Hi @sed-i

You could try adding a `microk8s status --wait-ready` before the `microk8s enable dns`.

Alternatively, the following is also known to work and might be faster:

while ! microk8s kubectl get --raw /readyz; do
  echo Waiting for kube-apiserver
  sleep 3
done
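
Since the loop above retries forever, a bounded variant may be safer in an unattended cloud-init run. The following is only a sketch, not something shipped with MicroK8s: the function name `wait_for_apiserver`, the `READY_PROBE` variable, and the default timeout and interval are assumptions made for illustration. The probe itself is the same `/readyz` request as above.

```shell
# Hypothetical helper: poll a readiness probe until it succeeds or a deadline passes.
# READY_PROBE is an assumption of this sketch so that the probe can be swapped out;
# by default it is the /readyz check from the loop above.
READY_PROBE="${READY_PROBE:-microk8s kubectl get --raw /readyz}"

wait_for_apiserver() {
  timeout_s="${1:-300}"   # give up after roughly this many seconds of sleeping
  interval_s="${2:-3}"    # pause between attempts
  waited=0
  # $READY_PROBE is left unquoted on purpose so it splits into command + arguments.
  until $READY_PROBE >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "kube-apiserver not ready after ${timeout_s}s" >&2
      return 1
    fi
    echo "Waiting for kube-apiserver"
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
}

# Possible use in the runcmd script, before an enable step:
#   wait_for_apiserver 300 3 && microk8s.enable dns
```

With this shape, a failed wait returns a non-zero status instead of hanging the boot, so the script can decide whether to skip the addon or abort.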