k0sproject / k0smotron

k0smotron
https://docs.k0smotron.io/

k0smotron operator falls into CrashLoopBackOff #659

Closed aerfio closed 1 month ago

aerfio commented 2 months ago

Reproduction steps:

  1. Create an empty kind cluster (or use any other tool):
    kind create cluster --wait=5m --name k0smotron
  2. Install k0smotron following the official instructions: https://docs.k0smotron.io/v1.0.2/install/
    kubectl apply --server-side=true -f https://docs.k0smotron.io/v1.0.2/install.yaml
  3. Wait several minutes and watch the k0smotron pod restart repeatedly:
    ➜ kubectl get pod -n k0smotron
    NAME                                            READY   STATUS    RESTARTS      AGE
    k0smotron-controller-manager-64f9c7dc58-dcpc9   2/2     Running   3 (56s ago)   7m54s

The pod logs:

2024-07-29T13:49:43Z    ERROR   controller-runtime.source.EventHandler  failed to get informer from cache       {"error": "failed to get API group resources: unable to retrieve the complete list of server APIs: cluster.x-k8s.io/v1beta1: the server could not find the requested resource"}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/source/kind.go:68
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
        /go/pkg/mod/k8s.io/apimachinery@v0.28.4/pkg/util/wait/loop.go:73
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
        /go/pkg/mod/k8s.io/apimachinery@v0.28.4/pkg/util/wait/loop.go:74
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
        /go/pkg/mod/k8s.io/apimachinery@v0.28.4/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/source/kind.go:56
2024-07-29T13:49:53Z    ERROR   Could not wait for Cache to sync        {"controller": "machine", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Machine", "error": "failed to wait for machine caches to sync: timed out waiting for cache to be synced for Kind *v1beta1.Machine"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:203
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:208
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:234
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/manager/runnable_group.go:223
2024-07-29T13:49:53Z    INFO    Stopping and waiting for non leader election runnables
2024-07-29T13:49:53Z    INFO    Stopping and waiting for leader election runnables
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "k0smotroncontrolplane", "controllerGroup": "controlplane.cluster.x-k8s.io", "controllerKind": "K0smotronControlPlane"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "remotecluster", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteCluster"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "k0scontrolplane", "controllerGroup": "controlplane.cluster.x-k8s.io", "controllerKind": "K0sControlPlane"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "jointokenrequest", "controllerGroup": "k0smotron.io", "controllerKind": "JoinTokenRequest"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "k0scontrollerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sControllerConfig"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "k0sworkerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sWorkerConfig"}
2024-07-29T13:49:53Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "cluster", "controllerGroup": "k0smotron.io", "controllerKind": "Cluster"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "k0smotroncontrolplane", "controllerGroup": "controlplane.cluster.x-k8s.io", "controllerKind": "K0smotronControlPlane"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "remotecluster", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteCluster"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "k0scontrolplane", "controllerGroup": "controlplane.cluster.x-k8s.io", "controllerKind": "K0sControlPlane"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "k0scontrollerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sControllerConfig"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "jointokenrequest", "controllerGroup": "k0smotron.io", "controllerKind": "JoinTokenRequest"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "k0sworkerconfig", "controllerGroup": "bootstrap.cluster.x-k8s.io", "controllerKind": "K0sWorkerConfig"}
2024-07-29T13:49:53Z    INFO    All workers finished    {"controller": "cluster", "controllerGroup": "k0smotron.io", "controllerKind": "Cluster"}
2024-07-29T13:49:53Z    INFO    Stopping and waiting for caches
2024-07-29T13:49:53Z    INFO    Stopping and waiting for webhooks
2024-07-29T13:49:53Z    INFO    Stopping and waiting for HTTP servers
2024-07-29T13:49:53Z    INFO    controller-runtime.metrics      Shutting down metrics server with timeout of 1 minute
2024-07-29T13:49:53Z    INFO    shutting down server    {"kind": "health probe", "addr": "[::]:8081"}
2024-07-29T13:49:53Z    INFO    Wait completed, proceeding to shutdown the manager
2024-07-29T13:49:53Z    ERROR   setup   problem running manager {"error": "failed to wait for machine caches to sync: timed out waiting for cache to be synced for Kind *v1beta1.Machine"}
main.main
        /workspace/cmd/main.go:247
runtime.main
        /usr/local/go/src/runtime/proc.go:267
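
The first error in the log above names an API group/version that the server does not serve (`cluster.x-k8s.io/v1beta1`), and the later cache-sync timeout for `*v1beta1.Machine` appears to follow from it. A minimal sketch for pulling that detail out of a long log is shown here; the function name `missing_api_groups` is hypothetical and it only matches the exact error wording seen in this log.

```shell
#!/bin/sh
# Hypothetical helper: reads controller logs on stdin and prints each
# API group/version reported as missing, one per line, de-duplicated.
# It relies on the error wording shown in the log above.
missing_api_groups() {
  grep -o 'unable to retrieve the complete list of server APIs: [^:]*' \
    | sed 's/^.*server APIs: //' \
    | sort -u
}
```

Assuming the Deployment is named `k0smotron-controller-manager` (inferred from the pod name, not confirmed), the helper could be fed with `kubectl logs -n k0smotron deployment/k0smotron-controller-manager --all-containers --previous | missing_api_groups`. Running `kubectl api-resources --api-group=cluster.x-k8s.io` afterwards should show whether that group is registered in the cluster at all.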

makhov commented 1 month ago

Thanks for the report. We fixed the issue and published k0smotron v1.0.3 with the fix.