kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

0.1.2 controller image fails to run #999

Closed vincepri closed 5 years ago

vincepri commented 5 years ago

/kind bug

What steps did you take and what happened: Ran cluster-api-controller 0.1.2; the container fails to start with the message below.

Warning  Failed     10m (x5 over 11m)   kubelet, cluster-api-control-plane  Error: failed to start container "manager": Error response from daemon: linux spec user: unable to find user nobody: no matching entries in passwd file

What did you expect to happen: Controller running successfully.
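
The error above says the container runtime could not resolve the user `nobody` in the image's passwd file. As a rough illustration of that kind of lookup (not the runtime's actual code), here is a minimal Python sketch that checks passwd-format text for a user name. The function name `passwd_has_user` and both sample file contents are hypothetical; they are not taken from the real image.

```python
def passwd_has_user(passwd_text: str, username: str) -> bool:
    """Return True if passwd-format text contains an entry for username."""
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Each entry looks like name:password:UID:GID:GECOS:home:shell
        name = line.split(":", 1)[0]
        if name == username:
            return True
    return False


# Hypothetical passwd contents, for illustration only
OLD_BASE = "root:x:0:0:root:/root:/sbin/nologin\n"
NEW_BASE = OLD_BASE + "nobody:x:65534:65534:nobody:/nonexistent:/sbin/nologin\n"

print(passwd_has_user(OLD_BASE, "nobody"))  # False
print(passwd_has_user(NEW_BASE, "nobody"))  # True
```

If the image is configured to run as a named user and that name has no entry, a lookup like this comes back empty, which would match the "no matching entries in passwd file" message.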

justaugustus commented 5 years ago

+1 from capz...

NAMESPACE               NAME                                               READY   STATUS                 RESTARTS   AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
azure-provider-system   azure-provider-controller-manager-0                1/1     Running                0          2d10h   192.168.21.4   aug-020-1-controlplane-0   <none>           <none>
cluster-api-system      cluster-api-controller-manager-0                   0/1     CreateContainerError   0          2d10h   192.168.21.3   aug-020-1-controlplane-0   <none>           <none>
Events:
  Type     Reason  Age                        From                               Message
  ----     ------  ----                       ----                               -------
  Normal   Pulled  6m18s (x15463 over 2d10h)  kubelet, aug-020-1-controlplane-0  Container image "gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2" already present on machine
  Warning  Failed  87s (x15477 over 2d10h)    kubelet, aug-020-1-controlplane-0  (combined from similar events): Error: failed to create containerd container: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount172477538: no users found
vincepri commented 5 years ago

Looking at the base layers, it seems that the published image is based on an older distroless version:

vincepri commented 5 years ago

/assign @justinsb

vincepri commented 5 years ago

/close

Fixed in 0.1.3

k8s-ci-robot commented 5 years ago

@vincepri: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/999#issuecomment-500660246):

>/close
>
>Fixed in 0.1.3

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
akutz commented 5 years ago

Are you kidding me!? Talk about luck! I was literally just hitting this with the same problem. Thanks to @vincepri and @justaugustus for solving this for CAPV before we even knew it was an issue!

$ kubectl -n cluster-api-system describe pods
Name:               cluster-api-controller-manager-0
Namespace:          cluster-api-system
Priority:           0
PriorityClassName:  <none>
Node:               kind-control-plane/172.17.0.2
Start Time:         Sat, 15 Jun 2019 23:09:46 -0500
Labels:             control-plane=controller-manager
                    controller-revision-hash=cluster-api-controller-manager-78794b47d5
                    controller-tools.k8s.io=1.0
                    statefulset.kubernetes.io/pod-name=cluster-api-controller-manager-0
Annotations:        <none>
Status:             Pending
IP:                 10.244.0.6
Controlled By:      StatefulSet/cluster-api-controller-manager
Containers:
  manager:
    Container ID:  
    Image:         gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /manager
    Args:
      -v=6
    State:          Waiting
      Reason:       CreateContainerError
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  30Mi
    Requests:
      cpu:        100m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ccld5 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-ccld5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ccld5
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.alpha.kubernetes.io/notReady:NoExecute
                 node.alpha.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age               From                         Message
  ----     ------     ----              ----                         -------
  Normal   Scheduled  33s               default-scheduler            Successfully assigned cluster-api-system/cluster-api-controller-manager-0 to kind-control-plane
  Normal   Pulling    32s               kubelet, kind-control-plane  pulling image "gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2"
  Normal   Pulled     28s               kubelet, kind-control-plane  Successfully pulled image "gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2"
  Warning  Failed     28s               kubelet, kind-control-plane  Error: failed to create containerd container: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount226119184: no users found
  Warning  Failed     28s               kubelet, kind-control-plane  Error: failed to create containerd container: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount827729977: no users found
  Warning  Failed     14s               kubelet, kind-control-plane  Error: failed to create containerd container: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount820469373: no users found
  Normal   Pulled     3s (x3 over 28s)  kubelet, kind-control-plane  Container image "gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2" already present on machine
  Warning  Failed     3s                kubelet, kind-control-plane  Error: failed to create containerd container: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount844997377: no users found
akutz commented 5 years ago

After reading #1000, is there a good reason to use latest for the base image instead of an explicit tag? Yes, an explicit tag has to be updated by hand, but it also gives a deterministic build instead of relying on whatever latest points to at build time. Plus, explicit tags are just handy to use, as they are never subject to pull policies.
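
One way to act on the suggestion above would be a small check that flags base image references without an explicit tag. The following is only a sketch under stated assumptions: `is_pinned` is a hypothetical helper, the `example.com` references are made up, and the parsing is a simplification of real image reference syntax.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference names an explicit tag or a digest."""
    # A digest (name@sha256:...) identifies the image content exactly
    if "@" in image_ref:
        return True
    # Only the last path component can carry a tag; an earlier colon
    # belongs to a registry port such as localhost:5000
    last = image_ref.rsplit("/", 1)[-1]
    if ":" not in last:
        return False  # no tag given, so "latest" is implied
    tag = last.split(":", 1)[1]
    return tag not in ("", "latest")


print(is_pinned("gcr.io/k8s-cluster-api/cluster-api-controller:0.1.2"))  # True
print(is_pinned("example.com/base:latest"))  # False (hypothetical image)
print(is_pinned("localhost:5000/manager"))   # False (hypothetical image)
```

A check like this could run in CI against the `FROM` lines of a Dockerfile, so that a build never silently picks up whichever base image `latest` points to that day.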

lucming commented 2 years ago

Hi, I'm hitting the same problem. Can you tell me how you solved it in the end?