canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

while applying new deployment, microk8s showing container creating for an extended period #4293

Open nibrasmuhammed opened 1 year ago

nibrasmuhammed commented 1 year ago

Summary

Running MicroK8s for a long time triggers this issue: when I apply a new deployment, the pod stays in ContainerCreating for an extended period and the container is never created. Also, if I try to delete the pod, its status changes to pending (for deletion).

What Should Happen Instead?

microk8s is supposed to create the container

Reproduction Steps

  1. Run MicroK8s with some deployments for a long time
  2. Try to apply a new deployment

Introspection Report

Can you suggest a fix?

It was a problem with the Calico controller and Calico node. When I describe the pod, it shows me this. I tried deleting the Calico controller and Calico node pods; once k8s starts new ones, the container for the deployment I applied is created and runs.

Are you interested in contributing with a fix?

no

neoaggelos commented 1 year ago

Hi @nibrasmuhammed

Can you share an inspection tarball from the cluster where you are having an issue? Can you create a pod that gets stuck in "Creating" status, then share the output of kubectl describe pod $podname?

Thanks!

nibrasmuhammed commented 1 year ago

Hi @neoaggelos

here is the output of the command

Name:             admin-service-deployment-8b4cf579c-m98v8
Namespace:        default
Priority:         0
Service Account:  default
Node:             microk8s-vm/192.168.67.3
Start Time:       Mon, 13 Nov 2023 12:46:22 +0530
Labels:           app=admin-service
                  pod-template-hash=8b4cf579c
Annotations:      <none>
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/admin-service-deployment-8b4cf579c
Containers:
  admin-service-container:
    Container ID:  
    Image:         admin-service:local
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /adminService
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      ADMIN_SERVICE_PORT:  80
      LOG_FILE_NAME:       /tmp/usage.log
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rdw6f (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-rdw6f:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From               Message
  ----     ------                  ----                 ----               -------
  Normal   Scheduled               3m47s                default-scheduler  Successfully assigned default/admin-service-deployment-8b4cf579c-m98v8 to microk8s-vm
  Warning  FailedCreatePodSandBox  3m47s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e22b275a8d898409dc837f586c3faf66d3506102d12d26d6dbf60cdc831044a6": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
  Normal   SandboxChanged          0s (x18 over 3m46s)  kubelet            Pod sandbox changed, it will be killed and re-created.

neoaggelos commented 1 year ago

Hi @nibrasmuhammed

From the events of the pod, it looks like Calico has not yet started and configured itself on the node. What's the output of microk8s kubectl get pod -A? Can you also share an inspection tarball (microk8s inspect) so we can see whether something is wrong, or whether Calico simply takes some time to start?

Thanks!
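
The failure signature in the pod events above can also be picked out of kubectl describe output programmatically. The following is a minimal Python sketch, not part of MicroK8s or Calico: find_calico_auth_failures is a hypothetical helper name, and the pattern is assumed from the single event message quoted in this thread, so other Calico versions may word it differently.

```python
import re

# Assumption: this pattern mirrors the kubelet event quoted in this thread,
# where sandbox creation fails because the Calico CNI plugin's request to
# the API server was rejected as Unauthorized.
_CALICO_AUTH_FAILURE = re.compile(
    r'FailedCreatePodSandBox.*plugin type="calico" failed \(add\):.*Unauthorized'
)


def find_calico_auth_failures(describe_output):
    """Return the event lines that match the Calico 'Unauthorized' failure."""
    return [
        line.strip()
        for line in describe_output.splitlines()
        if _CALICO_AUTH_FAILURE.search(line)
    ]
```

An empty result suggests the pod is stuck for some other reason, such as a missing image or an unbound volume, and the Calico discussion here may not apply.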

nibrasmuhammed commented 1 year ago

Hi @neoaggelos

Both calico-kube-controllers and calico-node are running fine. Once I got this error, I killed both calico-node and calico-kube-controllers, and it worked as expected.

If I don't delete both Calico pods, the container I am trying to create never gets created.
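
The workaround described here (deleting the Calico pods so that Kubernetes recreates them) could be scripted roughly as follows. This is a sketch under assumptions: the label selectors k8s-app=calico-node and k8s-app=calico-kube-controllers and the kube-system namespace come from the upstream Calico manifest and are not confirmed in this thread, and calico_restart_commands and restart_calico are hypothetical helper names. It only reproduces the workaround; it does not address why the plugin became unauthorized.

```python
import subprocess

# Assumption: these labels match the Calico pods on the affected cluster.
# Verify with `microk8s kubectl get pod -A --show-labels` before relying on them.
CALICO_SELECTORS = ("k8s-app=calico-node", "k8s-app=calico-kube-controllers")


def calico_restart_commands(namespace="kube-system"):
    """Build one `kubectl delete pod` command per Calico component."""
    return [
        ["microk8s", "kubectl", "delete", "pod", "-n", namespace, "-l", selector]
        for selector in CALICO_SELECTORS
    ]


def restart_calico(run=subprocess.run, namespace="kube-system"):
    """Delete the Calico pods; their controllers recreate them afterwards."""
    for command in calico_restart_commands(namespace):
        # check=True raises CalledProcessError if kubectl exits non-zero.
        run(command, check=True)
```

Calling restart_calico() on the affected node runs both deletions. The run parameter exists only so the commands can be inspected or dry-run without touching a cluster.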

neoaggelos commented 1 year ago

Unfortunately I cannot help further without some more logs from the services themselves. Can you share an inspection report? There should be something related in the logs of containerd, at the very least. Thanks!

nibrasmuhammed commented 1 year ago

Hi @neoaggelos, please provide the full details of the logs you require and the steps or commands to get them. Thanks!

neoaggelos commented 1 year ago

Hi @nibrasmuhammed

Please share an inspection tarball, which you can create by running microk8s inspect

nibrasmuhammed commented 1 year ago

Hi @neoaggelos, please find the attachment.

inspection-report-20231115_171902.tar.gz

tszyszko commented 12 months ago

I seem to be getting into this state reliably a few hours after creating a fresh cluster, and I'm not really sure where to look.

Warning FailedCreatePodSandBox 83s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "62356903f11f2c438f14ad416a895c825b474548932d1a5c64e2030b7ed9e9e1": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
Normal SandboxChanged 2s (x7 over 83s) kubelet Pod sandbox changed, it will be killed and re-created.

tszyszko commented 12 months ago

In my case it seems related to the service account token: "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
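
Since Kubernetes service account tokens are JWTs whose payload carries an exp claim, the "token has expired" message can be checked against the token itself. The following is a minimal Python sketch: jwt_expiry and is_expired are hypothetical helper names, and it assumes the token string has already been read from wherever the CNI plugin keeps its credentials (that location is not given in this thread). It only decodes the payload and does not verify the signature.

```python
import base64
import json
from datetime import datetime, timezone


def jwt_expiry(token):
    """Return the expiry time (UTC) from a JWT's `exp` claim."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated parts")
    payload = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return datetime.fromtimestamp(claims["exp"], tz=timezone.utc)


def is_expired(token, now=None):
    """True if the token's expiry is at or before `now` (default: current time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return jwt_expiry(token) <= now
```

If is_expired returns True for the token the CNI plugin is still using, that would be consistent with the Unauthorized errors appearing some hours after startup and disappearing once the Calico pods are recreated.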

tszyszko commented 12 months ago

This appears to have been caused by remnants of the older version of Calico that ships with MicroK8s (described in /var/snap/microk8s/current/args/cni-network/cni.yaml) being left behind after I attempted to upgrade. It seems to work for a while, until the new Calico presumably tries to refresh its token. Ensuring that the old version was properly removed fixed my issues.

nibrasmuhammed commented 12 months ago

Hi @tszyszko, I installed MicroK8s with Homebrew on my Mac and usually keep it up to date, but I am still having the same problem. Could you please elaborate on the steps you followed?

thanks!

itsyoshio commented 6 months ago

@tszyszko +1

PeterOscarsson commented 2 months ago

@tszyszko, would it be possible to share the steps you took to solve this, or was this too long ago?

TIA

tszyszko commented 2 months ago

Sorry, I can't remember the exact steps I took here, as it was a while back.