senthilrch / kube-fledged

A Kubernetes operator for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly
Apache License 2.0

Failed To Create Pod Sandbox : Context Deadline Exceeded #194

Closed ChevronTango closed 1 year ago

ChevronTango commented 1 year ago

We are seeing an issue in our cluster where the jobs created by the image cache get stuck in PodInitializing and then fail with the following events:

Normal   Scheduled   4m5s   default-scheduler    Successfully assigned fledged/my-cache-c4tw5-qfdr5 to ip-10-1-1-118.eu-west-2.compute.internal
Warning   FailedCreatePodSandBox    5s    kubelet    Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning   FailedCreatePodSandBox    5s    kubelet    Failed to create pod sandbox: rpc error code = Unknown desc = failed to reserve sandbox name "my-cache-c4tw5-qfdr5_fledged_46f063f3-e6ef-4160-a0c2-c4c9458ae967_0" is reserved for "62d706b05a7321e9067f83cf5db9b8ec1714a38668a01eb357a47895bc3e7980"

Some brief googling suggested this is related to containerd, but I have been unable to find the source of the issue.

We are running EKS with Kubernetes 1.21, Ubuntu nodes, and containerd 1.5.8.

It's worth adding that it's not just the jobs that fail. We are seeing several pods in our cluster fail with this same error since installing fledged.

senthilrch commented 1 year ago

The error you have reported is not due to kube-fledged.