canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

"Failed to create existing container" errors in microk8s.daemon-kubelet category #2659

Closed · joes closed this issue 2 years ago

joes commented 2 years ago

I am getting a lot of "Failed to create existing container" errors in the logs.

$ sudo journalctl --since "1 minute ago" | grep "Failed to create existing container"

Oct 15 09:23:58 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:23:58.797742  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/podba7a2489-8613-4c1c-9262-1189d13dee46/05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6: task 05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6 not found: not found
Oct 15 09:24:00 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:00.304403  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod7b178d8b-4a05-47ec-88dd-7151d36a9ca1/c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1: task c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1 not found: not found
Oct 15 09:24:01 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:01.819916  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod14d87a31-8eb5-4268-850f-4cb03b7d5d95/2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2: task 2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2 not found: not found
Oct 15 09:24:03 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:03.325159  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod7b178d8b-4a05-47ec-88dd-7151d36a9ca1/80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e: task 80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e not found: not found
Oct 15 09:24:04 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:04.835003  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod458cfa30-2c7d-4c0d-928e-3482f5e3cdb5/339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340: task 339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340 not found: not found
Oct 15 09:24:06 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:06.343129  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod1022de4b-c236-4cb2-9f4b-ec7035fb0848/b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50: task b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50 not found: not found
Oct 15 09:24:07 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:07.854916  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod458cfa30-2c7d-4c0d-928e-3482f5e3cdb5/7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523: task 7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523 not found: not found
Oct 15 09:24:09 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:09.360865  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod14d87a31-8eb5-4268-850f-4cb03b7d5d95/ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59: task ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59 not found: not found
Oct 15 09:24:10 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:10.867164  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod1022de4b-c236-4cb2-9f4b-ec7035fb0848/d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193: task d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193 not found: not found
Oct 15 09:24:12 khsrv1124.bro.intra microk8s.daemon-kubelet[635728]: E1015 09:24:12.372701  635728 manager.go:1123] Failed to create existing container: /kubepods/besteffort/podba7a2489-8613-4c1c-9262-1189d13dee46/cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1: task cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1 not found: not found

If I parse the logs and feed the IDs to microk8s ctr c info, I find these containers and related images:

```bash
sudo journalctl --since "10 minutes ago" \
  | grep "Failed to create existing container" \
  | grep "task" \
  | awk -F ': task ' '{print $2}' \
  | awk -F ' not found: ' '{print $1}' \
  | sort -u \
  | xargs -I {} -n 1 bash -c "echo Failed to create existing container {}; microk8s ctr c info {} | grep {}; echo ''"
```

Output:

Failed to create existing container 05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6
    "ID": "05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6",
    "SnapshotKey": "05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6/shm",
            "io.kubernetes.cri.sandbox-id": "05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6",
            "cgroupsPath": "/kubepods/besteffort/podba7a2489-8613-4c1c-9262-1189d13dee46/05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6",

Failed to create existing container 2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2
    "ID": "2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2",
    "SnapshotKey": "2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2",
            "cgroupsPath": "/kubepods/besteffort/pod14d87a31-8eb5-4268-850f-4cb03b7d5d95/2b4137ed16a3730122d7bab6d9be9d71db18644a97fd6da7e7c1874ea544e7f2",

Failed to create existing container 339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340
    "ID": "339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340",
    "SnapshotKey": "339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340",
            "cgroupsPath": "/kubepods/besteffort/pod458cfa30-2c7d-4c0d-928e-3482f5e3cdb5/339e0340b8c7a287579bbeb8d009e0921028121094619c4718e2f1f45c6ba340",

Failed to create existing container 7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523
    "ID": "7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523",
    "SnapshotKey": "7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523/shm",
            "io.kubernetes.cri.sandbox-id": "7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523",
            "cgroupsPath": "/kubepods/besteffort/pod458cfa30-2c7d-4c0d-928e-3482f5e3cdb5/7952e594421cbed077df584da648b00cbe6394be9c45056eb12f4ee3763e7523",

Failed to create existing container 80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e
    "ID": "80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e",
    "SnapshotKey": "80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e/shm",
            "io.kubernetes.cri.sandbox-id": "80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e",
            "cgroupsPath": "/kubepods/besteffort/pod7b178d8b-4a05-47ec-88dd-7151d36a9ca1/80c83f325c08fddfccebbe084e59c51ff12e4dfc746d2ee4fa536c515087bd1e",

Failed to create existing container ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59
    "ID": "ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59",
    "SnapshotKey": "ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59/shm",
            "io.kubernetes.cri.sandbox-id": "ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59",
            "cgroupsPath": "/kubepods/besteffort/pod14d87a31-8eb5-4268-850f-4cb03b7d5d95/ac8ab1a1052f2fe18904fff79f25f1042d6f38245b7c77652f6205e3944dbe59",

Failed to create existing container b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50
    "ID": "b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50",
    "SnapshotKey": "b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50/shm",
            "io.kubernetes.cri.sandbox-id": "b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50",
            "cgroupsPath": "/kubepods/besteffort/pod1022de4b-c236-4cb2-9f4b-ec7035fb0848/b6ccafa032c546398279143c14ecab1f65775529ee70224dd9865d5fab8f4b50",

Failed to create existing container c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1
    "ID": "c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1",
    "SnapshotKey": "c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1",
            "cgroupsPath": "/kubepods/besteffort/pod7b178d8b-4a05-47ec-88dd-7151d36a9ca1/c68f13c9a5a1e287e3c5ad0098a046d0adac4bb04a324a71e2a0640a80096fc1",

Failed to create existing container cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1
    "ID": "cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1",
    "SnapshotKey": "cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1",
            "cgroupsPath": "/kubepods/besteffort/podba7a2489-8613-4c1c-9262-1189d13dee46/cdd705b7c22879b089804277e062ecfef68eed1efce87e9262d6bea9b1890ce1",

Failed to create existing container d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193
    "ID": "d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193",
    "SnapshotKey": "d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193",
            "cgroupsPath": "/kubepods/besteffort/pod1022de4b-c236-4cb2-9f4b-ec7035fb0848/d6e8d65d0b59b952f7d87d869a525165f654df861397ebd7049a82a0b96ab193",

What might be the cause of this? Any help or pointers are much appreciated.
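
As far as I understand, the message comes from cAdvisor inside the kubelet, which still tracks a container whose containerd task no longer exists. A minimal sketch to cross-check one of the reported IDs against containerd's live task list (assuming the stock microk8s ctr wrapper; the ID below is just copied from the first log line above):

```bash
# Sketch: does containerd still have a task for this container ID?
ID=05416524c27a64edd53d7456651f27ede44d1319201ee5ced4a5748780e079a6  # taken from the logs above
microk8s ctr tasks ls | grep "$ID" \
  || echo "no containerd task for $ID (consistent with the kubelet error)"
```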

$ microk8s kubectl version

```
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.11-34+7402e007632498", GitCommit:"7402e007632498c9b5b4f9da672aa2be7b382f2a", GitTreeState:"clean", BuildDate:"2021-09-28T12:30:58Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.11-34+7402e007632498", GitCommit:"7402e007632498c9b5b4f9da672aa2be7b382f2a", GitTreeState:"clean", BuildDate:"2021-09-28T12:34:15Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
```

$ microk8s status

```
microk8s is running
high-availability: yes
  datastore master nodes: 10.10.11.101:19001 10.10.11.28:19001 10.10.11.110:19001
  datastore standby nodes: none
addons:
  enabled:
    dashboard        # The Kubernetes dashboard
    dns              # CoreDNS
    ha-cluster       # Configure high availability on the current node
    ingress          # Ingress controller for external access
    metrics-server   # K8s Metrics Server for API access to service metrics
    rbac             # Role-Based Access Control for authorisation
    storage          # Storage class; allocates storage from host directory
```

$ microk8s kubectl get nodes

```
NAME                  STATUS   ROLES    AGE   VERSION
khsrv1094             Ready    <none>   30d   v1.20.11-34+7402e007632498
khsrv1124.bro.intra   Ready    <none>   30d   v1.20.11-34+7402e007632498
khsrv1131.bro.intra   Ready    <none>   30d   v1.20.11-34+7402e007632498
```
joes commented 2 years ago

I "fixed" the issue by draining, restarting server, and uncordoning all nodes.

An "interesting" (or totally expected thing) is that microk8s ctr c info [id] no longer provides any information about the containers that failed to create previously.

```bash
sudo journalctl --since "10 minutes ago" \
  | grep "Failed to create existing container" \
  | grep "task" \
  | awk -F ': task ' '{print $2}' \
  | awk -F ' not found: ' '{print $1}' \
  | sort -u \
  | xargs -I {} -n 1 bash -c "echo Failed to create existing container {}; microk8s ctr c list | grep {}; echo ''"
```

Output:

Failed to create existing container 1da2b0e7abac49f5b0a59f625410d02e5d09b23588f5a8136eba980971e7a772

Failed to create existing container 286ebbf78e48b36dd32f01a7324dcebb424c2a09109eaea273b321d27310801e

Failed to create existing container 37bb5ef545f621987209d3abe1e877edbb1ff6243ba4c5eeb3bc2fbb7affeb15

Failed to create existing container 4b48154390c73e14da20f37624dd37b15449276639f98c8ac2eb822914fbf115

Failed to create existing container 5b1d0ac2dbd040707934b772722a4ffc777480c60576f0bf11be300e2e3e567a

Failed to create existing container 5e5558e9f2a4b3c12a08114945c747ec3254ced3cd811d68c6cb6d22bdda67c3

Failed to create existing container 6ced94e2753519017411d4b0d05cc7f51576b15d63eb1024483ef7e571ba458c

Failed to create existing container 7a999c211e5643861782b80a0d4369bac8db1e5fa80eb1b08d1b123ac7c12799

Failed to create existing container 8a5a50caef481ad5f099b4c7d18daae34caf97220e8cc8702f50d7833f515866

Failed to create existing container 9bc948c868d2e39cba5d3169a2590b10f4cb5bd0c5860a5504ea2e8649f1090b

Failed to create existing container c96c113358b1370733c3ae41af416045292d7580887c542dcd00d7d04ef124e9

Failed to create existing container d6f1b4c1e7f6597f72a8c54c2ba7b89e51356bdc52263e99c457e39480465b7b

Failed to create existing container ec631ea825dbce4496ca8be2c12a89b5f55a0489f456344128d84c2d972e104a

Failed to create existing container edd488456797599d06c16aefcd2e92ac5c0c6e827b9eafe7ca01f1145ce20e4a

So some (possibly outdated) container state seems to have been cleared or reset by the restart.

joes commented 2 years ago

Seems to be something temporary. Cannot reproduce, so I am closing this.

joes commented 2 years ago

Once again I am seeing the following messages in the logs on all three nodes in my cluster. This was on channel=1.20/stable, and I have just refreshed one of the nodes to channel=1.20/edge, but the error persists.

Dec 07 12:59:42 khsrv1094 microk8s.daemon-kubelet[2848941]: E1207 12:59:42.024826 2848941 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/efc106f59b02f818827686943cab5c074078ebdedd5c9cd556faee9850788a5c: task efc106f59b02f818827686943cab5c074078ebdedd5c9cd556faee9850788a5c not found: not found

So, this is not something temporary but recurring. What might this be about?

Last time this error stopped after rebooting the server. Do I need to schedule a nightly restart?
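
If it really came to that, the crude way would be a plain cron entry like the sketch below (purely illustrative; the file path and 03:00 time are arbitrary), but that hardly feels like a real fix.

```bash
# /etc/cron.d/nightly-reboot  -- illustrative only; path and time are arbitrary
0 3 * * * root /sbin/shutdown -r now
```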

joes commented 2 years ago

More information about some of the failed containers, obtained via microk8s ctr c info [id]:

Failed to create existing container 193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3
    "ID": "193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",
    "SnapshotKey": "193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",
            "cgroupsPath": "/kubepods/besteffort/pod33c9ff61-13b4-4fc0-a810-d049f33fa24f/193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",

Failed to create existing container 4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55
    "ID": "4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",
    "SnapshotKey": "4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",
            "cgroupsPath": "/kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",

Failed to create existing container 930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e
    "ID": "930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",
    "SnapshotKey": "930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",
            "cgroupsPath": "/kubepods/besteffort/pod3eeface5-9b4f-41af-b2c6-d027a2bfb26a/930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",

Failed to create existing container 9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90
    "ID": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
    "SnapshotKey": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90/shm",
            "io.kubernetes.cri.sandbox-id": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
            "cgroupsPath": "/kubepods/besteffort/pod3eeface5-9b4f-41af-b2c6-d027a2bfb26a/9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",

Failed to create existing container ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002
    "ID": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
    "SnapshotKey": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002/shm",
            "io.kubernetes.cri.sandbox-id": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
            "cgroupsPath": "/kubepods/besteffort/pod33c9ff61-13b4-4fc0-a810-d049f33fa24f/ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",

Failed to create existing container c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4
    "ID": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
    "SnapshotKey": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
                "source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4/shm",
            "io.kubernetes.cri.sandbox-id": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
            "cgroupsPath": "/kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
Azbesciak commented 2 weeks ago

I have been seeing the same behavior for 2 weeks. Any solution?

joes commented 2 weeks ago

None that I know of. I observed that rebooting the nodes helped temporarily (but the messages reappeared later on).