I "fixed" the issue by draining the nodes, restarting the server, and uncordoning all nodes.
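Roughly this, per node, where <node> is a placeholder and the right drain flags depend on your workloads:
$ microk8s kubectl drain <node> --ignore-daemonsets
$ sudo reboot
$ microk8s kubectl uncordon <node>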
An "interesting" (or perhaps totally expected) thing is that microk8s ctr c info [id]
no longer returns any information for the containers that previously failed to create.
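The pipeline below pulls the failing container IDs out of the journal and cross-checks each one against microk8s ctr c list: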
$ sudo journalctl --since "10 minutes ago" \
> | grep "Failed to create existing container" \
> | grep "task" \
> | awk -F ': task ' '{print $2}' \
> | awk -F ' not found: ' '{print $1}' \
> | sort -u \
> | xargs -I {} -n 1 bash -c "echo Failed to create existing container {}; microk8s ctr c list | grep {}; echo ''"
Failed to create existing container 1da2b0e7abac49f5b0a59f625410d02e5d09b23588f5a8136eba980971e7a772
Failed to create existing container 286ebbf78e48b36dd32f01a7324dcebb424c2a09109eaea273b321d27310801e
Failed to create existing container 37bb5ef545f621987209d3abe1e877edbb1ff6243ba4c5eeb3bc2fbb7affeb15
Failed to create existing container 4b48154390c73e14da20f37624dd37b15449276639f98c8ac2eb822914fbf115
Failed to create existing container 5b1d0ac2dbd040707934b772722a4ffc777480c60576f0bf11be300e2e3e567a
Failed to create existing container 5e5558e9f2a4b3c12a08114945c747ec3254ced3cd811d68c6cb6d22bdda67c3
Failed to create existing container 6ced94e2753519017411d4b0d05cc7f51576b15d63eb1024483ef7e571ba458c
Failed to create existing container 7a999c211e5643861782b80a0d4369bac8db1e5fa80eb1b08d1b123ac7c12799
Failed to create existing container 8a5a50caef481ad5f099b4c7d18daae34caf97220e8cc8702f50d7833f515866
Failed to create existing container 9bc948c868d2e39cba5d3169a2590b10f4cb5bd0c5860a5504ea2e8649f1090b
Failed to create existing container c96c113358b1370733c3ae41af416045292d7580887c542dcd00d7d04ef124e9
Failed to create existing container d6f1b4c1e7f6597f72a8c54c2ba7b89e51356bdc52263e99c457e39480465b7b
Failed to create existing container ec631ea825dbce4496ca8be2c12a89b5f55a0489f456344128d84c2d972e104a
Failed to create existing container edd488456797599d06c16aefcd2e92ac5c0c6e827b9eafe7ca01f1145ce20e4a
So something (possibly outdated) regarding these containers seems to have been cleared/reset by the restart.
Seems to be something temporary. Cannot reproduce, so closing.
Once again I am seeing the following messages in the logs on all three nodes in my cluster. This was on channel=1.20/stable, and I have just now refreshed one of the nodes to channel=1.20/edge, but the error persists.
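For reference, the channel switch is the standard snap refresh:
$ sudo snap refresh microk8s --channel=1.20/edge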
Dec 07 12:59:42 khsrv1094 microk8s.daemon-kubelet[2848941]: E1207 12:59:42.024826 2848941 manager.go:1123] Failed to create existing container: /kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/efc106f59b02f818827686943cab5c074078ebdedd5c9cd556faee9850788a5c: task efc106f59b02f818827686943cab5c074078ebdedd5c9cd556faee9850788a5c not found: not found
So this is not something temporary after all; it keeps recurring. What might this be about?
Last time the error stopped after rebooting the server. Do I need to schedule a nightly restart?
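For what it's worth, the message comes from the kubelet's embedded cAdvisor (note manager.go in the log line), which walks the existing cgroups under /kubepods and asks the runtime for the matching task; "task ... not found" suggests the cgroup outlived the containerd task. A quick cross-check, assuming cgroup v1 with the cgroupfs driver (path and ID taken from the log line above):
$ ls /sys/fs/cgroup/memory/kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d
$ microk8s ctr t ls | grep efc106f59b02f818827686943cab5c074078ebdedd5c9cd556faee9850788a5c
If the directory is there but the task is not, the two have simply gone out of sync.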
More information about some of the failed containers, obtained via microk8s ctr c info [id]:
Failed to create existing container 193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3
"ID": "193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",
"SnapshotKey": "193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",
"cgroupsPath": "/kubepods/besteffort/pod33c9ff61-13b4-4fc0-a810-d049f33fa24f/193481ca9fe296eeb61d8d18ab654044542f1543a763152f2c48232286d1b4b3",
Failed to create existing container 4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55
"ID": "4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",
"SnapshotKey": "4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",
"cgroupsPath": "/kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/4543654fb7613e0fa3cfa4655708d80ef1ca2b77af5c4081a9c8c95a4bc15f55",
Failed to create existing container 930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e
"ID": "930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",
"SnapshotKey": "930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",
"cgroupsPath": "/kubepods/besteffort/pod3eeface5-9b4f-41af-b2c6-d027a2bfb26a/930dbe725a0a48d92d52575f3fab84838b3c60817f1253bfc8c2b4dda5874f2e",
Failed to create existing container 9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90
"ID": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
"SnapshotKey": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
"source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90/shm",
"io.kubernetes.cri.sandbox-id": "9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
"cgroupsPath": "/kubepods/besteffort/pod3eeface5-9b4f-41af-b2c6-d027a2bfb26a/9c4af8286c784769c3c95d949808adc15a8a008b85a5661949fcab32693bfb90",
Failed to create existing container ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002
"ID": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
"SnapshotKey": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
"source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002/shm",
"io.kubernetes.cri.sandbox-id": "ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
"cgroupsPath": "/kubepods/besteffort/pod33c9ff61-13b4-4fc0-a810-d049f33fa24f/ba3d4c141ddf381ba7645aff7dfd8cf41340276594525d176bbf3ea2fed4a002",
Failed to create existing container c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4
"ID": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
"SnapshotKey": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
"source": "/var/snap/microk8s/common/run/containerd/io.containerd.grpc.v1.cri/sandboxes/c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4/shm",
"io.kubernetes.cri.sandbox-id": "c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
"cgroupsPath": "/kubepods/besteffort/pod3cd19c0e-10ea-42bf-b36f-e11918a9e92d/c691e4f25ed3b6eebd80579881debf7560794956f1e022bb555be5b11c1579e4",
I have been seeing the same behavior for 2 weeks. Any solution?
None that I know of. I observed that rebooting the nodes helped temporarily (but the messages reappeared later on).
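If a reboot is what clears it, restarting just the kubelet might be a cheaper thing to try first; a sketch, assuming the pre-1.21 layout where MicroK8s runs kubelet as its own snap service:
$ sudo systemctl restart snap.microk8s.daemon-kubelet
If the messages stop after that, the stale state is probably on the kubelet/cAdvisor side rather than in containerd.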
I am getting a lot of "Failed to create existing container" errors in the logs.
$ sudo journalctl --since "1 minute ago" | grep "Failed to create existing container"
If I parse the logs and feed the IDs to
microk8s ctr c info
I find these containers and related images. Output:
What might be the cause of this? Any help or pointers are much appreciated.
$ microk8s kubectl version
$ microk8s status
$ microk8s kubectl get nodes