fluxcd / flagger

Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)
https://docs.flagger.app
Apache License 2.0

Error reported about worker node containerd when using flagger #1551

Open ForcemCS opened 7 months ago

ForcemCS commented 7 months ago

```
Nov 17 10:53:11 node03 containerd[753]: 2023-11-17 10:53:11.835 [INFO][2282335] k8s.go 489: Wrote updated endpoint to datastore ContainerID="e193a155ce714a668822bb617657111989ac552ea6b822a477870264717ebc65" Namespace="instavote" Pod="vote-cbbdf4b99-5f4bs" WorkloadEndpoint="node03-k8s-vote--cbbdf4b99--5f4bs-eth0"
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.859776246+08:00" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.859853607+08:00" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.859869287+08:00" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.859948135+08:00" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/e193a155ce714a668822bb617657111989ac552ea6b822a477870264717ebc65 pid=2282378 runtime=io.containerd.runc.v2
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.932893809+08:00" level=warning msg="error from *cgroupsv2.Manager.EventChan" error="failed to create inotify fd"
Nov 17 10:53:11 node03 containerd[753]: time="2023-11-17T10:53:11.933311784+08:00" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:vote-cbbdf4b99-5f4bs,Uid:d52c6c5c-d3e7-4b62-9ac6-fe8715405580,Namespace:instavote,Attempt:0,} returns sandbox id \"e193a155ce714a668822bb617657111989ac552ea6b822a477870264717ebc65\""
```

Description of the problem

I'm currently following the official Flagger guide. Here's the situation: in the middle of a Blue/Green deployment, the analysis fails and triggers a rollback. But when I run `kubectl -n instavote set image deploy...` again to trigger a new canary run, some of the worker nodes report the error above (`failed to create inotify fd`), and the problem occurs randomly. How can I fix this issue?
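For context, the rollback-and-retry flow described above corresponds to a Flagger `Canary` resource roughly like the following. This is only a sketch based on the Deployment name (`vote`) and namespace (`instavote`) visible in the logs; the service port and all analysis values are illustrative assumptions, not taken from the original report:

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: vote
  namespace: instavote
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vote          # Deployment seen in the containerd logs
  service:
    port: 80            # assumed port, adjust to the actual service
  analysis:
    # In Flagger, specifying iterations without stepWeight/maxWeight
    # selects Blue/Green mode: traffic is switched all at once after
    # the analysis passes. Values below are illustrative.
    interval: 30s
    iterations: 5
    threshold: 2        # failed checks before rollback is triggered
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
```

A new analysis run is then started by updating the Deployment's pod template, e.g. with `kubectl set image` as described above; Flagger detects the change and begins a fresh Blue/Green rollout.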

stefanprodan commented 7 months ago

You may want to reach out to the containerd folks. Flagger doesn't interact with containerd; the actual pods are managed by Kubernetes Deployments and the kubelet.