Open amitpd opened 1 year ago
I am also facing the same issue on an Amazon EKS cluster (1.27); it works correctly on v1.24.
Same here! With litmus 3.0.0-beta8 (also reproduced on 3.0.0-beta7)
on EKS 1.27; it was working fine on 1.26.
Might this be related to the --container-runtime kubelet flag, deprecated since 1.24 and removed in 1.27?
See the Kubernetes release notes.
Able to reproduce on Minikube + containerd + litmus 3.0.0-beta8
All chaos experiments requiring container runtime working fine
case 2 : error group running kubernetes 1.27
Helper instantly killed
We'll keep our clusters on Kubernetes 1.26.x (below 1.27) for now, but could the Harness/Litmus team please have a look at https://kubernetes.io/blog/2023/03/17/upcoming-changes-in-kubernetes-v1-27/#removal-of-container-runtime-command-line-argument
This is fixed in 3.0.0-beta10 via https://github.com/litmuschaos/litmus-go/pull/665
and in 2.14.1 via https://github.com/litmuschaos/litmus-go/pull/669
Based on the PRs, how does deleting labels fix the issue? The release notes mention a kubelet flag, but I don't see how that would affect starting the helper pods via the k8s API.
EDIT: Or is it related to the standard labels that are added to pods since 1.27?
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#api-change-4
Pods owned by a Job now uses the labels batch.kubernetes.io/job-name and batch.kubernetes.io/controller-uid. The legacy labels job-name and controller-uid are still added for compatibility. (#114930, @kannon92)
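For anyone following along, here is a minimal sketch of what "deleting labels" could look like. It assumes the helper pod copies its labels from the Job-owned experiment pod, so that it ends up carrying the Job's own labels. stripJobLabels is a hypothetical name for illustration, not the actual litmus-go code; the label keys are the ones listed in the changelog entry above.

```go
package main

import "fmt"

// jobManagedLabels lists the label keys that the Job controller sets on its pods.
// The first two are the legacy keys; the prefixed pair is added as of Kubernetes 1.27.
var jobManagedLabels = []string{
	"job-name",
	"controller-uid",
	"batch.kubernetes.io/job-name",
	"batch.kubernetes.io/controller-uid",
}

// stripJobLabels (hypothetical helper) returns a copy of labels without the
// Job-managed keys. The input map is left untouched.
func stripJobLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		out[k] = v
	}
	for _, k := range jobManagedLabels {
		delete(out, k)
	}
	return out
}

func main() {
	// Example labels as they might appear on a Job-owned experiment pod (values are made up).
	experimentPodLabels := map[string]string{
		"name":                               "pod-cpu-hog",
		"job-name":                           "pod-cpu-hog-vczplk",
		"controller-uid":                     "1234",
		"batch.kubernetes.io/job-name":       "pod-cpu-hog-vczplk",
		"batch.kubernetes.io/controller-uid": "1234",
	}

	helperLabels := stripJobLabels(experimentPodLabels)
	fmt.Println(helperLabels)
}
```

If that assumption holds, a fix that only removed the two legacy keys would still leave the new batch.kubernetes.io/ keys on the helper pod under 1.27, which would fit the behaviour reported here. That is my reading of the changelog, not something confirmed by the maintainers in this thread.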
@ksatchit can 2.14.1 be pushed to Docker Hub? The only other solution is moving to 3.x, which is a big change (and I have yet to get it fully working).
What happened: LitmusChaos tests not running properly on Kubernetes v1.27
What you expected to happen: LitmusChaos tests should run properly on Kubernetes v1.27
Where can this issue be corrected? (optional)
The issue is probably in the source code of litmuschaos/go-runner:2.14.0
How to reproduce it (as minimally and precisely as possible): Note: Followed the instructions as per https://litmuschaos.github.io/litmus/experiments/categories/pods/pod-cpu-hog/.
Deploy litmus operator v2.14.0
Deploy below ChaosExperiment:
Create below RBAC:
Deploy below ChaosEngine:
Anything else we need to know?:
Log of the pod-cpu-hog-vczplk-d5fsw pod created during the experiment:
Events from the Job that creates the pod-cpu-hog-vczplk-d5fsw pod:
It seems like the helper pod is getting deleted immediately after it is created.