Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
What happened:
Injected chaos can be orphaned in the cluster if a network helper pod is killed before it completes gracefully. Internally, the network experiment runs tc twice: once at the start to inject the latency and once at the end to revert it. If the helper pod is forcibly killed for any reason before the second call, the effect of the first tc command persists in the cluster until the affected pod is restarted.
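The two-step pattern described above can be sketched roughly as follows. This is not the Litmus helper's actual code: the interface name `eth0`, the `2000ms` delay, the 60-second duration, and the `run`, `inject_latency`, `revert_latency`, and `chaos` helpers are all hypothetical, chosen only to illustrate why a forced kill between the two tc calls leaves the latency in place. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of an inject/revert sequence built on two tc calls.
# Assumptions (not taken from the Litmus source): eth0, 2000ms and 60s
# are illustrative values; the function names are hypothetical.
# DRY_RUN=1 (the default here) prints each command instead of running it.

DRY_RUN="${DRY_RUN:-1}"
IFACE="${IFACE:-eth0}"
DELAY="${DELAY:-2000ms}"
DURATION="${DURATION:-60}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inject_latency() {
    # First tc call: attach a netem qdisc that delays egress traffic.
    run tc qdisc add dev "$IFACE" root netem delay "$DELAY"
}

revert_latency() {
    # Second tc call: delete the root qdisc, which removes the delay.
    run tc qdisc del dev "$IFACE" root
}

chaos() {
    inject_latency
    run sleep "$DURATION"
    # A SIGKILL that arrives before this point cannot be trapped,
    # so the revert below never runs and the netem qdisc stays attached.
    revert_latency
}
```

Calling `chaos` with the defaults prints the three commands in order. Running it for real would need `DRY_RUN=0`, root privileges, and execution inside the target pod's network namespace.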
What you expected to happen:
After the chaos duration, the injected chaos should always be reverted.
How to reproduce it (as minimally and precisely as possible):
Force-delete the helper pod (the pod that runs tc in the affected pod's network namespace) while the experiment is still running: kubectl delete pod --force --grace-period=0 <helper_pod_name>
The injected chaos is never reverted, even after the chaos duration has elapsed.
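One way to confirm the outcome of the reproduction is to check whether a netem qdisc is still attached to the affected pod's interface. The helper below is a hypothetical sketch: `has_netem` is not part of Litmus, and it assumes that `tc qdisc show` prints one qdisc per line starting with `qdisc <kind>`.

```shell
#!/bin/sh
# has_netem reads `tc qdisc show` output on stdin and succeeds
# (exit status 0) if any line describes a netem qdisc.
# Hypothetical helper for illustration only.
has_netem() {
    grep -q '^qdisc netem '
}
```

Run from inside the affected pod's network namespace, and assuming the interface is `eth0`, `tc qdisc show dev eth0 | has_netem && echo "latency still injected"` would report a leftover rule.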
Anything else we need to know?:
The container runtime is Docker.