What happened:
The ChaosEngine event ChaosInjected is not reflected in the GetExperimentRun API. The netem (network-latency) experiment generates a ChaosInject event when the fault is injected. However, with additional local logging I don't see this event being sent by the SendWorkflowUpdates function, and the phase and message are not reflected in the GetExperimentRun API either. The ChaosEngine executionData excerpt below, captured after the fault was actually injected, contains nothing about ChaosInject.
(Truncated)
{\"name\":\"pod-network-loss-1kj\",\"phase\":\"initialized\",\"message\":\"\",\"startedAt\":\"1726772307\",\"finishedAt\":\"\",\"children\":null,\"type\":\"ChaosEngine\",\"chaosData\":{\"engineUID\":\"5467912e-c942-49ec-8754-3fceb552242e\",\"engineContext\":\"\",\"engineName\":\"pod-network-loss-1kjktwcs\",\"namespace\":\"chaos-test-namespace\",\"experimentName\":\"pod-network-loss\",\"experimentStatus\":\"initialized\",\"lastUpdatedAt\":\"1726772335\",\"experimentVerdict\":\"N/A\",\"experimentPod\":\"Yet to be launched\",\"runnerPod\":\"pod-network-loss-1kjktwcs-runner\",\"probeSuccessPercentage\":\"0\",\"failStep\":\"\",\"chaosResult\":null}}},\"updatedBy\":\"YWRtaW4\"}"
What you expected to happen:
After the ChaosInjected event is emitted, the executionData returned by GetExperimentRun for the ChaosEngine type should reflect the phase and message accordingly. For example, at the very least the message should contain:
(Truncated)
{\"name\":\"pod-network-loss-1kj\",\"phase\":\"ChaosInject\",\"message\":\"Injected pod-network-loss-experiment chaos on application pods\",\"startedAt\":\"1726772307\",\"finishedAt\":\"\",\"children\":null,\"type\":\"ChaosEngine\",\"chaosData\":{\"engineUID\":\"5467912e-c942-49ec-8754-3fceb552242e\",\"engineContext\":\"\",\"engineName\":\"pod-network-loss-1kjktwcs\",\"namespace\":\"chaos-test-namespace\",\"experimentName\":\"pod-network-loss\",\"experimentStatus\":\"initialized\",\"lastUpdatedAt\":\"1726772335\",\"experimentVerdict\":\"N/A\",\"experimentPod\":\"Yet to be launched\",\"runnerPod\":\"pod-network-loss-1kjktwcs-runner\",\"probeSuccessPercentage\":\"0\",\"failStep\":\"\",\"chaosResult\":null}}},\"updatedBy\":\"YWRtaW4\"}"
Where can this issue be corrected? (optional)
How to reproduce it (as minimally and precisely as possible):
I can reproduce this on v3.9 and v3.10 by launching a simple network-loss or network-latency experiment and querying the GetExperimentRun API after the fault has actually been injected (the helper pod is running).
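The check performed in this reproduction can be sketched roughly as follows. This is a minimal illustration, not Litmus code: it assumes you have already pulled a single ChaosEngine node out of the executionData string and unescaped it, and the helper name `chaos_engine_status` and the two sample strings (trimmed from the excerpts in this report) are hypothetical.

```python
import json


def chaos_engine_status(node_json: str) -> dict:
    """Summarize one ChaosEngine node taken from executionData.

    Hypothetical helper: it only reads the fields visible in the
    excerpts of this report (type, phase, message, chaosData).
    """
    node = json.loads(node_json)
    if node.get("type") != "ChaosEngine":
        raise ValueError("not a ChaosEngine node")
    chaos_data = node.get("chaosData") or {}
    phase = node.get("phase", "")
    return {
        "phase": phase,
        "message": node.get("message", ""),
        "experimentStatus": chaos_data.get("experimentStatus", ""),
        # Assumption: an injected fault would surface as phase "ChaosInject".
        "injected": phase == "ChaosInject",
    }


# Trimmed from the "What happened" excerpt (what the API returns today).
OBSERVED = (
    '{"name": "pod-network-loss-1kj", "phase": "initialized", "message": "", '
    '"type": "ChaosEngine", "chaosData": {"experimentStatus": "initialized"}}'
)

# Trimmed from the "What you expected to happen" excerpt.
EXPECTED = (
    '{"name": "pod-network-loss-1kj", "phase": "ChaosInject", '
    '"message": "Injected pod-network-loss-experiment chaos on application pods", '
    '"type": "ChaosEngine", "chaosData": {"experimentStatus": "initialized"}}'
)

if __name__ == "__main__":
    print(chaos_engine_status(OBSERVED))
    print(chaos_engine_status(EXPECTED))
```

With the current behavior, `injected` stays false for the observed node even while the helper pod is running; the expected payload would make it true.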
Anything else we need to know?:
experimentStatus also doesn't seem to be consistent. After the fault is injected, experimentStatus is sometimes Initialized, sometimes Running, and sometimes empty (the last case occurs when I sleep 1s after install-chaos-fault; I'm not sure how that is related).
+1, we are having the same issue. We would like to notify users of a chaos experiment when the status changes to "chaos injected", but we are unable to do so because this information is not available via GraphQL.