litmuschaos / litmus

Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
https://litmuschaos.io
Apache License 2.0

Failed to unmarshal chaosengine with multiple chaos experiments #3318

Open gbuirel opened 3 years ago

gbuirel commented 3 years ago

What happened: I'm trying to generate a workflow using basic chaos experiments (pod CPU hog and pod memory hog) and a Prometheus probe. When I run a workflow with only one chaos experiment, it works fine. But when I try to run two chaos experiments (say cpu-hog, then memory-hog), I get the error "Error : failed to unmarshal chaosengine", even though the YAML validation says my YAML is fine.

A workflow with either of the two experiments on its own works; it fails only when I try to run more than one experiment. I've also tried running the queries in the observability panel, and they work fine.

The error seems similar to https://github.com/litmuschaos/litmus/issues/2767, but I'm using the latest Litmus version (Litmus Version: 2.1.0, Build Time: 14 Sep 2021 17:09:00).

Using chaosHub v2.2.x instead of v2.0

What you expected to happen: A workflow should be able to run multiple experiments.

How to reproduce it (as minimally and precisely as possible): Create a workflow using the ChaosHub experiments pod-cpu-hog and pod-memory-hog, each with its own Prometheus probe, and try to run it.

I will attach my workflow file.

gbuirel commented 3 years ago

Workflow file: workflow_2_chaosexp.txt

amityt commented 3 years ago

Hi @gbuirel, thanks for reporting this issue. I see in the description that you are using Litmus 2.1.0. The issue is caused by these two runProperties: probePollingInterval: "" and initialDelaySeconds: "". These properties are of integer type, but when probes are added they are set to an empty string. As a workaround, after adding the probes you can edit the YAML and remove these two runProperties.

This issue has been fixed with version 2.2.0. If possible, you can upgrade your ChaosCenter to 2.2.0 and it should work fine.
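For workflows with many probes, the manual workaround above could be scripted. The following is only a minimal sketch: it assumes the probe has already been parsed into a Python dict, and the helper name, the probe name, and the other runProperties shown are illustrative assumptions rather than anything taken from the attached workflow.

```python
# Hypothetical helper; only the two property names below come from this thread.
INTEGER_RUN_PROPERTIES = ("probePollingInterval", "initialDelaySeconds")


def strip_empty_run_properties(probe):
    """Return a copy of a probe dict without integer runProperties set to ""."""
    if "runProperties" not in probe:
        return dict(probe)
    cleaned = {
        key: value
        for key, value in probe["runProperties"].items()
        # Drop only the two integer-typed fields, and only when they are "".
        if not (key in INTEGER_RUN_PROPERTIES and value == "")
    }
    return {**probe, "runProperties": cleaned}


# Illustrative probe as it might look after being added through the UI (assumed layout).
probe = {
    "name": "check-latency",
    "type": "promProbe",
    "runProperties": {
        "probeTimeout": 5,
        "interval": 2,
        "retry": 1,
        "probePollingInterval": "",
        "initialDelaySeconds": "",
    },
}

fixed = strip_empty_run_properties(probe)
print(sorted(fixed["runProperties"]))  # ['interval', 'probeTimeout', 'retry']
```

The helper returns a new dict and leaves the input untouched, so the original probe definition can still be compared against the cleaned one.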

gbuirel commented 3 years ago

Thanks for pointing that out. I edited those properties out of the YAML and the workflow was able to start.

I'm deploying Litmus with the latest Helm release, so I don't think I can upgrade the ChaosCenter to 2.2.0?

spshuklaji commented 3 years ago

I have the following workflow. If I remove the third probe, the workflow works, but after adding the third probe it just gives this error:

Failed to create a new workflow. Error : failed to unmarshal chaosengine

probe.txt

Please let me know what is wrong with the workflow.
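It is not clear from the thread whether this failure has the same cause as the empty-string runProperties discussed earlier. One way to check would be to list every runProperties value that is an empty string, probe by probe. The sketch below assumes the probes have been parsed into a list of Python dicts; the function name and the sample probes are hypothetical and do not reflect the contents of probe.txt.

```python
def find_empty_run_properties(probes):
    """List (probe name, property) pairs whose runProperties value is ""."""
    problems = []
    for index, probe in enumerate(probes):
        # Fall back to a positional label when a probe has no name.
        name = probe.get("name", f"probe[{index}]")
        for key, value in probe.get("runProperties", {}).items():
            if value == "":
                problems.append((name, key))
    return problems


# Illustrative data only: the third probe carries an empty-string value.
probes = [
    {"name": "probe-1", "runProperties": {"probeTimeout": 5, "interval": 2}},
    {"name": "probe-2", "runProperties": {"probeTimeout": 5, "interval": 2}},
    {"name": "probe-3", "runProperties": {"probeTimeout": 5, "initialDelaySeconds": ""}},
]

for name, key in find_empty_run_properties(probes):
    print(f"{name}: {key} is an empty string")
```

An empty result would suggest the unmarshal error comes from something else in the third probe.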

LitmusChaos manifest file used is: kubectl apply -f https://raw.githubusercontent.com/litmuschaos/litmus/master/mkdocs/docs/2.2.0/litmus-2.2.0.yaml