chaostoolkit / chaostoolkit-addons

Chaos Toolkit addons (tolerances, controls) that can benefit everyone
https://chaostoolkit.org/
Apache License 2.0
4 stars 7 forks source link

If Safeguard probe is invalid, CTK just ignores it and continues running. #7

Open WixoLeo opened 3 years ago

WixoLeo commented 3 years ago

Here is an example of a simple experiment:


{
  "version": "1.0.0",
  "title": "test",
  "description": "test",
  "configuration": {
    "myarg": "exp.json",
    "another_arg": "another.json"
  },
  "tags": [
    "network",
    "lala"
  ],
  "controls": [
    {
      "name": "Safeguards",
      "provider": {
        "type": "python",
        "module": "chaosaddons.controls.safeguards",
        "arguments": {
          "probes": [
            {
              "name": "Non existing probe",
              "description": "Probe that doesn't exist",
              "type": "probe",
              "provider": {
                "type": "python",
                "module": "blablabla",
                "func": "bla",
                "arguments": {}
              },
              "tolerance": 2
            }
          ]
        }
      }
    }
  ],
  "steady-state-hypothesis": {
    "title": "Check something",
    "probes": [
      {
        "name": "My probe",
        "description": "My best probe",
        "tolerance": {
          "type": "regex",
          "pattern": "exp.json",
          "target": "stdout"
        },
        "type": "probe",
        "provider": {
          "type": "process",
          "path": "echo",
          "arguments": "${myarg}"
        }
      }
    ]
  },
  "method": [
    {
      "name": "My probe",
      "description": "My best probe",
      "tolerance": {
        "type": "regex",
        "pattern": "exp.json",
        "target": "stdout"
      },
      "type": "probe",
      "provider": {
        "type": "process",
        "path": "echo",
        "arguments": "${myarg}"
      }
    }
  ],
  "rollbacks": [
    {
      "name": "My probe",
      "description": "My best probe",
      "tolerance": {
        "type": "regex",
        "pattern": "exp.json",
        "target": "stdout"
      },
      "type": "probe",
      "provider": {
        "type": "process",
        "path": "echo",
        "arguments": "${myarg}"
      }
    }
  ]
}```

Notice the probe with module and function as "blabla"
Obviously it doesn't exists. But if you run the experiment, it just runs normally by ignoring that probe.
If I put the real probe there, it works fine.

I think the expected result would be some kind of an exception and abort everything.
We've been running experiments for a long time without knowing that we've had a non working probe.
Lawouach commented 3 years ago

Oh right. The ctk validator is unaware that this particular control uses probes to achieve its goal. So they are not validated.

I can't make the validator support that but I could bring the validation into the control itself, when we configure it.

WixoLeo commented 3 years ago

Oh right. The ctk validator is unaware that this particular control uses probes to achieve its goal. So they are not validated.

I can't make the validator support that but I could bring the validation into the control itself, when we configure it.

Sure, as long as it stops the experiment and throws an error, would be fantastic :)

Lawouach commented 3 years ago

Yeap. I can't work on this today but I'll try tomorrow, ping me otherwise :)

WixoLeo commented 3 years ago

Awesome thank you! @Lawouach

Lawouach commented 3 years ago

Hey @WixoLeo could you try the latest commit on master and see if that works?

WixoLeo commented 3 years ago

Hey @WixoLeo could you try the latest commit on master and see if that works?

Hi, sorry for late response, I tested it, and still the same problem for some reason

WixoLeo commented 3 years ago

Python:

__all__ = [
    "some_probe"
]

def some_probe():
    print("Hello mate")
    return True

exp.json


{
  "version": "1.0.0",
  "title": "test",
  "description": "test",
  "configuration": {
    "myarg": "exp.json",
    "another_arg": "another.json"
  },
  "variables": {
    "selected_instance": null
  },
  "tags": [
    "network",
    "lala"
  ],
  "controls": [
    {
      "name": "Safeguards",
      "provider": {
        "type": "python",
        "module": "chaosaddons.controls.safeguards",
        "arguments": {
          "probes": [
            {
              "name": "Non existing probe",
              "description": "Probe that doesn't exist",
              "type": "probe",
              "provider": {
                "type": "python",
                "module": "testoplugin.testo.probes1",
                "func": "some_probe",
                "arguments": {}
              },
              "tolerance": true
            }
          ]
        }
      }
    }
  ],
  "steady-state-hypothesis": {
    "title": "Check something",
    "probes": [
      {
        "name": "Non existing probe",
        "description": "Probe that doesn't exist",
        "type": "probe",
        "provider": {
          "type": "python",
          "module": "testoplugin.testo.probes",
          "func": "some_probe",
          "arguments": {}
        },
        "tolerance": true
      }
    ]
  },
  "method": [
    {
      "name": "My probe",
      "description": "My best probe",
      "tolerance": {
        "type": "regex",
        "pattern": "exp.json",
        "target": "stdout"
      },
      "type": "probe",
      "provider": {
        "type": "process",
        "path": "echo",
        "arguments": "${myarg}"
      }
    }
  ],
  "rollbacks": [
    {
      "name": "My probe",
      "description": "My best probe",
      "tolerance": {
        "type": "regex",
        "pattern": "exp.json",
        "target": "stdout"
      },
      "type": "probe",
      "provider": {
        "type": "process",
        "path": "echo",
        "arguments": "${myarg}"
      }
    }
  ]
}```
Lawouach commented 3 years ago

Found the reason why my fix wasn't enough. I'll have to update the ctk core lib and release it for this. On it.

Lawouach commented 3 years ago

Once this one is merged and released, i'll update this repo accordingly https://github.com/chaostoolkit/chaostoolkit-lib/pull/225

WixoLeo commented 3 years ago

Once this one is merged and released, i'll update this repo accordingly chaostoolkit/chaostoolkit-lib#225

Awesome, thank you @Lawouach, absolute pro!

Lawouach commented 3 years ago

Hey @WixoLeo I have updated the master branch with the changes from ctklib 1.20.0, feel free to give it a spin :)

WixoLeo commented 3 years ago

Hey @WixoLeo I have updated the master branch with the changes from ctklib 1.20.0, feel free to give it a spin :)

So I've tested it, and it seems like it's still not working.. I used ctk, ctk-lib and addons from master branch

Lawouach commented 3 years ago

Could you upload a ctk log file here and your experiment again?

WixoLeo commented 3 years ago

chaostoolkit.log

WixoLeo commented 3 years ago
{
  "version": "1.0.0",
  "title": "test",
  "description": "test",
  "configuration": {
    "myarg": "exp.json",
    "another_arg": "another.json"
  },
  "variables": {
    "selected_instance": null
  },
  "tags": [
    "network",
    "lala"
  ],
  "controls": [
    {
      "name": "Safeguards",
      "provider": {
        "type": "python",
        "module": "chaosaddons.controls.safeguards",
        "arguments": {
          "probes": [
            {
              "name": "Non existing probe",
              "description": "Probe that doesn't exist",
              "type": "probe",
              "provider": {
                "type": "python",
                "module": "bla",
                "func": "some_probe",
                "arguments": {}
              }
            }
          ]
        }
      }
    }
  ],
  "steady-state-hypothesis": {
    "title": "Check something",
    "probes": []
  },
  "method": [],
  "rollbacks": []
}
Lawouach commented 3 years ago

The log says:

[2021-09-01 13:47:08 DEBUG] [python:167] Control module '/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py' does not declare 'validate_control'

Maybe you're not updated properly?

WixoLeo commented 3 years ago

I've done a clean install, this happened:

(venv)  $ chaos run exp.json
zsh: /usr/local/bin/chaos: bad interpreter: /usr/local/opt/python/bin/python3.7: no such file or directory
[2021-09-01 14:53:05 INFO] Validating the experiment's syntax
Traceback (most recent call last):
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/bin/chaos", line 33, in <module>
    sys.exit(load_entry_point('chaostoolkit', 'console_scripts', 'chaos')())
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/leonidh/Documents/repos/ctk-tests/chaostoolkit/chaostoolkit/cli.py", line 246, in run
    ensure_experiment_is_valid(experiment)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaoslib/caching.py", line 76, in wrapped
    return f(**arguments)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaoslib/experiment.py", line 94, in ensure_experiment_is_valid
    validate_controls(experiment)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaoslib/control/__init__.py", line 121, in validate_controls
    validate_python_control(c)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaoslib/control/python.py", line 111, in validate_python_control
    func(control)
  File "/Users/leonidh/Documents/repos/ctk-tests/chaostoolkit-addons/chaosaddons/controls/safeguards.py", line 211, in validate_control
    validate_probes(probes)
  File "/Users/leonidh/Documents/repos/ctk-tests/chaostoolkit-addons/chaosaddons/controls/safeguards.py", line 353, in validate_probes
    ensure_activity_is_valid(probe)
  File "/Users/leonidh/Documents/repos/ctk-tests/venv/lib/python3.9/site-packages/chaoslib/activity.py", line 43, in ensure_activity_is_valid
    ref = activity.get("ref")
AttributeError: 'str' object has no attribute 'get'

If I remove the probe and leave just the control in, it's going good

Lawouach commented 3 years ago

Good and bad. Good because it uses now the chaosaddon as expected, bad because that chaosaddon master is broken in your case. Let me look.

Lawouach commented 3 years ago

I think this is fixed in 98bcd4a17ef4075408ceeea67a772fcf420c76a4

WixoLeo commented 3 years ago

Amazing! It's working! Thank you very much! @Lawouach Absolute pro!