cea-hpc / milkcheck

Highly parallel and flexible service manager.

Provide require and filter only timeout hosts #31

Open wilfriedroset opened 8 years ago

wilfriedroset commented 8 years ago

The require dependency could use an option to filter out timed-out hosts. That way, all services can still run on the hosts that respond.

For example, we could use milkcheck to run several benchmarks (one at a time). With the current behaviour (require), the benchmarking stops on all hosts if at least one host is not OK. With a require on success (rc=0), the benchmarking would stop only on the host that is not OK, and we could still run the remaining benchmarks on the other hosts.
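
To make the difference concrete, here is a minimal sketch in plain Python. It is not milkcheck code: `run_chain`, the `per_host` flag, and the host and benchmark names are all hypothetical, and the model only assumes that benchmarks run one after another and that each host reports success or failure.

```python
def run_chain(hosts, benchmarks, results, per_host):
    """Return {benchmark: hosts it ran on} for a sequential chain.

    results[(benchmark, host)] is True when the host succeeded (rc=0);
    missing entries count as success.
    per_host=False models the current `require` behaviour: one failing
    host stops the chain on every host.
    per_host=True models the requested behaviour: only the failing host
    is dropped and the chain continues on the others.
    """
    active = list(hosts)
    ran = {}
    for bench in benchmarks:
        if not active:
            break
        ran[bench] = list(active)
        ok = [h for h in active if results.get((bench, h), True)]
        if per_host:
            active = ok          # keep going on the healthy hosts only
        elif len(ok) < len(active):
            break                # any failure stops the whole chain
    return ran


hosts = ["node1", "node2", "node3"]
benchmarks = ["bench_a", "bench_b", "bench_c"]
# Hypothetical outcome: node2 fails (or times out) on the first benchmark.
results = {("bench_a", "node2"): False}

print(run_chain(hosts, benchmarks, results, per_host=False))
print(run_chain(hosts, benchmarks, results, per_host=True))
```

In the first call only `bench_a` runs; in the second, `bench_b` and `bench_c` still run on `node1` and `node3`.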

cedeyn commented 3 years ago

Hi Wilfried, can you confirm that using the filter dependency is a good way to resolve this issue?

---
services:
  timedout:
     desc: "check hosts are here"
     target: "vortex[1-10],good[1-10]"
     actions:
        'launch':
           cmd: echo Ready to launch benchmark

  bench:
     desc: "Launch Bench"
     target: "vortex[1-10],good[1-10]"
     filter: timedout
     actions:
         'launch':
           cmd: echo Launching Bench

wilfriedroset commented 3 years ago

It has been a long time since I've played with milkcheck. Going by your sample, and assuming that timedout is a reserved word used to remove timed-out nodes from the previous steps, then yes, this could resolve the issue.

cedeyn commented 3 years ago

No, timedout is simply the first service declared in the configuration. In fact, the previous example should have included a timeout option:

---
services:
  timedout:
     desc: "check hosts are here"
     target: "vortex[1-10],good[1-10]"
     timeout: 30
     actions:
        'launch':
           cmd: echo Ready to launch benchmark

  bench:
     desc: "Launch Bench"
     target: "vortex[1-10],good[1-10]"
     filter: timedout
     actions:
         'launch':
           cmd: echo Launching Bench

The idea is to launch a first service that checks the connection to the targeted nodes, with a timeout. The second service, bench, then actually launches the benchmark; thanks to the filter property, the nodes that failed the first service should no longer be part of its target.
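
The check-then-filter idea can be sketched outside milkcheck in a few lines of Python. This is only an illustration under stated assumptions, not how milkcheck implements `filter`: `check_host`, `filter_targets`, and the stand-in commands are hypothetical, the checks run locally and sequentially, and a host counts as failed when its check exits non-zero or exceeds the timeout.

```python
import subprocess
import sys


def check_host(cmd, timeout):
    """Return True if cmd exits with rc=0 within `timeout` seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return False


def filter_targets(check_cmds, timeout):
    """Keep only the hosts whose check command succeeded in time."""
    return [host for host, cmd in check_cmds.items() if check_host(cmd, timeout)]


# Stand-in check commands: a real setup would run the check on each remote node.
check_cmds = {
    "good1": [sys.executable, "-c", "pass"],                         # answers at once
    "vortex1": [sys.executable, "-c", "import time; time.sleep(30)"],  # hangs
}

bench_targets = filter_targets(check_cmds, timeout=3)
print(bench_targets)  # the benchmark would then run only on these hosts
```

Here `vortex1` is dropped because its check does not finish within the timeout, so only `good1` remains as a benchmark target.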