oar-team / cigri

http://cigri.imag.fr

The user should be able to configure automatic fixes #9

Open bzizou opened 9 years ago

bzizou commented 9 years ago

For example, in the JDL, we could have:

"action_on": { "timeout": "ignore|resubmit|blacklist", "walltime": "ignore|resubmit|blacklist" },

where "ignore" fixes the event, "resubmit" fixes the event and resubmits the job, and "blacklist" disables the cluster until it is fixed manually.
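To make the proposal concrete, here is a minimal Python sketch of how such an "action_on" section might be read and validated. It is only an illustration: the names ALLOWED_ACTIONS, DEFAULT_ACTION and action_for are hypothetical and not part of CiGri, and falling back to "blacklist" when nothing is configured is an assumption.

```python
import json

# Hypothetical names: ALLOWED_ACTIONS, DEFAULT_ACTION and action_for are
# illustrative only and are not part of CiGri.
ALLOWED_ACTIONS = {"ignore", "resubmit", "blacklist"}
DEFAULT_ACTION = "blacklist"  # assumption: with no setting, wait for a manual fix


def action_for(jdl_text, event_type):
    """Return the action configured for event_type, or the assumed default."""
    jdl = json.loads(jdl_text)
    action = jdl.get("action_on", {}).get(event_type, DEFAULT_ACTION)
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action {action!r} for event {event_type!r}")
    return action


# A made-up JDL fragment using the proposed key.
jdl_text = '{"name": "demo", "action_on": {"timeout": "resubmit", "walltime": "ignore"}}'

print(action_for(jdl_text, "timeout"))   # resubmit
print(action_for(jdl_text, "walltime"))  # ignore
print(action_for(jdl_text, "other"))     # blacklist (the assumed default)
```

Rejecting unknown values when the JDL is read would let a typo be reported at submission time rather than when the first event occurs.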

bzizou commented 9 years ago

For RUNNER_SUBMIT_TIMEOUT, we could also have a "retrieve" option: try to recover the submitted jobs by searching for jobs that carry the campaign's name and were submitted within the time interval, or by tagging jobs (how?).
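The search half of the "retrieve" idea could look roughly like the Python sketch below. It filters an in-memory list of job records by name and submission time. The Job record and retrieve_jobs function are hypothetical, and how the job list would actually be obtained from the cluster is left out. Tagging is not covered.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    """Hypothetical record of a job as reported by the cluster."""
    job_id: int
    name: str
    submitted_at: datetime


def retrieve_jobs(jobs, campaign_name, window_start, window_end):
    """Return jobs named after the campaign and submitted inside the window."""
    return [
        job for job in jobs
        if job.name == campaign_name
        and window_start <= job.submitted_at <= window_end
    ]


# Made-up data: the submission call started at t0 and timed out 60 seconds later.
t0 = datetime(2015, 1, 1, 12, 0, 0)
jobs = [
    Job(1, "campaign_42", t0 - timedelta(minutes=5)),   # too early
    Job(2, "campaign_42", t0 + timedelta(seconds=10)),  # matches
    Job(3, "other", t0 + timedelta(seconds=20)),        # wrong name
]

found = retrieve_jobs(jobs, "campaign_42", t0, t0 + timedelta(seconds=60))
print([job.job_id for job in found])  # [2]
```

A name-plus-interval match can be ambiguous if the same campaign submits several batches close together, which is one argument for the tagging alternative.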