[Bug] check_gitlab_scheduler failing if pipeline did not run

szEvEz commented 2 months ago

Description

Currently, when a CI schedule is monitored with check_gitlab_scheduler.py and its pipeline has not run yet, the monitoring check fails with the following error:

  File "/usr/lib/nagios/plugins/check_gitlab_scheduler.py", line 197, in <module>
    main()
  File "/usr/lib/nagios/plugins/check_gitlab_scheduler.py", line 186, in main
    check_gitlab_scheduler(
  File "/usr/lib/nagios/plugins/check_gitlab_scheduler.py", line 66, in check_gitlab_scheduler
    status = last_pipeline["status"]
TypeError: 'NoneType' object is not subscriptable

Reproduction steps

Create a GitLab CI schedule
Add the schedule to Icinga by its ID
See that the check starts failing

Current Behavior

Described above

Expected Behavior

Check doesn't fail if the pipeline didn't run yet

Additional information

No response

schurzi commented 2 months ago

Check doesn't fail if the pipeline didn't run yet

I'd say we should not emit a Python error, but some error state seems reasonable, since there is no OK state present. What would you prefer? When the pipeline did not run, we could emit UNKNOWN to signal that there is no state to show, or we could emit CRITICAL because there was no successful run.
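
To make the two options concrete, here is a minimal sketch of a guard around the failing lookup. It assumes `last_pipeline` is either a dict with a "status" key or None when the schedule has never run; the function name `evaluate_last_pipeline` and the status-to-state mapping are illustrative, not taken from the actual script.

```python
# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_last_pipeline(last_pipeline, no_run_state=UNKNOWN):
    """Return (exit_code, message) for a schedule's last pipeline.

    Hypothetical helper: `last_pipeline` is assumed to be a dict with a
    "status" key, or None if the schedule has not produced a pipeline yet.
    """
    if last_pipeline is None:
        # No pipeline yet: report a plugin state instead of raising TypeError.
        return no_run_state, "schedule has not produced a pipeline yet"

    status = last_pipeline["status"]
    if status == "success":
        return OK, f"last pipeline status: {status}"
    # Assumption: every non-success status is treated as CRITICAL here.
    return CRITICAL, f"last pipeline status: {status}"
```

Passing `no_run_state=CRITICAL` would give the stricter behaviour without touching the rest of the logic.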

rndmh3ro commented 2 months ago

IMO the script normally checks for errors in the pipeline run. If the pipeline did not run, there's no error. So UNKNOWN sounds good to me.

The script could also check whether the pipeline has not run at all for some time or is in a pending state. Here I'd say a CRITICAL state could be warranted.
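
A rough sketch of such a staleness check, assuming the caller can supply the time of the last run as a timezone-aware datetime (or None if there was none) together with the pipeline status. The names `check_staleness`, `last_run_at` and `max_age` are hypothetical and not part of the existing script.

```python
from datetime import datetime, timedelta, timezone

OK, CRITICAL = 0, 2


def check_staleness(last_run_at, max_age, status=None, now=None):
    """Return (exit_code, message) based on how long ago the pipeline ran.

    Hypothetical helper: `last_run_at` is a timezone-aware datetime or None,
    `max_age` is a timedelta, `status` is the pipeline status string if known.
    """
    now = now or datetime.now(timezone.utc)

    if last_run_at is None:
        return CRITICAL, "pipeline has never run"
    if status == "pending":
        return CRITICAL, "pipeline is in pending state"

    age = now - last_run_at
    if age > max_age:
        return CRITICAL, f"last pipeline run is {age} old (limit {max_age})"
    return OK, f"last pipeline run is {age} old"
```

For example, `check_staleness(last_run_at, timedelta(days=1))` would turn CRITICAL once a daily schedule has missed a run.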

We could also make it configurable.
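
One way this could be made configurable is a command-line option that selects the state reported when no pipeline exists. The sketch below uses argparse; the flag name `--no-run-state` and the helper names are assumptions for illustration, not existing options of check_gitlab_scheduler.py.

```python
import argparse

# Standard Nagios/Icinga plugin exit codes, keyed by state name.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="GitLab schedule check (sketch)")
    # Hypothetical flag: which state to report if the schedule never ran.
    parser.add_argument(
        "--no-run-state",
        choices=sorted(STATES),
        default="UNKNOWN",
        help="state to report when the schedule has no pipeline yet",
    )
    return parser.parse_args(argv)


def state_for_missing_pipeline(args):
    """Return (exit_code, message) for a schedule without any pipeline."""
    name = args.no_run_state
    return STATES[name], f"{name} - schedule has not produced a pipeline yet"
```

With UNKNOWN as the default, existing setups would get the behaviour suggested above, while stricter environments could pass `--no-run-state CRITICAL`.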

telekom-mms / monitoring-checks