The current Stats Pusher `stat_values.py` calls `sys.exit()` if one of the HTTP calls fails. This kills the whole script, so none of the subsequent metrics are collected, which in turn fires many false alerts.
The loop that goes through the checks should instead continue to the next check when one fails. There is already a catch-all `Exception` handler at the loop level, so the best plan seems to be to `raise` the exception up the chain rather than handling it locally.
Code that currently calls `sys.exit()`: https://github.com/ukwa/ukwa-monitor/blob/cc9f9b0d26e0fad0b0202f9a488f6b1d0c698e40/stat-pusher/script/stat_values.py#L28-L37
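
A minimal sketch of the proposed behaviour, not the actual `stat_values.py` code: the names `fetch_value`, `run_checks`, the `http_get` callable and the shape of the `checks` list are all hypothetical, chosen only to illustrate raising from the per-check function and letting the loop-level handler move on to the next check.

```python
import logging

logger = logging.getLogger(__name__)


def fetch_value(url, http_get):
    """Hypothetical per-check fetch: raise on failure instead of calling sys.exit()."""
    # http_get is an assumed callable returning (status_code, body).
    status, body = http_get(url)
    if status != 200:
        # Raising lets the caller decide what to do; sys.exit() would end the whole run.
        raise RuntimeError(f"GET {url} returned HTTP {status}")
    return body


def run_checks(checks, http_get):
    """Hypothetical check loop: a failing check is logged and skipped."""
    results = {}
    failed = []
    for name, url in checks:
        try:
            results[name] = fetch_value(url, http_get)
        except Exception:
            # Catch-all at the loop level: record the failure, carry on with the rest.
            logger.exception("Check %s failed, continuing with next check", name)
            failed.append(name)
    return results, failed
```

With this shape, one failed HTTP call costs only that check's metric for the run; the remaining checks are still collected and pushed.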