fact-project / shifthelper

So we can sleep at night.

webinterface offline, problem for SH? #249

Closed dneise closed 6 years ago

dneise commented 7 years ago

An exception is raised when the webinterface is down.

9/28/2017 9:28:16 AM2017-09-28 07:28:16,283 - ERROR - custos.checks.FactIntervalCheck - wrapped_check - Exception while running check
9/28/2017 9:28:16 AMTraceback (most recent call last):
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/custos/checks/__init__.py", line 82, in wrapped_check
9/28/2017 9:28:16 AM    self.check(*args, **kwargs)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/checks.py", line 39, in check
9/28/2017 9:28:16 AM    self.message_from_docs(self.checklist)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/checks.py", line 44, in message_from_docs
9/28/2017 9:28:16 AM    level=message_level(self.name),
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/checks.py", line 129, in message_level
9/28/2017 9:28:16 AM    result_if_no_alerts=False,
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/checks.py", line 160, in all_recent_alerts_acknowledged
9/28/2017 9:28:16 AM    alerts = get_alerts()
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
9/28/2017 9:28:16 AM    return Retrying(*dargs, **dkw).call(f, *args, **kw)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 212, in call
9/28/2017 9:28:16 AM    raise attempt.get()
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 247, in get
9/28/2017 9:28:16 AM    six.reraise(self.value[0], self.value[1], self.value[2])
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/six.py", line 686, in reraise
9/28/2017 9:28:16 AM    raise value
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 200, in call
9/28/2017 9:28:16 AM    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/tools/__init__.py", line 35, in get_alerts
9/28/2017 9:28:16 AM    return alerts.json()
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/site-packages/requests/models.py", line 885, in json
9/28/2017 9:28:16 AM    return complexjson.loads(self.text, **kwargs)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/json/__init__.py", line 354, in loads
9/28/2017 9:28:16 AM    return _default_decoder.decode(s)
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/json/decoder.py", line 339, in decode
9/28/2017 9:28:16 AM    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
9/28/2017 9:28:16 AM  File "/opt/conda/lib/python3.6/json/decoder.py", line 357, in raw_decode
9/28/2017 9:28:16 AM    raise JSONDecodeError("Expecting value", s, err.value) from None
9/28/2017 9:28:16 AMjson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I was testing the HBM; for this I took the webinterface offline. This should have resulted in two calls: one from the HBM, since it cannot see the SH (this one happened), and one from the SH, since it cannot see the HBM (this one did not happen).

I think the second call was prevented by the exception above.


It was a design goal to make the SH independent of the webinterface. Apparently we did not reach it.
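For illustration, here is a minimal sketch of why the decode fails and how an alert fetch could report "webinterface unavailable" as its own error instead of a raw JSONDecodeError. The names `AlertsUnavailable` and `parse_alerts` are hypothetical and not part of shifthelper; the sketch only assumes that an offline webinterface answers with a non-JSON body, such as the 503 error page.

```python
import json


class AlertsUnavailable(Exception):
    """Hypothetical error: the webinterface could not provide alerts."""


def parse_alerts(status_code, body):
    """Decode the alert list, or raise AlertsUnavailable if that is impossible.

    status_code and body stand in for the status and text of the HTTP response.
    """
    if status_code != 200:
        raise AlertsUnavailable(f"webinterface answered with HTTP {status_code}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        raise AlertsUnavailable(f"response body is not JSON: {err}") from err


# An HTML error page is not JSON, so decoding fails at the very first
# character. This reproduces the message seen in the traceback.
try:
    json.loads("<html>Service Unavailable</html>")
except json.JSONDecodeError as err:
    decode_error = str(err)
```

With a dedicated exception type, callers can decide explicitly what "no alert information" should mean for them, rather than failing on a decoding detail.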

dneise commented 7 years ago

I tried it again, and again I was not called. However, I think I did not copy the correct exception before. The exception above comes from a check, and that is fine: exceptions in checks are caught and lead to developer calls.

But the exception below comes from a Notifier.

9/28/2017 9:44:50 AM2017-09-28 07:44:50,136 - ERROR - custos.Custos - run - FactTwilioNotifier failed to handle message
9/28/2017 9:44:50 AMTraceback (most recent call last):
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/custos/__init__.py", line 64, in run
9/28/2017 9:44:50 AM    notifier.handle_message(message)
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/notifiers.py", line 107, in handle_message
9/28/2017 9:44:50 AM    self._remove_acknowledged_and_old_calls()
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/notifiers.py", line 48, in _remove_acknowledged_and_old_calls
9/28/2017 9:44:50 AM    alerts = {a['uuid']: a for a in get_alerts()}
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
9/28/2017 9:44:50 AM    return Retrying(*dargs, **dkw).call(f, *args, **kw)
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 212, in call
9/28/2017 9:44:50 AM    raise attempt.get()
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 247, in get
9/28/2017 9:44:50 AM    six.reraise(self.value[0], self.value[1], self.value[2])
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/six.py", line 686, in reraise
9/28/2017 9:44:50 AM    raise value
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 200, in call
9/28/2017 9:44:50 AM    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/shifthelper/tools/__init__.py", line 35, in get_alerts
9/28/2017 9:44:50 AM    return alerts.json()
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/requests/models.py", line 885, in json
9/28/2017 9:44:50 AM    return complexjson.loads(self.text, **kwargs)
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/json/__init__.py", line 354, in loads
9/28/2017 9:44:50 AM    return _default_decoder.decode(s)
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/json/decoder.py", line 339, in decode
9/28/2017 9:44:50 AM    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/json/decoder.py", line 357, in raw_decode
9/28/2017 9:44:50 AM    raise JSONDecodeError("Expecting value", s, err.value) from None
9/28/2017 9:44:50 AMjson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
9/28/2017 9:44:50 AM2017-09-28 07:44:50,201 - ERROR - custos.notify.http - notify - Could not post message
9/28/2017 9:44:50 AMTraceback (most recent call last):
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/custos/notify/http.py", line 43, in notify
9/28/2017 9:44:50 AM    ret.raise_for_status()
9/28/2017 9:44:50 AM  File "/opt/conda/lib/python3.6/site-packages/requests/models.py", line 928, in raise_for_status
9/28/2017 9:44:50 AM    raise HTTPError(http_error_msg, response=self)
9/28/2017 9:44:50 AMrequests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: https://shifthelper.app.tu-dortmund.de/alerts
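The traceback above shows the notifier failing inside its clean-up step, before it gets to place a call. A minimal sketch of one way to keep the notifier working when the alert lookup fails is given below. It is not the actual fix: `CallList`, `fetch_alerts`, `placed_calls`, `notified` and the `acknowledged` field are assumptions for illustration, and only the `uuid` key is taken from the traceback.

```python
import logging

log = logging.getLogger("notifier_sketch")


class CallList:
    """Hypothetical stand-in for the notifier's bookkeeping of placed calls."""

    def __init__(self, fetch_alerts):
        self.fetch_alerts = fetch_alerts  # callable returning a list of alert dicts
        self.placed_calls = {}            # uuid -> message that triggered the call
        self.notified = []                # stands in for phone calls actually placed

    def _remove_acknowledged_calls(self):
        try:
            alerts = {a["uuid"]: a for a in self.fetch_alerts()}
        except Exception:
            # Deliberately broad: whatever goes wrong while asking the
            # webinterface, we keep all calls and carry on notifying.
            log.exception("Could not fetch alerts, keeping all placed calls")
            return
        for uuid in list(self.placed_calls):
            # "acknowledged" is an assumed field name.
            if alerts.get(uuid, {}).get("acknowledged", False):
                del self.placed_calls[uuid]

    def handle_message(self, message):
        self._remove_acknowledged_calls()
        self.placed_calls[message["uuid"]] = message
        self.notified.append(message["uuid"])


def broken_fetch():
    # Simulates the webinterface being offline.
    raise ValueError("Expecting value: line 1 column 1 (char 0)")


calls = CallList(broken_fetch)
calls.handle_message({"uuid": "1234", "text": "HBM not reachable"})
```

In this sketch the message is still handled although the lookup raised. The trade-off is that acknowledgements cannot be seen while the webinterface is down, so calls are kept rather than dropped.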

dneise commented 7 years ago

Should be fixed by 0e85612 (which is currently running for testing in Dortmund).