mtakaki / cachet-url-monitor

URL monitor plugin for cachethq.io
MIT License
124 stars 48 forks source link

Reset Status After Incident Resolved #122

Open joecan00 opened 3 years ago

joecan00 commented 3 years ago

How does one configure the yml file to have a HTTP Status trigger an incident but have the status revert back to Operational if the issue resolves itself.

Currently the way I have it setup if the docker container in question is shut down it will log an incident and move to Partial Outage. Restarting the docker resets the status to operational within the Incident report but service itself still shows Partial Outage.

For self-resolving situations can the status reset itself?

mtakaki commented 3 years ago

It does sound like you've encountered a bug here. It should be self-resolving when the URL is reachable again. If it's setting it to partial, then can you try increasing the response time threshold?

InFerYes commented 3 years ago

I keep getting this warning in the logs:

WARNING [2021-06-23 15:09:19,797] cachet_url_monitor.configuration.Configuration.Moodle - Incident update failed with status [404], message: ""

The response time threshold is set to 5.

  - name: Moodle
    url: https://moodle.cvoantwerpen.org
    method: GET
    timeout: 5
    expectation:
      - type: HTTP_STATUS
        status_range: 200-300
        incident: MAJOR
      - type: LATENCY
        threshold: 5
    allowed_fails: 5
    frequency: 30
    component_id: 5
    action:
      - CREATE_INCIDENT
      - UPDATE_STATUS
    public_incidents: true
    latency_unit: ms
mtakaki commented 3 years ago

@InFerYes foryour case it looks like the component_id may be incorrect, otherwise you wouldn't get a 404. Unless the cachet url is incorrect.