canonical / juju-lint

Run checks against a juju model
GNU General Public License v3.0

Add "status age" configuration for unexpected status check #212

Closed zxhdaze closed 6 months ago

zxhdaze commented 6 months ago

Juju-lint checks for unexpected unit statuses (e.g. blocked, error). However, it also reports units in the "executing" state. This is helpful when a unit has been stuck in that state for a while, but short periods of "executing" are expected, so reporting them causes false positives.

Juju already provides a timestamp for the transition to the current state, and juju-lint includes it in the error as status_since:

{
  "description": "Checks for unexpected status in juju and workload",
  "id": "status-unexpected",
  "message": "Juju on unit juju-lint/0 has status 'executing' (since: 2021-09-08T07:35:41.749245174Z, message: running action juju-run); (We expected: ['idle'])",
  "status_current": "executing",
  "status_msg": "running action juju-run",
  "status_since": "2021-09-08T07:35:41.749245174Z",
  "tags": [
    "status"
  ],
  "what": "Juju on unit juju-lint/0"
},

It would be helpful to add a "minimum-status-age" parameter with a reasonable default (e.g. 10 minutes) and have juju-lint report an error only if the status has lasted longer than that threshold, to avoid false positives for short-term work. This should probably apply only to the "executing" status and not to error/blocked/down/lost, etc.
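
A minimal sketch of how such a threshold check could look, assuming the `status_current` and `status_since` values shown in the output above. The names `should_report`, `parse_juju_timestamp`, `DEFAULT_MINIMUM_STATUS_AGE` and `AGE_GATED_STATUSES` are hypothetical and not part of juju-lint; the 10-minute default is just the value suggested here.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical default, taken from the "10 minutes" suggestion in this issue.
DEFAULT_MINIMUM_STATUS_AGE = timedelta(minutes=10)
# Statuses gated by age; anything else (error, blocked, ...) is always reported.
AGE_GATED_STATUSES = {"executing"}

# Juju timestamps carry nanoseconds, which datetime cannot store directly.
_TIMESTAMP_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$"
)


def parse_juju_timestamp(value):
    """Parse a UTC timestamp such as 2021-09-08T07:35:41.749245174Z."""
    match = _TIMESTAMP_RE.match(value)
    if match is None:
        raise ValueError("unsupported timestamp: {!r}".format(value))
    base, fraction = match.groups()
    # Truncate the fraction to microseconds (6 digits), padding if shorter.
    micros = int((fraction or "0")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


def should_report(status_current, status_since, now=None,
                  minimum_age=DEFAULT_MINIMUM_STATUS_AGE):
    """Return True if an unexpected status should be reported as an error."""
    if status_current not in AGE_GATED_STATUSES:
        return True
    if now is None:
        now = datetime.now(timezone.utc)
    # Report only once the status has lasted longer than the threshold.
    return now - parse_juju_timestamp(status_since) > minimum_age
```

With the example above, a check run at 07:40 UTC would skip the "executing" unit (about 4 minutes old), while a run at 07:50 UTC would report it. Passing `now` explicitly keeps the function easy to test.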