Open thielj opened 5 months ago
Where you are entirely correct is that the way we currently communicate this, and how our retry logic for this monitor works, is super weird.
[!NOTE] As context, PENDING means that a monitor either
- has not had a push,
- has failed in the past and is currently retrying, or
- is in some other transitional step between UP and DOWN (such as docker containers starting up).
=> I don't know how setting a monitor to PENDING SHOULD behave. Our accounting around this is a bit messy and the behaviour is entirely undocumented. Likely this should skip one retry but behave as if DOWN for other purposes, but I am unsure.
=> That setting a monitor to DOWN does not trigger the correct retries is definitely a bug.
[!TIP] If you want to use the retry logic in the current system, you should instead not send a push. This lets the push-monitor time out and go into the PENDING state independently.
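The tip above can be sketched from the client side as follows. This is a minimal illustration, not part of Uptime Kuma itself: the host `kuma.example.com`, the token `abc123`, and the helper names `build_push_url` and `report` are placeholders, and the `/api/push/<token>?status=...&msg=...` layout is assumed from the push URL that a push monitor displays. The key point is that a failed check sends nothing at all, so the monitor times out on its own.

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen


def build_push_url(base_url, token, status="up", msg="OK"):
    """Build a push URL (assumed layout: <base>/api/push/<token>?status=..&msg=..)."""
    query = urlencode({"status": status, "msg": msg}, quote_via=quote)
    return f"{base_url.rstrip('/')}/api/push/{token}?{query}"


def report(check_ok, base_url, token, opener=urlopen):
    """Send an UP push only when the local check passed.

    On failure nothing is sent, so the push monitor times out by itself
    and enters PENDING, which is the path that triggers the retry logic.
    Returns True if a push was sent, False otherwise.
    """
    if not check_ok:
        return False  # deliberately silent: let the monitor time out
    with opener(build_push_url(base_url, token), timeout=10) as response:
        response.read()
    return True
```

A cron job or wrapper script would call something like `report(run_check(), "https://kuma.example.com", "abc123")`, where `run_check` is whatever local health check applies (also hypothetical here).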
@CommanderStorm As I mentioned already, and others have mentioned before, letting something go into the pending state isn't really a solution if you have e.g. a job running once a day or are transitioning through a unit or container restart.
PENDING for me means, in the context of push notifications at least, that something isn't fully up or completed yet, but is due shortly and long before the regular period expires. Most importantly, if it doesn't come fully UP within the retry period, I want it to be considered DOWN and notified immediately.
Everything else either delays notifications unnecessarily or creates too many false positives.
It's not that different from something actively monitored by U-K, except that retries and retry periods just pass without actively retrying.
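To make the requested semantics concrete, here is a small model of them. It is a sketch of the behaviour described in this comment, not of how Uptime Kuma currently works; the function name `evaluate` and its parameters are made up for illustration, and the retry period is assumed to be `retries * retry_interval`.

```python
from datetime import datetime, timedelta


def evaluate(pending_at, up_at, now, retries, retry_interval):
    """Return the status at `now` under the proposed PENDING semantics.

    pending_at:     when the client pushed status=pending
    up_at:          when an UP push arrived afterwards, or None
    retries:        number of retries configured on the monitor
    retry_interval: timedelta between retries
    """
    # Retries are not actively performed; the retry periods simply elapse.
    deadline = pending_at + retries * retry_interval
    if up_at is not None and pending_at <= up_at <= now:
        return "UP"
    if now <= deadline:
        return "PENDING"
    # No UP within the retry period: DOWN, notify immediately.
    return "DOWN"
```

With three retries of 20 minutes each, a job that pushed pending at 02:00 would stay PENDING until 03:00 and be treated as DOWN right after, instead of only after the regular (for example daily) interval has expired.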
"letting something go into the pending state isn't really a solution"
You are explicitly setting it to PENDING, so how can you not want the pending state??
I think something was lost in translation here. Frank is confused ^^
"It's not that different from something actively monitored by U-K, except that retries and retry periods just pass without actively retrying"
I am going to repeat myself as I am 5% unsure if my last communication was clear (no offense intended, just trying not to miscommunicate ^^)
[!TIP] If you want to use the retry logic in the current system, you should NOT send a push in the interval. This lets the push-monitor time out and go into the PENDING state. The retry logic is triggered via this path.
📑 I have found these related issues/pull requests
🏷️ Feature Request Type
API / automation options, Change to existing monitor
🔖 Feature description
Sending status=pending&msg=backup%20started would immediately set the monitor to 'retry mode' without waiting for the usual period to expire.
✔️ Solution
There have been several mentions in the past.
Another example is monitoring processes that occasionally exit or need to be deliberately stopped but are expected to be restarted and become available within the retry period (think systemd unit).
Or a daily job where I generally expect UP once a day. When the job is being started I send PENDING, after which the monitor would go into retry mode and expect the UP to arrive within say 1h instead of waiting a full day before raising a notification (think remote job dying or stalling or blocking somehow).
❓ Alternatives
For the above-mentioned examples, I couldn't find alternatives short of implementing my own "pending logic" somehow. None of these would add a suitable event record to Uptime Kuma either.
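For illustration, a client-side "pending logic" of the kind mentioned above might look roughly like this. It is only a sketch under assumptions: the helper name `pending_logic` and the message texts are invented, and the returned string is meant to be appended as the query of the monitor's push URL. As noted, a workaround like this would not give Uptime Kuma a PENDING event record of its own.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode, quote


def pending_logic(started_at, finished, now, grace):
    """Decide which push query to send for a job watched by a local watchdog.

    Returns a query string for the push URL, or None when nothing
    should be sent yet (the job is still within its grace period).
    """
    if finished:
        return urlencode({"status": "up", "msg": "job finished"}, quote_via=quote)
    if now - started_at > grace:
        # The job overran its grace period: report DOWN explicitly
        # instead of waiting for the full daily interval to expire.
        return urlencode({"status": "down", "msg": "job overdue"}, quote_via=quote)
    return None
```

A watchdog would call this periodically after the job starts, with `grace` set to something like `timedelta(hours=1)`, and send the resulting query whenever it is not `None`.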
📝 Additional Context
No response