OneUptime / oneuptime

OneUptime is the complete open-source observability platform.
https://oneuptime.com
Apache License 2.0

Offline criteria did not create incident when web is unavailable #703

Open JanRajnoha opened 1 year ago

JanRajnoha commented 1 year ago

Describe the bug: Every time my website is unavailable, the monitor correctly recognizes that something is wrong, but no incident is created. The option to resolve automatically is unchecked.

To Reproduce: Steps to reproduce the behavior:

  1. Set the criteria to create an incident when the website is unavailable
  2. Make the website unavailable (do not trigger the criteria manually)
  3. No incident is created

Expected behavior: An incident is created and is not resolved automatically when the website comes back online.

Screenshots image


simlarsen commented 1 year ago

@JanRajnoha do you already have an active incident for that monitor? If so, we don't create another incident.

JanRajnoha commented 1 year ago

No, there is no incident at all.

simlarsen commented 1 year ago

@JanRajnoha is this on a self-hosted instance or SaaS? Can I have the monitor ID?

JanRajnoha commented 1 year ago

It is SaaS and the monitor ID is 4a7a3a97-ffc4-476c-9605-0ce276ce81c8

simlarsen commented 1 year ago

@JanRajnoha Thank you for this. Investigating. Will get back to you by EOD today.

simlarsen commented 1 year ago

@JanRajnoha You need to add Response Status Code as a part of your criteria. You only have Is Online, which will be true even if the server returns 500. As long as the server responds to a request and does not time out, Is Online will be true.
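To illustrate the distinction described above, here is a minimal sketch of the two checks. The `ProbeResult` type and both function names are hypothetical and only model the behavior as explained in this comment; they are not OneUptime's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    # Hypothetical shape of one HTTP check, not OneUptime's data model.
    # None means the server never responded (timeout or connection failure).
    status_code: Optional[int]


def is_online(result: ProbeResult) -> bool:
    # "Is Online" as described above: any response at all counts as online.
    return result.status_code is not None


def is_healthy(result: ProbeResult) -> bool:
    # Stricter check: the server responded AND returned 200.
    return is_online(result) and result.status_code == 200


server_error = ProbeResult(status_code=500)
timed_out = ProbeResult(status_code=None)

print(is_online(server_error), is_healthy(server_error))
print(is_online(timed_out), is_healthy(timed_out))
```

Under these assumptions a server that answers with 500 still counts as online, which is why a criteria set containing only Is Online would not fire for it.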

JanRajnoha commented 1 year ago

I added a check for status code != 200 and switched the filters to Any, but nothing changed. The monitor is still switching to offline (as before), but the incident is still not created.

So I changed the status from Offline to Degraded, to be sure that the criteria is met, and the monitor switched to Degraded status -> everything is okay up to the point of the incident -> none has been created.

Both parts say "when this criteria is met". So if one is working, the other has to work too. image
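For reference, the difference between combining filters with Any and with All can be sketched as follows. This is only an illustration of the general idea; `criteria_met`, the two filter functions, and the dictionary keys are hypothetical names and do not come from OneUptime's code.

```python
from typing import Callable, Dict, List, Optional

# A probe result modeled as a plain dict (hypothetical keys).
Result = Dict[str, Optional[object]]
Check = Callable[[Result], bool]


def is_offline(result: Result) -> bool:
    # True when the server did not respond at all.
    return not result["responded"]


def status_not_200(result: Result) -> bool:
    # True when the status code is anything other than 200 (including missing).
    return result["status_code"] != 200


def criteria_met(checks: List[Check], result: Result, mode: str) -> bool:
    # "any": at least one filter matches; "all": every filter must match.
    outcomes = [check(result) for check in checks]
    if mode == "any":
        return any(outcomes)
    if mode == "all":
        return all(outcomes)
    raise ValueError(f"unknown mode: {mode}")


checks = [is_offline, status_not_200]
server_error = {"responded": True, "status_code": 500}

print(criteria_met(checks, server_error, "any"))
print(criteria_met(checks, server_error, "all"))
```

With a server that responds with 500, only the status code filter matches, so in this sketch the criteria is met under Any but not under All.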

JanRajnoha commented 1 year ago

Could you please look at it? @simlarsen

simlarsen commented 1 year ago

@JanRajnoha looking into it.

simlarsen commented 1 year ago

I don't think you added the response code filters. These are the only ones that show up:

image

JanRajnoha commented 1 year ago

I'm sorry, but you did not read my previous message.

I added a check for status code != 200 and switched the filters to Any, but nothing changed. The monitor is still switching to offline (as before), but the incident is still not created.

Or do you mean that the monitor (in its current state) will trigger the offline status but not create an incident? The monitor is working as it should, but incidents are not created. I can see in my monitor that the service is down, as I should, but nothing else happens.

I'm currently simulating that the service is down. The results will probably be visible in the monitor in about an hour, and you can check AGAIN that nothing happens.

JanRajnoha commented 1 year ago

Current state: image image

simlarsen commented 1 year ago

@JanRajnoha Can you please also share a screenshot of the criteria page?

JanRajnoha commented 1 year ago

image

JanRajnoha commented 1 year ago

It works for Offline too (the status changes), but the incident is not created.

JanRajnoha commented 1 year ago

I tried different criteria: image. Same result -> the status changed correctly, but the incident has not been created.