OneUptime / oneuptime

OneUptime is the complete open-source observability platform.
https://oneuptime.com
Apache License 2.0
4.82k stars, 225 forks

Probes and Uptime #1639

Open AndersonOuverney opened 3 months ago

AndersonOuverney commented 3 months ago

Is your feature request related to a problem? Describe it.

Yes. We are testing uptime monitoring of an IP address. The monitor has 6 probes, so we collect pings from 6 different locations at the same time. One of these locations had a problem, which triggered downtime alerts and lowered our reported uptime to 99.999%, even though the monitored object had no downtime.

Describe the solution you would like

Uptime monitoring systems that support multiple probes usually require more than one location to report a failure before treating the monitored object as down. This distinguishes a real outage from a problem on a single probe's route.

Describe the alternatives you considered

In my opinion, we could define conditions on the probes that determine whether the status is online or offline. For example, I have 6 probes: 3 on the same continent and 3 on other continents. The 3 on other continents may see more packet loss, since their internet routes are less reliable than those of the 3 on the same continent. We should have the option of combining probes to determine downtime. For example, if probes 1, 2, 3, and 4 are all having problems, that can be considered downtime. The choice of probes 1, 2, 3, and 4 should be made directly in the monitor settings.
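To make the idea concrete, here is a minimal sketch of such a rule. It is not OneUptime code: `DowntimeRule`, `min_failing`, `required_probes`, `is_down`, and the probe names are all hypothetical, chosen only to illustrate combining a failure quorum with an explicit set of probes picked per monitor.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DowntimeRule:
    """Hypothetical per-monitor rule; not an existing OneUptime setting."""
    # Quorum: at least this many probes must be failing.
    min_failing: int = 2
    # Probes that must ALL be failing; an empty set means any probes count.
    required_probes: frozenset = frozenset()


def is_down(probe_ok: Mapping[str, bool], rule: DowntimeRule) -> bool:
    """Return True only if the failing probes satisfy the rule."""
    failing = {name for name, ok in probe_ok.items() if not ok}
    # If specific probes were selected, every one of them must be failing.
    if rule.required_probes and not rule.required_probes <= failing:
        return False
    return len(failing) >= rule.min_failing


# One probe out of six fails: a quorum of 2 does not count this as downtime.
single_failure = {f"probe-{i}": True for i in range(1, 7)}
single_failure["probe-6"] = False
one_probe_down = is_down(single_failure, DowntimeRule(min_failing=2))

# Probes 1-4 fail and the monitor was configured with exactly those probes.
four_failures = {f"probe-{i}": i > 4 for i in range(1, 7)}
selected = frozenset({"probe-1", "probe-2", "probe-3", "probe-4"})
four_probes_down = is_down(
    four_failures, DowntimeRule(min_failing=4, required_probes=selected)
)
```

With a rule like this, a single unreliable location would not open an incident or reduce the uptime figure, while a failure seen by the selected probes still would.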

Additional context

I believe this condition is one of the most critical for ping and IP monitors. It would let us define correct monitoring rules.

AndersonOuverney commented 5 days ago

I would like to offer a clearer example of what we are asking for. There is a tool called HetrixTools.

We still use it to monitor a few essential items, and it supports the probe combinations described here. I believe it is worth a close look.

We are currently not using the full potential of OneUptime precisely because we consider the uptime calculation incorrect.

simlarsen commented 5 days ago

This is on the roadmap. Will keep you posted.