opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

Multiple monitor IPs for gateway monitoring #7163

Closed deajan closed 3 months ago

deajan commented 9 months ago

Is your feature request related to a problem? Please describe.

This one has bugged me ever since I started using OPNsense, and it probably goes back to the time I used m0n0wall ^^

Monitoring a gateway and switching to an alternative gateway based on a single condition isn't always reliable. Example: I usually change the monitor IP of my gateways to a "far IP", i.e. one outside the ISP's addresses, which makes sure the internet is really reachable beyond just the next hop. For instance: monitor Cloudflare DNS, OpenDNS, a cloud provider IP or any other service that usually responds to pings 24 hours a day.

The problem is that those services can stop responding to ping for a couple of minutes from time to time for various reasons, like maintenance, BGP routing, peering changes... When this happens, OPNsense will mark the gateway as down and execute rules accordingly, even though the internet connection still works and only the monitored IP stopped responding.

Of course I could just monitor the remote gateway IP, but that would not be a good enough guarantee that internet access works. Cases where one can reach the remote gateway but not the internet have happened, do happen and will happen again. This could be even more true for home users where OPNsense sits behind a non-bridged router, but that's outside the scope of this FR.

Describe the solution you'd like

It would be really nice if there could be more than one monitor IP. One could then monitor IP 1 and IP 2, and if either of them responds, the gateway is considered up.
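
To illustrate the "any monitor responds" rule, here is a minimal Python sketch. It is not how OPNsense or dpinger work today; `gateway_is_up` and the `probe` callback are hypothetical names, and the probe results are simulated rather than real pings.

```python
from typing import Callable, Iterable


def gateway_is_up(monitor_ips: Iterable[str], probe: Callable[[str], bool]) -> bool:
    """Return True as soon as any monitor IP answers the probe."""
    return any(probe(ip) for ip in monitor_ips)


# Simulated probe results: the first target is silent, the second answers.
simulated_replies = {"1.1.1.1": False, "208.67.222.222": True}
print(gateway_is_up(simulated_replies, simulated_replies.get))  # prints True
```

Keeping the probe as a callback means the same rule would apply whether the check is an ICMP ping, a TCP connect or a script.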

Ideal solution would be:

Describe alternatives you considered

A set of external scripts that hook into the gateway monitoring could also be used. E.g., instead of monitoring IP 1.1.1.1, we'd execute a script whose result determines whether the gateway is up or not. Since we're talking about scripts, users have full freedom to choose which conditions must be met for a gateway to be considered up. Downside: no nice gateway quality graphs, even though we could normalize the script output to be dpinger compatible to work around that.
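
A rough Python sketch of what such a hook could look like, assuming the convention that exit status 0 means "gateway up". The function name `script_says_up` and that convention are assumptions for illustration, not an existing OPNsense interface.

```python
import subprocess
from typing import Sequence


def script_says_up(command: Sequence[str], timeout: float = 5.0) -> bool:
    """Run a user-supplied check; exit status 0 is read as 'gateway up'."""
    try:
        result = subprocess.run(
            list(command),
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hanging or missing script is treated as a failed check.
        return False
    return result.returncode == 0
```

Treating a timeout as "down" is a design choice: a check that hangs should not keep a dead gateway marked as up.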

Additional context

Another (super dream) wish would be to allow TCP pings, i.e. mark the gateway down if IP:PORT is unreachable, which would make OPNsense suitable for switching gateways when a SaaS application, cloud provider or any other HTTP service isn't reachable via a gateway.

I know that OPNsense uses dpinger, which only does ICMP pings, but it would be so nice to have TCP support added, with tools like nmap for example (or even nping, which is part of nmap).
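
For reference, a TCP "ping" can be approximated with a plain connect attempt. The sketch below uses only the Python standard library rather than nmap or nping; `tcp_ping` and its `source_ip` parameter are hypothetical names. Binding to a source address only selects the local address; whether the traffic actually leaves through a given gateway depends on the routing and firewall rules in place.

```python
import socket
from typing import Optional


def tcp_ping(host: str, port: int, timeout: float = 2.0,
             source_ip: Optional[str] = None) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    # Port 0 lets the OS pick an ephemeral source port.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

A call such as `tcp_ping("192.0.2.10", 443)` (a documentation address, used here only as a placeholder) would report whether that service accepts connections within two seconds.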

rmundel commented 4 months ago

Something like what FortiGate does for SD-WAN monitoring would be nice.

They call it Performance SLA (two screenshots were attached here).

I think dpinger limits what could be done in this regard, but the ability to use multiple dpinger instances with different parameters for each gateway would improve things a lot.

The issue we have with the current approach is detecting gateway failures when the ISP throttles the connection due to payment issues. Usually they keep allowing ICMP and domain resolution and block all TCP/UDP traffic outside their own network.

When this happens to the secondary link, which is usually kept on Tier 2, we only find out when it's needed. By then it's too late.

Maybe use monit with some scripts to achieve this? Maybe it's already possible even?

Wireheadbe commented 4 months ago

Multiple IPs to monitor on a single gateway would be a welcome feature indeed.. 👍🏻

djatwork commented 4 months ago

+1

deajan commented 3 months ago

AFAIK, dpinger only has ICMP support. Prior to dpinger, apinger was used, but it seemed buggy at the time, hence the replacement.

It seems that dpinger has multiple target support, see this, although I didn't see anything about it in the release notes. There have been only a few bugfix releases in the last 7 years, so probably no new features. apinger's page is even worse: the project was put in archive mode 9 years ago.

So there's plenty of room for improvement for gateway monitoring in OPNsense. I've made an FR here for dpinger.

OPNsense-bot commented 3 months ago

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.

deajan commented 3 months ago

Wow !