opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

Allow WAN Interface to be enabled/disabled based on CARP Status #7333

Closed shehzman closed 2 weeks ago

shehzman commented 6 months ago

Is your feature request related to a problem? Please describe.

Currently, if a user wants to set up an active/passive (high availability) router pair on the WAN interface, they need 3 static IP addresses on the WAN side to set up CARP. Due to the shortage of IPv4 addresses, and with many ISPs not yet distributing IPv6 addresses, it may be very expensive or outright impossible for some users to obtain 3 static WAN IP addresses. This is especially true in residential environments, such as a homelab, where ISPs typically distribute only a single dynamic IP address per line.

Describe the solution you like

Add an option to enable/disable the WAN interface based on the CARP status of a selected interface, very similar to the Depend On (CARP) option in WireGuard instances and this custom script. This way, users can have a single and/or dynamic IP address and still achieve a high availability OPNsense setup. On the interfaces/WAN page, maybe add a dropdown that allows the user to select which CARP interface the WAN will depend on. If the CARP status is master, the WAN is enabled. If it is backup, the WAN is disabled. Also, if applicable, provide a way for the router currently in backup mode to access the internet, maybe similar to how the custom script accomplishes this.
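As a rough illustration of the dependency described above, the sketch below reads the CARP state of one VHID from `ifconfig` text and derives whether the WAN should be up. The line format `carp: MASTER vhid 1 advbase 1 advskew 0` is assumed to match what FreeBSD `ifconfig` prints for a CARP address; the function names and the sample output are hypothetical and are not taken from OPNsense or from the linked scripts.

```python
import re

# Assumed line format (FreeBSD ifconfig, interface carrying a CARP VIP):
#     carp: MASTER vhid 1 advbase 1 advskew 0
CARP_LINE = re.compile(
    r"^[ \t]*carp:[ \t]+(?P<state>[A-Z]+)[ \t]+vhid[ \t]+(?P<vhid>\d+)",
    re.MULTILINE,
)


def carp_states(ifconfig_output: str) -> dict:
    """Map every VHID found in the ifconfig text to its CARP state string."""
    return {int(m["vhid"]): m["state"] for m in CARP_LINE.finditer(ifconfig_output)}


def wan_should_be_up(ifconfig_output: str, vhid: int) -> bool:
    """Hypothetical rule from the request: WAN is up only while the chosen VHID is MASTER."""
    return carp_states(ifconfig_output).get(vhid) == "MASTER"


# Made-up sample output for a LAN interface that currently holds the BACKUP role.
sample = (
    "igb1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500\n"
    "\tinet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255\n"
    "\tcarp: BACKUP vhid 1 advbase 1 advskew 100\n"
)

print(carp_states(sample))
print(wan_should_be_up(sample, vhid=1))
```

An unknown VHID is treated as "not MASTER", so the WAN stays down if the selected CARP address is missing. That choice is an assumption made here for safety, not something the request specifies.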

Describe alternatives you considered

To achieve high availability on the WAN side, a user can add another router in front of the OPNsense instances and configure the DMZ settings accordingly. However, I would say this adds too much complexity and defeats the purpose of a high availability router setup.

Additional context

N/A

OPNsense-bot commented 6 months ago

Thank you for creating an issue. Since the ticket doesn't seem to be using one of our templates, we're marking this issue as low priority until further notice.

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

The easiest option to gain traction is to close this ticket and open a new one using one of our templates.

skl283 commented 6 months ago

Just came to this topic. I am currently in the process of implementing exactly this with the custom script you mentioned. It would be very nice if OPNsense had the option to enable/disable the WAN interface based on the CARP status of a selected interface.

Since the PPPoE connection type already has a trigger to connect/disconnect connections, I assume it should be absolutely in scope to extend this functionality to disable an interface, or to select more than one interface to disable.

image

@shehzman I'll try to use the following edited script with more logging to achieve this: https://gist.github.com/willjasen/6ae0f47bca36ced2bd52b2fefc2bc21e

oasis9 commented 5 months ago

I'm very interested in this use case. I don't need instant/seamless failover but my ISP forces me to use DHCP even though I have a static reservation. I need OPNsense to disable WAN on the backup and only enable it when the backup enters CARP master state. Someone wrote a script but it keeps getting broken by updates. Would be a lot nicer if this was natively supported.

willjasen commented 5 months ago

The script I have seems to be working for me on the latest versions of OPNsense, but I am a sample of 1. In the way that it operates, it will only enable WAN interfaces once all CARP subsystems are in the MASTER state and it will disable all WAN interfaces if even one CARP subsystem is not MASTER. For some further clarification, the colloquial use case is for WAN interfaces but the script makes no distinction other than an array of interfaces to be toggled as described above.
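The all-or-nothing rule described in this comment can be sketched in a few lines. This is not the gist's code: the function name, the state strings, and the interface names are assumptions for illustration only.

```python
def desired_interface_state(carp_states: dict, managed_interfaces: list) -> dict:
    """Return {interface: should_be_enabled} for every managed interface.

    carp_states maps a VHID (or any other CARP identifier) to its state string.
    Interfaces are enabled only if every CARP entry is MASTER. A single
    non-MASTER entry disables all of them.
    """
    # Assumption: with no CARP entries at all, nothing is enabled.
    all_master = bool(carp_states) and all(
        state == "MASTER" for state in carp_states.values()
    )
    return {iface: all_master for iface in managed_interfaces}


# Hypothetical interface names; the rule does not care whether they are WANs.
print(desired_interface_state({1: "MASTER", 2: "MASTER"}, ["wan", "opt1"]))
print(desired_interface_state({1: "MASTER", 2: "BACKUP"}, ["wan", "opt1"]))
```

Because the rule only looks at the list of interfaces it is given, it matches the comment's point that the script makes no distinction between WAN and other interfaces.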

oasis9 commented 5 months ago

@willjasen I'm a bit busy today but when I get time I'll test your solution, looks like it's just what I need. Will report back if it works for me 😊

skl283 commented 5 months ago

@AdSchellevis @fichtner could you please look at this Feature Req.?

Please also look at my post: https://github.com/opnsense/core/issues/7333#issuecomment-2027884872. I think this request is fully within OPNsense's scope and perhaps not too difficult to implement?

fichtner commented 5 months ago

I think modifying config.xml state on CARP state changes is neither desirable nor wise.

Cheers, Franco

fichtner commented 5 months ago

And before I forget...

they will need 3 static IP addresses on the WAN side so they can setup CARP

You can get by with just one CARP IP alias and IPv4 set to "none".

shehzman commented 5 months ago

I think modifying config.xml state on CARP state changes is neither desirable nor wise.

Cheers, Franco

Is this not how you modify the WireGuard interface when the CARP state changes if the user uses the Depend on (CARP) option? That’s pretty much what I’m requesting but for the WAN interface. This is coming from a place of ignorance as I know next to nothing about the inner workings of OPNsense’s code base so please correct me if I’m wrong.

shehzman commented 5 months ago

And before I forget...

they will need 3 static IP addresses on the WAN side so they can setup CARP

You can get by with just one CARP IP alias and IPv4 set to "none".

Wouldn’t the single CARP IP alias need to be static? Otherwise it sounds like you’d have to change it every time a dynamic WAN IP changes.

fichtner commented 5 months ago

Some community wireguard carp scripts have done this perhaps, but it's not good to modify configuration state and the core code supporting this does it differently for reasons mentioned.

CARP is always static?!

Cheers, Franco

shehzman commented 5 months ago

Some community wireguard carp scripts have done this perhaps, but it's not good to modify configuration state and the core code supporting this does it differently for reasons mentioned.

CARP is always static?!

Cheers, Franco

Sorry I’m a little confused. The Depend On (CARP) feature I’m referring to is an official feature in OPNsense for the WireGuard interface and not a community script. One of my OPNsense setups has a proper CARP setup with this feature. If I power off my primary OPNsense instance or manually set the CARP status to backup, the WireGuard interface on the primary instance will be disabled and the WireGuard interface on the secondary OPNsense instance will be enabled automatically. WireGuard will then be disabled on the secondary instance and enabled on the primary instance when it has a CARP status of master. Is there something specific about the WAN interface that wouldn’t allow an existing feature like the one I just highlighted to work there too?

From my understanding, CARP/VRRP in general does not support dynamic IP addresses.

fichtner commented 5 months ago

Stopped, not disabled. Runtime condition (CARP state) causes runtime modification of service (stop and start accordingly).

To illustrate what I mean from the OP's script:

https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc#file-10-wancarp-L25-L26

Runtime value causes configuration change causes a reload which could cause runtime value change which could cause configuration change which ...
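A toy model may make the difference between the two reactions easier to see. Everything in it is hypothetical (the `Box` class, both handlers, and the assumption that a reload can flip the CARP state); it is not OPNsense code and only mimics the loop described above.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Toy firewall: one persistent config flag plus one runtime-only flag."""
    config_enabled: bool = True    # stands in for a value stored in config.xml
    service_running: bool = True   # runtime state, never written to the config
    reloads: int = 0


def on_carp_runtime(box: Box, state: str) -> None:
    """Runtime reaction: stop or start the service and leave the config alone."""
    box.service_running = (state == "MASTER")


def on_carp_config(box: Box, state: str, flap_on_reload: bool, max_reloads: int = 5) -> None:
    """Config-changing reaction: rewrite the config, then reload.

    Assumption for illustration: if flap_on_reload is true, each reload
    disturbs the interface enough that CARP reports the opposite state,
    which triggers this handler again. max_reloads only stops the demo.
    """
    want = (state == "MASTER")
    if box.config_enabled == want or box.reloads >= max_reloads:
        return
    box.config_enabled = want
    box.reloads += 1
    if flap_on_reload:
        flipped = "BACKUP" if state == "MASTER" else "MASTER"
        on_carp_config(box, flipped, flap_on_reload, max_reloads)


runtime_box = Box()
on_carp_runtime(runtime_box, "BACKUP")
print(runtime_box)

config_box = Box()
on_carp_config(config_box, "BACKUP", flap_on_reload=True)
print(config_box)
```

In the runtime variant the stored configuration never changes, so there is nothing to reload. In the config variant the number of reloads is bounded only by the artificial `max_reloads` cap.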

shehzman commented 5 months ago

Stopped, not disabled. Runtime condition (CARP state) causes runtime modification of service (stop and start accordingly).

To illustrate what I mean from the OP's script:

https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc#file-10-wancarp-L25-L26

Runtime value causes configuration change causes a reload which could cause runtime value change which could cause configuration change which ...

Ahh I see. Thanks for the clarification. Looks like the proper way to have a high availability WAN interface is to have 3 static WAN IP addresses or to manually enable/disable the WAN interface on each of your OPNsense systems. You could also have a single CARP alias and set the WAN IPv4 to none like you mentioned, but I think you need to change the CARP alias IP every time your WAN IP changes, unless it’s possible to write a script to do that automatically.

Little bit of a bummer because I run an OPNsense VM on a Proxmox node at home. I’m planning to buy a second node to back up all my VMs/LXCs and to use as a backup OPNsense instance. Unfortunately, my residential ISP only distributes a single dynamic IPv4 address so I have no options for automatic WAN failover.

fichtner commented 5 months ago

If you use CARP on WAN you will definitely have a static address even if hidden behind a NAT somewhere... I'm not sure why this confuses you the way it does.

shehzman commented 5 months ago

If you use CARP on WAN you will definitely have a static address even if hidden behind a NAT somewhere... I'm not sure why this confuses you the way it does.

Maybe it’s just a gap in my networking knowledge as I’m very much still a beginner at all this (mainly a dev). From what I know about CARP/VRRP, it requires at least 3 static IP addresses. One for each system and one for the shared IP address. On the WAN side, if you don’t have 3 static IP addresses, the only workaround I know and have seen other people use is to put a router (typically the one provided by the ISP) in front of your two OPNsense instances and configure the DMZ on it to point to the shared IP address. Not the biggest fan of this approach since it requires an extra router and can add extra complexity to a setup with a double NAT (even though port forwarding is taken care of with the DMZ).

Maybe there’s another workaround I’m not familiar with or the router in front of OPNsense solution isn’t as bad as I think it is, but that’s where I’m coming from in terms of my current knowledge.

fichtner commented 5 months ago

Ok no problem. For most purposes you don't need CARP on WAN really. This is mainly a workaround for suboptimal WAN access (like one DHCP account only) or when you want traffic to flow in an HA fashion (but there are better technologies for it like OpenVPN multi home). In the bulk of use cases you want HA for your internal devices only. This is the main "router" case. The other one includes "server" side use and associated complications (that e.g. reverse proxies try to solve).

shehzman commented 5 months ago

Ok no problem. For most purposes you don't need CARP on WAN really. This is mainly a workaround for suboptimal WAN access (like one DHCP account only) or when you want traffic to flow in an HA fashion (but there are better technologies for it like OpenVPN multi home). In the bulk of use cases you want HA for your internal devices only. This is the main "router" case. The other one includes "server" side use and associated complications (that e.g. reverse proxies try to solve).

Ahh I see. I personally want CARP or some kind of automated failover on the WAN so that I can shut down/reboot my main Proxmox node when performing updates/hardware upgrades without taking down internet access for others in the house. It’s not an absolute necessity but definitely a nice to have.

fichtner commented 5 months ago

Don't you have some sort of VPN to terminate on one of the boxes? Then you really don't need the CARP on the WAN side.

CoMPaTech commented 5 months ago

The gist that was referred to earlier is something I use(d) to some extent (something else fails, so unfortunately I'm not actively using it). The setup does indeed have a single DHCP option on the WAN side, but requires HA at the hardware level. Full disclosure: there are two DHCP WAN providers for optimal breakout, and CARP decides which OPNsense box is primary and runs the DHCP client to get the WAN addresses. That way, if either OPNsense box has a hardware issue or either WAN provider has an issue, all is still well (using routing metrics, gateway monitoring, etc. for the WAN dual homing). This requires the interface change though, since one of these providers fixes by MAC address, not client identifier.

oasis9 commented 5 months ago

My ISP gives me a single reserved IPv4 address and an IPv6 prefix, but neither of these works without renewing the DHCP lease every 30 minutes; otherwise they interrupt my WAN connection until I do a new DHCP handshake. So I can't use CARP on WAN, but why should I not be able to benefit from failover of some sort? I don't care if my connections stop for a few seconds; a few seconds is better than having to manually enable WAN on my backup firewall. If modifying config.xml isn't the right way to do this, what is?

I really like the look of @willjasen's script, though I've made a comment on their gist about a call to interface_configure after unsetting the enable key in the interface's config, since this seems to prevent interface_configure from doing anything meaningful. I've got time to test this out today, so I'll try their script without those invocations and see how I go. I was wondering if interface_reset might be necessary, but I'll admit I have next to no idea what best practice is here. It would be nice if this functionality was built in like CARP is. Knowing that we can make this work using a script, why shouldn't it be implemented officially?

skl283 commented 5 months ago

Thank you @fichtner for your comments on this feature request.

I also want a solution that provides hardware and software redundancy with 2 OPNsense machines, as @CoMPaTech, @oasis9 and others have already mentioned. For dial-up devices this is already implemented, as I posted above:

image

It "only" lacks the ability to issue the DHCP request on the WAN side from just 1 OPNsense machine (the master). It should be absolutely in scope to extend the existing functionality so that only the master box performs the DHCP request, to disable an interface, or to select more than one interface to disable. The standard WireGuard implementation now also has a built-in solution, which works nicely!

The custom scripts are not the best solution, so there is a need to bring this to a good, working level.

@fichtner, @AdSchellevis and others, please look at this struggle and help bring about a good solution.

br sash

garryevanson99 commented 4 months ago

Not sure if something has changed but this is now working again for me :)

https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc

fmeppo commented 3 months ago

I've been running with scripts like this on my routers for a couple weeks now. With minor fixes this provides reasonably robust backup, but failover takes ~30 sec (as the backup router must re-enable the interface, re-acquire the DHCP lease, etc.) There's usually a noticeable hiccup when failover occurs for WAN traffic - but there isn't any for my local networks as these are all covered by CARP.

I'm wondering if a more robust solution would be to enable CARP on my WAN interface, claim a static IP equal to what DHCP hands out, then use a dhclient exit hook to reconfigure the virtual IP address on both primary and backup routers whenever the lease changes. Obviously dhclient's default behaviors (directly changing the interface, updating resolv.conf, etc.) would need to be prevented for that interface. Has anyone tried such an approach?
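The decision part of such an exit hook could look roughly like the sketch below. It assumes the hook receives the usual dhclient-script variables `reason` and `new_ip_address`; the function name and the idea of returning the address to apply are hypothetical, and how the virtual IP would actually be pushed to both routers is deliberately left out because nothing in this thread specifies it.

```python
import ipaddress
from typing import Mapping, Optional

# Reasons under which dhclient-script is assumed to carry a usable new lease.
LEASE_REASONS = {"BOUND", "RENEW", "REBIND", "REBOOT"}


def vip_to_apply(env: Mapping[str, str], current_vip: Optional[str]) -> Optional[str]:
    """Return the address the CARP VIP should move to, or None if nothing changes.

    env holds the hook's environment (for example a copy of os.environ).
    current_vip is the address currently configured as the virtual IP, if any.
    """
    if env.get("reason") not in LEASE_REASONS:
        return None
    new_ip = env.get("new_ip_address")
    if not new_ip:
        return None
    new_addr = ipaddress.ip_address(new_ip)  # raises ValueError on malformed input
    if current_vip is not None and ipaddress.ip_address(current_vip) == new_addr:
        return None  # lease renewed with the same address, nothing to do
    return str(new_addr)


# Documentation-range addresses used as made-up examples.
print(vip_to_apply({"reason": "RENEW", "new_ip_address": "203.0.113.7"}, "203.0.113.7"))
print(vip_to_apply({"reason": "RENEW", "new_ip_address": "203.0.113.9"}, "203.0.113.7"))
```

Keeping the decision separate from the action means a renewal with an unchanged address does not touch the virtual IP at all, which should limit how often both routers need to be reconfigured.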

OPNsense-bot commented 2 weeks ago

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.

LorenKeagle commented 6 days ago

This is needed in a supported fashion for anyone doing multi-WAN failover with two DHCP WAN connections. In my case, I have cable internet, and a cellular backup.

I've tried several of the scripts available online, but they all either didn't work correctly in the first place, or stopped working when the underlying method signatures changed.

Automatically enabling/disabling WAN interfaces based on CARP status seems like a no-brainer solution for those of us that do not have static IPs, but still want failover if a primary router goes down or needs to be reset for maintenance.

oasis9 commented 6 days ago

I apologise for my lack of input, I would benefit from this as a feature but I struggle to understand the inner workings enough to confidently make these sorts of changes without the input of others who have had the time and knowledge to create the solutions we have so far. It's still going to be several months before I can even start thinking about switching to a provider with true static routing (I can't get anything other than a DHCP reservation right now, so can't do CARP). I'd appreciate if this could be incorporated into OPNsense itself, even if there is a short noticeable failover. I've been using it to run my homelab and I can imagine many others have similar setups that would benefit from this functionality. I don't think this issue should be closed.