opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

NPTv6: add dynamic source addresses via track interface #5284

Closed bimbar closed 1 year ago

bimbar commented 2 years ago

Is your feature request related to a problem? Please describe.

The problem here is small-site multihoming. It's very nicely described here: https://blog.ipspace.net/2010/12/small-site-multihoming-in-ipv6-mission.html and can pretty much only be done with local ULA addresses and some sort of address translation toward the WAN interfaces. Right now it can only be done via NA(P)T66, which is the very same as NA(P)T44, and I don't think we want that anymore. The implementation problem is that with NAT you can do the following:

nat on WAN inet6 from SOME_IP_RANGE to any -> (WAN:0) port 1024:65535

while with NPT (binat) you have to specify a source and a destination network, so some sort of integration into interface newwanipv6 will be necessary.

Describe the solution you like

I would like a "Track Interface" External IPv6 Prefix Option in addition to static External IPv6 Prefix in NPTv6, pretty much just like it is in the interface configuration dialog. Specified would be:

Describe alternatives you considered

Oh, so many, but none of them work. The best one in clean IPv6 terms so far is to use two independent routers, each with its own GUA prefix, which they independently advertise into the LAN, but that only works if RFC3484bis is implemented or asymmetric routing is achieved. The one that works is to just do it like in IPv4, with local addresses for LAN and NA(P)T on the outside interface, but that's hardly the right solution.

Additional context

There's even a very nice RFC for that, namely https://datatracker.ietf.org/doc/html/rfc7157 "IPv6 Multihoming without Network Address Translation" but even they concede that NPTv6 is probably needed for now. For static prefixes this should work nicely with static NPTv6, but if the prefixes are dynamic, it falls down.

fichtner commented 1 year ago

@ivwang thanks for testing! this is correct if you use /64, but you can actually match the delegated prefix size on the WAN side, which would make the prefix ID "visible" if you use NPT on the full prefix size.

fichtner commented 1 year ago

(the ULA might have to match the prefix ID part of the address in order for this to work, I'm not entirely sure)

ivwang commented 1 year ago

@fichtner Oh. thanks for the reminder. I think this patch works and I am happy now :)

BTW, for people using fc00::/7 ULA prefixes: don't forget to allow bogon networks on the internal interface, or the auto-generated rule will block them.

Thanks again.
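To make the fc00::/7 remark concrete, here is a minimal Python sketch that checks whether an internal prefix falls inside the ULA range. The helper name `is_ula` and the example prefixes are made up for illustration and are not part of OPNsense.

```python
import ipaddress

# fc00::/7 is the Unique Local Address range mentioned in the comment above.
ULA_RANGE = ipaddress.IPv6Network("fc00::/7")


def is_ula(prefix):
    """Return True if the given IPv6 prefix lies entirely inside fc00::/7."""
    return ipaddress.IPv6Network(prefix).subnet_of(ULA_RANGE)


internal_is_ula = is_ula("fd12:3456:789a:1::/64")   # hypothetical LAN prefix
documentation_is_ula = is_ula("2001:db8::/64")      # documentation prefix, not ULA
```

An internal prefix for which this returns true is one that the bogon note above applies to.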

fichtner commented 1 year ago

Nice, thank you. Let’s ship this on 22.7.8 then :)

christiankratzer commented 1 year ago

Are we clear that this does not have to be ULA? We could have any valid prefix on the LAN from which to NAT.

The source can be anything. It does not have to be ULA.

fichtner commented 1 year ago

Yes, if you have two prefixes this can make sense. But the internal one needs to be static in one way or another, which is the case for 99% of the users waiting for this. 😊

christiankratzer commented 1 year ago

Yes, the internal prefixes are static, which is exactly my use case. I just wanted to make sure that nobody hard-coded any other assumptions, like them having to be ULA.

My use case is having stable internal prefixes for various VPN tunnels.

fichtner commented 1 year ago

Sure, makes sense. There is a possibility to also auto-detect the internal prefix (using an interface selection most likely), but we want to see how auto-detect works in practice first.

RehaagJ commented 1 year ago

I have been testing this in a Multi-WAN environment, with mixed results.

Two WAN interfaces, both dynamic (one FTTH on ix0, one DSL on pppoe0). Both are getting /56 prefixes assigned, both get GUA interface addresses assigned that are outside the assigned prefix ranges. There are files ix0_prefixv6 and pppoe0_prefixv6 in /tmp with the correct /56 prefixes.

The internal networks are using /64 ULAs now.

When I use /64 in the NPTv6 rule, the DSL provider routes the packets OK, and on test-ipv6.com I can see the prefix from the interface’s GUA combined with the lower 64 bits of the ULA, just like ivwang describes above. Using /56 in the NPTv6 rule, test-ipv6.com says that I don’t have an IPv6 address, and traceroute6 gets no replies. Using packet capture, I can see that the prefix ID is indeed used: the translated address now consists of the first 56 bits of the interface’s GUA, followed by the lower 72 bits of the ULA. The FTTH provider does not route anything in either case (both /64 and /56 on the NPTv6 rule create addresses as described above, but that provider seems to route only the /128 GUA). Both fail on test-ipv6.com.
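The address composition described in that paragraph can be reproduced with a small Python sketch. It assumes plain prefix substitution (keep the host bits, overwrite the leading bits), which matches the packet-capture observation but is not the checksum-neutral mapping of RFC 6296. All addresses are made-up documentation/ULA examples and `translate` is a hypothetical helper.

```python
import ipaddress


def translate(internal_addr, external_prefix):
    """Overwrite the leading bits of internal_addr with external_prefix.

    strict=False lets us pass an interface address plus a prefix length,
    e.g. the WAN GUA with /64 or /56, and have the host bits masked off.
    """
    net = ipaddress.IPv6Network(external_prefix, strict=False)
    host_bits = int(ipaddress.IPv6Address(internal_addr)) & int(net.hostmask)
    return ipaddress.IPv6Address(int(net.network_address) | host_bits)


ula_host = "fd12:3456:789a:1::10"      # hypothetical LAN host, subnet ID 0x01
wan_gua = "2001:db8:aaaa:bb42::1"      # hypothetical WAN interface address

with_64 = translate(ula_host, wan_gua + "/64")   # lower 64 bits of the ULA kept
with_56 = translate(ula_host, wan_gua + "/56")   # lower 72 bits of the ULA kept

print(with_64)  # 2001:db8:aaaa:bb42::10
print(with_56)  # 2001:db8:aaaa:bb01::10
```

With /56 the ULA's subnet byte (01) shows up in the translated address. If the WAN GUA lies outside the delegated prefix, as reported here, the resulting address belongs to a /56 that the provider never delegated, which would explain why nothing is routed back.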

I suppose the problem could be solved by using the delegated prefix that can be seen in the /tmp/ix0_prefixv6 and /tmp/pppoe0_prefixv6 files instead of the interface’s GUA prefix when doing the translation. Would that be possible to change, or add as an option?

christiankratzer commented 1 year ago

Using the delegated prefix instead of the WAN interface's GUA is definitely the right thing to do. That is what the delegated prefix is intended for.

fichtner commented 1 year ago

I’m unwilling to embed an explicit prefix into the rules for obvious reasons at this stage. Depending on the ISP and network gear you get the PD net on WAN or not and contributors to this issue have proven to be nonexistent 🤷‍♂️

fichtner commented 1 year ago

Fixed as in literal as in explicit as per edit. -.-

RehaagJ commented 1 year ago

Sorry, misread your comment (fixed vs explicit), so I deleted a comment. What contribution would be needed?

fichtner commented 1 year ago

Some sort of sensible progression of gradual improvement and ideas. ;) I suppose we can pick up the prefix from a LAN tracking it. I don’t want to reach for the file or ifconfig as that will be stale after a reload and cause wonkiness or the rule to disappear because the address is not available.

christiankratzer commented 1 year ago

The whole idea of prefix delegation is to give the consumer address space to use. The broadband consumer will generally not be able to use anything but the /128 of the prefix on the WAN interface. It is the dynamic delegated prefix that should be used for any kind of prefix NAT.

This is of course also the difficulty as the delegated prefix does not sit on any interface from where it could be picked easily.

There needs to be a trigger that updates the rules when the delegated prefix changes.
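As a rough illustration of such a trigger, the Python sketch below remembers the last delegated prefix and re-renders a rule only when it changes. This is not how OPNsense implements anything: `PrefixWatcher` and `render_binat` are hypothetical names, the interface and prefixes are invented, and the rule text only imitates the general shape of a pf binat line.

```python
import ipaddress


class PrefixWatcher:
    """Remember the last delegated prefix and fire a callback on change."""

    def __init__(self, on_change):
        self._current = None
        self._on_change = on_change

    def observe(self, delegated):
        """Feed the currently delegated prefix; return True if it changed."""
        prefix = ipaddress.IPv6Network(delegated)
        if prefix == self._current:
            return False
        self._current = prefix
        self._on_change(prefix)
        return True


def render_binat(interface, internal, external):
    """Render an illustrative binat-style rule line (assumed syntax)."""
    return "binat on {} inet6 from {} to any -> {}".format(
        interface, internal, external)


rules = []


def rebuild(pd):
    # Map the static internal /64 onto the first /64 of the new delegation.
    first_subnet = next(pd.subnets(new_prefix=64))
    rules.append(render_binat("pppoe0", "fd12:3456:789a:1::/64", first_subnet))


watcher = PrefixWatcher(rebuild)
watcher.observe("2001:db8:aa00::/56")   # first delegation: rule rendered
watcher.observe("2001:db8:aa00::/56")   # unchanged: nothing happens
watcher.observe("2001:db8:bb00::/56")   # renumbered: rule rendered again
```

The point of the sketch is only that the rule depends on an event carrying the new prefix, rather than on a file or interface address read at reload time.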

I know too little of how opnsense sets these things up to be able to recommend a solution. Trying to help though.

RehaagJ commented 1 year ago

Fully agree about the file, I mentioned it just to point out that the information exists somewhere. And indeed, it's not a good source. The trigger mentioned in the previous post does seem to exist already? Local interfaces using WAN tracking are already being updated when the WAN changes (not sure what exactly triggers the update, is it newwanip?). Anyway, that could be a starting point to update the rules also. I also don't know enough about the architecture, but I would like to help as well. For now, that'll be testing...

fichtner commented 1 year ago

@christiankratzer not sure what you mean. The prefix is effectively on each tracking LAN, but with a /64. however, we do have the prefix size hint from the WAN configuration. I don’t see an issue here.

@RehaagJ the filter might not reload for varying reasons and also PPPoE can be a factor of further difficulty wedged in between. So we can’t rely on the ruleset to be reloaded at all times/at the perfect times. We spent a lot of effort to remove such problems from the generated pf.conf rules over the years actually.

christiankratzer commented 1 year ago

@fichtner There are several prefixes involved in this context:

The prefix of the WAN interface, which is not relevant for this exercise. Generally the only usable address out of the WAN prefix is the single /128 on the interface.

The various statically configured prefixes of the LAN interfaces which are always /64.

The dynamic prefix (IA_PD) that is delegated from the ISP through DHCPv6 or PPPoE ( typically a /56 or /48 ).

The task would be to NAT the various LAN prefixes to subnets of the delegated prefix.

LAN1:: nat to IA_PD:1::
LAN2:: nat to IA_PD:2::
LAN3:: nat to IA_PD:3::
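The LAN-to-IA_PD mapping above can be sketched in Python as follows. The function name `plan_npt` and all prefixes are illustrative assumptions. The numbering starts at subnet 1 only to mirror the example in the comment.

```python
import ipaddress


def plan_npt(delegated, lan_prefixes):
    """Pair each static LAN /64 with the n-th /64 of the delegated prefix.

    LAN number n (counting from 1) maps to IA_PD:n::/64.
    """
    pd = ipaddress.IPv6Network(delegated)
    if pd.prefixlen > 64:
        raise ValueError("delegated prefix must be /64 or shorter")
    available = 1 << (64 - pd.prefixlen)   # number of /64s in the delegation
    plan = []
    for subnet_id, lan in enumerate(lan_prefixes, start=1):
        if subnet_id >= available:
            raise ValueError("delegated prefix too small for all LANs")
        internal = ipaddress.IPv6Network(lan)
        base = int(pd.network_address) + (subnet_id << 64)
        external = ipaddress.IPv6Network((base, 64))
        plan.append((str(internal), str(external)))
    return plan


plan = plan_npt(
    "2001:db8:aa00::/56",                       # hypothetical IA_PD
    ["fd12:3456:789a:1::/64",                   # LAN1
     "fd12:3456:789a:2::/64",                   # LAN2
     "fd12:3456:789a:3::/64"],                  # LAN3
)
for internal, external in plan:
    print(internal, "->", external)
```

When the ISP hands out a new IA_PD, only the first argument changes; the static LAN side stays untouched, which is the property this use case is after.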

christiankratzer commented 1 year ago

@fichtner

on this:

@christiankratzer not sure what you mean. The prefix is effectively on each tracking LAN, but with a /64. however, we do have the prefix size hint from the WAN configuration. I don’t see an issue here.

The whole idea of this use case is to not allow the dynamic delegated prefixes to propagate to the LAN interfaces, so that one can statically configure them with ULA or other prefixes.

This is why we need prefix NAT to be dynamic to the assigned prefix.

I get the challenge in keeping this in sync. I remember somebody writing that pfSense had a similar feature. We should investigate what they have and how they implemented it.

fichtner commented 1 year ago

I still understand the issue but not the challenge after getting a milestone done here. I don’t see a problem improving this as required by certain PD deployments. And you don’t want to look what pfSense did ;)

christiankratzer commented 1 year ago

The question I have is: does the current milestone NAT to the prefix of the WAN interface or to the assigned prefix?

What I am trying to say is that NATting to the WAN prefix instead of the delegated prefix would not serve any purpose, as only the /128 would be usable. That would require IPv4-style hide NATs instead of a prefix NAT.

I am not aware of any residential deployments where the user could use any IP other than the assigned IP on the WAN prefix.

My current setup is fully static so I am unable to test immediately. I need to build a lab setup for this with some VM to fully test.