canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Use OVN force SNAT for load balancers (and not forwards) #10654

Open tomponline opened 2 years ago

tomponline commented 2 years ago

LXD OVN networks use OVN load balancers to implement both network forwards and network load-balancers.

Normally the external source address of the packets being forwarded is passed through to the instance (i.e. no SNAT is performed).

However, for network load-balancers @stgraber has indicated that he would like the source address to be the OVN router port's internal IP address on the LAN. That way, when/if we implement an haproxy load balancer, the behavior won't change from exposing the external source address to hiding it.

In OVN this behavior appears to be controllable using the lb_force_snat_ip option on the logical_router table (https://manpages.ubuntu.com/manpages/jammy/en/man5/ovn-nb.5.html#load_balancer%20table). Setting it to "router_ip" appears to do what we want:

> If it is configured with the value router_ip, then the load balanced packet is SNATed with the IP of router port (attached to the gateway router) selected as the destination after taking the routing decision.
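
As a rough illustration of the setting described above, the following Python sketch builds (but does not run) an `ovn-nbctl set` command for the option. The router name `example-lr` and both helper functions are hypothetical, and for simplicity the sketch accepts only the literal `router_ip` or a single IP address.

```python
import ipaddress
from typing import List


def lb_force_snat_value(value: str) -> str:
    """Validate a value for options:lb_force_snat_ip (hypothetical helper).

    Accepts the literal "router_ip" or a single IPv4/IPv6 address.
    """
    if value == "router_ip":
        return value
    # ip_address() raises ValueError if the string is not a valid address.
    return str(ipaddress.ip_address(value))


def set_lb_force_snat_cmd(router: str, value: str) -> List[str]:
    """Build the argv for setting the option on a logical router."""
    return [
        "ovn-nbctl", "set", "logical_router", router,
        "options:lb_force_snat_ip=" + lb_force_snat_value(value),
    ]


if __name__ == "__main__":
    # "example-lr" is a placeholder router name, not one taken from LXD.
    print(" ".join(set_lb_force_snat_cmd("example-lr", "router_ip")))
```

The command is only assembled as a list so it can be inspected before being passed to something like `subprocess.run`.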

However, because this setting exists at the logical router level, it would also affect network forwards, which we don't want.

To work around this, the load_balancer table entries accept a skip_snat boolean which can be set to true to avoid performing SNAT. However, because the LXD network forward feature already exists, we will need to retroactively apply this setting to each existing network forward entry before setting lb_force_snat_ip=router_ip.

This will require a patch in LXD to apply on startup to fix any existing network forwards.
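
The ordering matters here: skip_snat has to be in place on every existing forward before the router-wide option is enabled. A minimal Python sketch of that ordering is below. It assumes skip_snat is set through the load balancer's `options` column; the function name, the router name and the load balancer names are all hypothetical, and the commands are only built, not executed.

```python
from typing import List


def forward_snat_migration_cmds(router: str, forward_lbs: List[str]) -> List[List[str]]:
    """Return ovn-nbctl commands in the order the startup patch would need.

    forward_lbs holds the names of the OVN load balancers that back
    existing network forwards (placeholder names in this sketch).
    """
    cmds = []
    # Step 1: exempt every existing network forward from forced SNAT.
    for lb in forward_lbs:
        cmds.append(
            ["ovn-nbctl", "set", "load_balancer", lb, "options:skip_snat=true"]
        )
    # Step 2: only then enable forced SNAT at the logical router level.
    cmds.append(
        ["ovn-nbctl", "set", "logical_router", router,
         "options:lb_force_snat_ip=router_ip"]
    )
    return cmds


if __name__ == "__main__":
    for cmd in forward_snat_migration_cmds("example-lr", ["fwd-a", "fwd-b"]):
        print(" ".join(cmd))
```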

tomponline commented 2 years ago

@stgraber do you want me to work on this before LXD 5.4 hits and OVN network forwards are released?

stgraber commented 2 years ago

> @stgraber do you want me to work on this before LXD 5.4 hits and OVN network forwards are released?

Yeah, I'd prefer we have the expected SNAT behavior in 5.4.

tomponline commented 2 years ago

OK cool, will work on it tomorrow.

tomponline commented 2 years ago

I can't get lb_force_snat_ip to work, either by specifying the SNAT IP manually or by using the literal value router_ip, both of which should be valid according to the docs. It seems that on Jammy this option is either broken or intended for a scenario that we don't have.

sudo ovn-nbctl --version
ovn-nbctl 22.03.0
Open vSwitch Library 2.17.0
DB Schema 6.1.0

tomponline commented 2 years ago

Issue opened upstream: https://github.com/ovn-org/ovn/issues/144

tomponline commented 2 years ago

Updated upstream issue with additional reproducer and fresh OVN databases.

tomponline commented 2 years ago

@dceara upstream has confirmed that lb_force_snat_ip doesn't currently work with distributed routers and only works with gateway routers pinned to a specific chassis:

> I see now why router_ip is ignored. It's because the option is only supported on gateway routers (bound to a chassis with options:chassis=): https://github.com/ovn-org/ovn/commit/c6e21a23bd8cfcf8dd8b6eb70c8b09e6f4582b2f

https://github.com/ovn-org/ovn/issues/144#issuecomment-1185413864

@dceara has suggested a possible change that would make that work with distributed routers too, but it will need a code change in OVN.
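
The constraint described above can be written as a small check. This Python sketch assumes the router's `options` column has already been read into a plain dict; the function name and the chassis name are hypothetical. It encodes only what was stated upstream: the option is honored only when the router is pinned to a chassis.

```python
from typing import Dict


def lb_force_snat_takes_effect(router_options: Dict[str, str]) -> bool:
    """Return True if lb_force_snat_ip would be honored for this router.

    Per the upstream comment, the option is only supported on gateway
    routers, i.e. routers with options:chassis set. On a distributed
    router (no chassis option) it is ignored.
    """
    has_option = bool(router_options.get("lb_force_snat_ip"))
    is_gateway_router = bool(router_options.get("chassis"))
    return has_option and is_gateway_router


if __name__ == "__main__":
    # Distributed router: option present but no chassis pinning.
    print(lb_force_snat_takes_effect({"lb_force_snat_ip": "router_ip"}))
    # Gateway router pinned to a placeholder chassis name.
    print(lb_force_snat_takes_effect(
        {"lb_force_snat_ip": "router_ip", "chassis": "example-chassis"}
    ))
```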

tomponline commented 2 years ago

Seems like this is going to be non-trivial to fix:

https://github.com/ovn-org/ovn/issues/144#issuecomment-1238391080