kubeovn / kube-ovn

A Bridge between SDN and Cloud Native (Project under CNCF)
https://kubeovn.github.io/docs/stable/en/
Apache License 2.0

[Feature Request] High Availability for vpc nat gateway #4415

Open zhangzujian opened 4 weeks ago

zhangzujian commented 4 weeks ago

Description

Implement high availability for the VPC NAT gateway, including iptables SNAT and EIP.

We can:

  1. Run a DaemonSet instead of StatefulSet;
  2. Use a Lease for leader election;
  3. Have the leader pod announce the EIP via ARP/NDP or BGP;
  4. Use the LSP option requested-chassis to change the port binding.
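
To make steps 2 and 4 more concrete, here is a minimal sketch of lease-based leader election followed by a port rebind. It is a toy in-memory model, not the Kubernetes Lease API and not kube-ovn code: the `Lease` class, `try_acquire_or_renew`, `rebind_command`, the 15-second duration, and the pod, port, and chassis names are all hypothetical. The only real interface referenced is the `ovn-nbctl set Logical_Switch_Port ... options:requested-chassis=...` command, which the sketch builds but does not run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    """Toy stand-in for a shared lease record (not the Kubernetes Lease API)."""
    holder: Optional[str] = None
    renew_time: float = 0.0
    duration: float = 15.0  # assumed: seconds the lease stays valid without renewal

    def expired(self, now: float) -> bool:
        return self.holder is None or now - self.renew_time > self.duration


def try_acquire_or_renew(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if `candidate` holds the lease after this call."""
    if lease.holder == candidate or lease.expired(now):
        lease.holder = candidate
        lease.renew_time = now
        return True
    return False


def rebind_command(lsp: str, chassis: str) -> List[str]:
    """Build (but do not run) an ovn-nbctl call that sets requested-chassis."""
    return ["ovn-nbctl", "set", "Logical_Switch_Port", lsp,
            f"options:requested-chassis={chassis}"]


lease = Lease()
# pod-a starts first and becomes leader.
a_leads = try_acquire_or_renew(lease, "pod-a", now=0.0)
# pod-b cannot take over while pod-a's lease is still valid.
b_leads_early = try_acquire_or_renew(lease, "pod-b", now=10.0)
# pod-a stops renewing (for example, its node failed); pod-b takes over after expiry.
b_leads_late = try_acquire_or_renew(lease, "pod-b", now=20.0)
# The new leader would then move the gateway port to its own chassis.
takeover_cmd = rebind_command("nat-gw-port", "node-b")
```

In a real cluster the acquire-or-renew step would have to be an atomic update against the API server rather than a local mutation, and the new leader would also need to announce the EIP (step 3) after the port binding changes.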

Who will benefit from this feature?

No response

Anything else?

No response

bobz965 commented 3 weeks ago

> 1. Run a DaemonSet instead of StatefulSet;

Do you mean to run DaemonSet pod on each node?

SkalaNetworks commented 3 weeks ago

Would running a deployment not be lighter? In any case, we need to have the content of the netfilter synced to avoid disruption when a node/nat gw pod crashes. If not possible, it will still be "HA", but there's gonna be TCP connection breakages on node/pod failures.

My use case is a NAT GW that can be "migrated" between nodes, the same way you'd migrate a VM. This is useful to drain a node for maintenance for example. Also right now, if a node crashes with a natgw on it, the nat gw will never come back because of statefulset mechanics in Kubernetes.

zhangzujian commented 3 weeks ago

> In any case, we need to have the content of the netfilter synced to avoid disruption when a node/nat gw pod crashes. If not possible, it will still be "HA", but there's gonna be TCP connection breakages on node/pod failures.

Currently we have no way to sync conntrack entries between pods, so after a node/pod failure, existing connections will be broken and clients will need to reconnect.

SkalaNetworks commented 3 weeks ago

Could we use a tool like conntrackd and only enable it on "handovers" when we want to switch traffic from one nat-gw to another? This mode could work great for migrations. Another mode could be implemented with permanent synchronization for very-HA usages where random crashes of nodes/pods should not cause any breakage.
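
The difference between the two modes can be sketched with a toy model. This is not conntrackd's API or configuration: the `Gateway` class, its methods, and the addresses are hypothetical, and a connection-tracking table is reduced to a plain dictionary mapping a flow to its SNAT translation.

```python
from typing import Dict, Optional, Tuple

# A flow is keyed by (src_ip, src_port, dst_ip, dst_port); the value is the
# translated source (SNAT ip, port). These shapes are purely illustrative.
Flow = Tuple[str, int, str, int]
Nat = Tuple[str, int]


class Gateway:
    """Toy NAT gateway holding a connection-tracking table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.table: Dict[Flow, Nat] = {}
        self.peer: Optional["Gateway"] = None  # set only in permanent-sync mode

    def track(self, flow: Flow, nat: Nat) -> None:
        """Record a new connection; replicate it if a peer is attached."""
        self.table[flow] = nat
        if self.peer is not None:
            self.peer.table[flow] = nat

    def handover_to(self, standby: "Gateway") -> None:
        """Planned migration: copy the whole table once, just before switching."""
        standby.table.update(self.table)


flow: Flow = ("10.0.0.5", 40000, "93.184.216.34", 443)
nat: Nat = ("192.0.2.10", 50000)

# Mode 1: sync only on handover. A planned migration keeps the flow...
active, standby = Gateway("gw-a"), Gateway("gw-b")
active.track(flow, nat)
active.handover_to(standby)
planned_ok = flow in standby.table

# ...but an unplanned crash (no handover step) loses it.
active2, standby2 = Gateway("gw-a"), Gateway("gw-b")
active2.track(flow, nat)
crash_ok_handover_mode = flow in standby2.table

# Mode 2: permanent sync. Every new flow is replicated as it appears,
# so the standby already knows it when the active gateway crashes.
active3, standby3 = Gateway("gw-a"), Gateway("gw-b")
active3.peer = standby3
active3.track(flow, nat)
crash_ok_permanent_mode = flow in standby3.table
```

The trade-off the sketch is meant to show: handover-only sync costs nothing during normal operation but protects only planned migrations, while permanent sync adds replication work for every connection in exchange for surviving unplanned failures.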

zhangzujian commented 3 weeks ago

> Could we use a tool like conntrackd and only enable it on "handovers" when we want to switch traffic from one nat-gw to another? This mode could work great for migrations. Another mode could be implemented with permanent synchronization for very-HA usages where random crashes of nodes/pods should not cause any breakage.

conntrackd seems ok.

SkalaNetworks commented 3 weeks ago

How do we ensure compatibility with "old" gateways? Roll out a new API version?