cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

Support migration between tunnel mode and direct routing mode #9680

Open joestringer opened 4 years ago

joestringer commented 4 years ago

Ask from Cilium Slack:

Any thoughts on how to do a migration from vxlan tunneling to direct L2 routing without segmenting during the migration? I have a demo Kubernetes cluster up with half the nodes on the old config (tunnel=vxlan) and the other half on the new config (auto-direct-node-routes=true,tunnel=disabled). Packets from the new nodes to the old are routed and received by containers on the old nodes, but replies come back vxlan-encapsulated and subsequently dropped by the new nodes, which don't have a vxlan interface. I think this could work if a) I could set up a receive-only vxlan interface on the new nodes, or b) I could convince the old nodes to send unencapsulated packets to new nodes.
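As a point of reference for idea (a), a receive-only vxlan interface is plausible with plain Linux vxlan semantics: a vxlan device bound to the UDP port decapsulates matching inbound packets even if nothing is ever transmitted through it. The sketch below is a minimal illustration using the vishvananda/netlink library; the device name, VNI, and port are illustrative assumptions, not Cilium's actual device setup.

```go
package main

import "github.com/vishvananda/netlink"

// addReceiveOnlyVxlan creates a vxlan device that can decapsulate
// inbound traffic. With no FDB entries, no remote configured, and no
// routes pointing at it, nothing is transmitted through it.
func addReceiveOnlyVxlan() error {
	link := &netlink.Vxlan{
		LinkAttrs: netlink.LinkAttrs{Name: "cilium_vxlan"}, // name borrowed from Cilium's device
		VxlanId:   2,    // hypothetical VNI for this sketch
		Port:      8472, // Linux vxlan default UDP port
		Learning:  false,
	}
	if err := netlink.LinkAdd(link); err != nil {
		return err
	}
	return netlink.LinkSetUp(link)
}
```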

I propose that this could be achieved through a two-phase process:

At its core, the new mode would be very similar to tunnel mode (for device configuration purposes, etc.), but with the difference that when learning about remote nodes, the corresponding entries would not be pushed into the cilium_tunnel_map.

In this configuration, the tunnels would remain configured, and as nodes are migrated to the intermediate new mode they would transition from sending traffic over the tunnel to sending traffic directly. The entire time, the receive side would remain configured, so nodes that have not yet upgraded can continue to transmit traffic via the tunnel and any arbitrary node is able to receive that tunneled traffic.

Once the entire cluster is migrated to the new mode, no traffic should be transmitted via the tunnels any more. After rolling the whole cluster to this mode, the tunnel could then be disabled entirely.
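To make the intermediate mode concrete, here is a minimal sketch of what its node-add handling might look like, assuming a netlink-based route programmer; nodeAddHybrid and the commented-out tunnel map update are hypothetical names for illustration, not Cilium's actual internals.

```go
package main

import (
	"fmt"
	"net"

	"github.com/vishvananda/netlink"
)

// nodeAddHybrid reacts to learning about a remote node in the proposed
// intermediate mode: install a direct route toward the peer's pod CIDR
// (as auto-direct-node-routes does), but deliberately skip the
// cilium_tunnel_map update, so egress traffic goes direct while the
// still-configured vxlan device keeps decapsulating traffic received
// from not-yet-migrated nodes.
func nodeAddHybrid(podCIDR *net.IPNet, nodeIP net.IP) error {
	route := &netlink.Route{
		Dst: podCIDR,
		Gw:  nodeIP, // direct L3 next hop to the remote node
	}
	if err := netlink.RouteReplace(route); err != nil {
		return fmt.Errorf("installing direct route to %s: %w", podCIDR, err)
	}
	// In plain tunnel mode this is where a tunnel map entry would be
	// written, e.g. tunnelMap.Update(podCIDR, nodeIP); the hybrid mode
	// omits it so the datapath no longer encapsulates on transmit.
	return nil
}
```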

I expect that given the new mode, the reverse migration could also be achieved (DR -> tunnel).

Tasks:


rktidwell commented 4 years ago

I've been thinking about how to support something like this as well. Are you suggesting adding a third mode so that we have tunnel, direct, and "hybrid" mode (I don't have a better term)? That's my understanding of the proposal. I assume the idea is to minimize packet loss across the cluster when switching modes.

I'm brainstorming here, so bear with me. I'd love to explore this in more detail; I see toggling from tunnel to DR mode as something people will want to do. Strategies that eliminate (or at least mitigate) packet loss during this transition are needed IMO. The strategy you're describing seems to assume that cilium is working with a route table that is populated rather quickly, so that pods sending traffic have it routed to the right destination. Convergence time with something like BGP isn't always fast, so unless cilium has hooks into a routing daemon you might still end up with an interim period of packet loss as your routing protocol converges. At that point, you might as well bite the bullet, jump straight from tunnel to DR mode, and accept a transient outage.

It may come across as a bit of a tangent, but hear me out. If cilium supported a native routing daemon instead of relying on something out-of-band like kube-router, you would have a place for hooks that could be used for toggling when to tunnel a packet on egress. In addition, that has the benefit of ensuring a route has been received for the destination, which keeps packets from either black-holing or going to the default route (which you may not want) while your routing protocol converges.
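To sketch what those hooks could look like, here is one possible (entirely hypothetical) interface between the datapath and an embedded routing daemon; none of these names exist in Cilium.

```go
package main

import "net"

// EgressModeHook would let a routing daemon tell the datapath when it
// is safe to stop tunneling toward a given destination.
type EgressModeHook interface {
	// RouteConverged is called once a route covering the remote pod
	// CIDR has been installed (e.g. learned via BGP); the datapath
	// could then drop that destination's tunnel map entry.
	RouteConverged(podCIDR *net.IPNet, nextHop net.IP)

	// RouteWithdrawn signals the reverse: fall back to tunneling so
	// packets are not black-holed while routing re-converges.
	RouteWithdrawn(podCIDR *net.IPNet)
}
```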

Again, I'm just brainstorming here. I'm thinking there may be a way to support this migration without a third mode of operation if you had hooks into a routing daemon. Thoughts?

joestringer commented 4 years ago

I assume the idea is to minimize packet loss across the cluster when switching modes.

That was the idea behind this issue, yeah.

I see toggling from tunnel to DR mode as something people will want to do.

I'm curious if you have more details on this. To my knowledge, most users will already have a cluster topology in place, be it on-prem or via a managed service, which provides strong direction as to which of the modes they will select. There can be a subsequent architecture decision that could drive the desire to switch (potentially implying the need for a migration path). Do you have another case in mind for such a "hybrid" mode?

If cilium supported a native routing daemon instead relying on something out-of-band like kube-router, you would have a place for hooks that could be used for toggling when to tunnel a packet on egress.

I agree that to transition seamlessly, some coordination between Cilium and the routing agent needs to be performed. I.e., Cilium would need to react to routing becoming ready in order to trigger the switch of mode. There are multiple ways to implement this: extending the Cilium API so that external routing daemons can call into Cilium; having Cilium subscribe to some external notifications and react to them; having Cilium run some routing functionality internally; or having it driven by the user, who deploys the routing daemon first and then, after convergence, triggers the change of mode in Cilium.
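As a rough illustration of the user-driven variant, the convergence check could be as simple as verifying that the kernel already holds a specific (non-default) route covering each remote pod CIDR before triggering the mode switch. The helper below is a sketch using the vishvananda/netlink library, not an existing Cilium API.

```go
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

// routeConverged reports whether some installed route other than the
// default covers the remote pod CIDR, i.e. whether the routing daemon
// has converged for that destination.
func routeConverged(podCIDR *net.IPNet) (bool, error) {
	routes, err := netlink.RouteList(nil, netlink.FAMILY_V4)
	if err != nil {
		return false, err
	}
	for _, r := range routes {
		// A nil Dst is the default route; look for a specific prefix
		// that contains the remote pod CIDR.
		if r.Dst != nil && r.Dst.Contains(podCIDR.IP) {
			return true, nil
		}
	}
	return false, nil
}
```

Once this holds on every node for every peer, the operator (or an external controller) could flip the cluster from the intermediate mode to tunnel disabled.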

pchaigno commented 3 years ago

This issue is not particularly difficult to address but still a bit too complex for a good-first-issue IMO.

ufou commented 2 weeks ago

We currently run kube-router + kube-proxy in one of our main k8s clusters; all of our other clusters are running cilium 1.15 in direct routing / no-tunnel mode, and this is working nicely for us. We now wish to migrate the remaining main cluster over, but we can't suffer downtime like the other clusters did when we switched from kube-router/proxy to cilium.

I have been running through the migration guide, which works OK when using tunnel mode, but when I change the migration config to use direct routing mode, connectivity begins to fall apart in strange (probably expected) ways.

So rather than migrating straight from kube-router/proxy to cilium in DR mode, I was hoping to first migrate to cilium in tunnel mode, then over to cilium in DR mode. When I noticed this GitHub issue, I wondered if there had been any movement on a 'hybrid' solution? Otherwise it seems like I have to either live with cilium in tunnel mode or do a full-stop migration with downtime, neither of which is desirable...