openshift / openshift-sdn

Apache License 2.0

Updating pods for ClusterNetworkCIDR change #239

Closed (danwinship closed this issue 8 years ago)

danwinship commented 8 years ago

We now let you change ClusterNetworkCIDR, but currently every pod has a route to ClusterNetworkCIDR (via dev eth0). That means that if we change ClusterNetworkCIDR, we have to update every deployed pod too (or existing pods won't be able to reach pods in the newly-added space).

I can think of three options here:

  1. Require node restarts after changing ClusterNetworkCIDR, and call "openshift-sdn-ovs Update" on every pod at startup if /run/openshift-sdn/config.env doesn't exist or has the wrong OPENSHIFT_CLUSTER_SUBNET value.
  2. Add "WatchClusterNetworkCIDR()", and have nodes call "openshift-sdn-ovs Update" on every pod when it changes. (Although, we'd still need to do the startup check too, in case ClusterNetworkCIDR changed while the node was offline.)
  3. Change things so that pods don't need to know ClusterNetworkCIDR. Meaning, change pod routing so that it uses the OVS bridge as a router rather than a switch, so the OVS rules would have to be updated to rewrite the eth_dst values of packets as it routed them. (This would also get rid of the need for cross-node ARPing.) Maybe too much change for the immediately-upcoming release?
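As a rough sketch, the startup check in option 1 could look like this (the helper name is hypothetical; the file path and variable name are the ones mentioned above):

```shell
# Succeed iff config.env exists and records the expected ClusterNetworkCIDR.
# On failure, the node would run "openshift-sdn-ovs Update" on every pod.
subnet_matches() {
    config_file=$1     # e.g. /run/openshift-sdn/config.env
    expected_cidr=$2   # the cluster's current ClusterNetworkCIDR
    [ -f "$config_file" ] &&
        grep -qxF "OPENSHIFT_CLUSTER_SUBNET=$expected_cidr" "$config_file"
}
```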

@openshift/networking, thoughts?

pravisankar commented 8 years ago

Option 3 definitely gives a lot more flexibility and avoids restarting every pod. Any security implications? Can we ensure that OVS accepts only packets intended for the given ClusterNetworkCIDR with this option?

danwinship commented 8 years ago

> Option 3 definitely gives a lot more flexibility and avoids restarting every pod.

We wouldn't have to restart every pod in any of these cases. When I said "node restarts" in option 1, I just meant "systemctl restart openshift-node", not a reboot. And beyond restarting the node service, we'd just need to run some "ip route" commands in each pod to adjust its routes.
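The per-pod fix-up would amount to something like the following (a sketch; the CIDR values are illustrative, and the commands would run inside each pod's network namespace):

```shell
# drop the route for the old ClusterNetworkCIDR, add one for the new value
ip route del 10.1.0.0/16 dev eth0
ip route add 10.128.0.0/14 dev eth0
```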

But options 1 and 2 still require doing something to every pod any time ClusterNetworkCIDR changes, while option 3 does not.

> Any security implications? Can we ensure that OVS accepts only packets intended for the given ClusterNetworkCIDR with this option?

Sure, we can still filter in whatever ways we want.

danwinship commented 8 years ago

> Change things so that pods don't need to know ClusterNetworkCIDR. Meaning, change pod routing so that it uses the OVS bridge as a router rather than a switch, so the OVS rules would have to be updated to rewrite the eth_dst values of packets as it routed them.

Meh. I started implementing this, but realized it would break non-OpenShift-managed docker containers. (We don't notice when those containers are created, so we never add OVS rules for them, and OVS therefore won't know their MAC addresses or be able to rewrite packets addressed to them.)
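For concreteness, option 3 would have needed per-pod flows along these lines (ovs-ofctl syntax; the bridge name, addresses, and port number here are illustrative, not from the actual plugin):

```shell
# act as a router: rewrite the destination MAC to the target pod's
# interface and forward, instead of switching on the MAC alone
ovs-ofctl add-flow br0 \
    "table=0,ip,nw_dst=10.1.2.5,actions=mod_dl_dst:0a:58:0a:01:02:05,output:7"
```

The mod_dl_dst action is exactly where this falls down for unmanaged containers: OVS has to know the destination MAC for every container on the bridge in order to emit such a flow.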

Since we have to restart the master to resize the cluster anyway, I'm now leaning towards option 1: require restarting the nodes too, and have them fix up the pods at startup.

dcbw commented 8 years ago

Upstream Kube is leaning towards not allowing PodCIDR (i.e., each node's slice of ClusterNetworkCIDR) changes either. When we start merging more with Kubernetes, we'd have to either fix Kube to handle this (which upstream isn't going to do) or drop the OpenShift functionality that handles it. I vote we just don't do it in the first place, at least not at runtime.

https://github.com/kubernetes/kubernetes/issues/19854

danwinship commented 8 years ago

fixed by #259