Fix setup when lbr0 doesn't exist but /run/openshift-sdn/docker-network is correct

dcbw commented 8 years ago

1) do a vagrant setup and get everything working with 'subnet' plugin 2) on a node, change /vagrant/openshift.local.config/node-openshift-node-1/node-config.yaml to multitenant 3) ip link del lbr0 4) restart openshift-node 5) fails with:

brctl addif lbr0 vlinuxbr bridge lbr0 does not exist! E1130 22:37:38.940693 16004 controller.go:39] Error executing setup script: exit status 1 F1130 22:37:38.940709 16004 start_node.go:271] SDN Node failed: Failed to start plugin: exit status 1

Which is because the thing that actually sets up lbr0 (the docker code) doesn't run if lbr0 isn't set up. Make that happen.

danwinship commented 8 years ago

We need to nail down setup_required/docker_network_config check better. If we want to support arbitrary scenarios of the form "admin fiddles with openshift's infrastructure behind its back and then expects things to work", then we have to just get rid of setup_required, and always fully destroy and recreate the network on startup.

(See also https://github.com/danwinship/openshift-sdn/commit/d93a8203, also discovered yesterday.)

sdodson commented 8 years ago

If we want to support arbitrary scenarios of the form "admin fiddles with openshift's infrastructure behind its back and then expects things to work", then we have to just get rid of setup_required, and always fully destroy and recreate the network on startup.

Would that mean we'd restart docker everytime as well? That'll break running the node inside a container with SDN support.

danwinship commented 8 years ago

Would that mean we'd restart docker everytime as well?

Yeah. I meant to imply that we didn't want to do this, so we need to figure out what we actually do support.

dcbw commented 8 years ago

Moving the setup script to Go + netlink would allow us to distinguish between EEXIST and other errors more easily. Just shelling out to brctl or /sbin/ip wouldn't let us do that. I did a bunch of work on the flannel OVS plugin branch to move the setup code to Go, so I could port that over to OSDN with the testcases and try to more closely follow the patterns you had in your setup.sh Go-ify branch.

danwinship commented 8 years ago

Moving the setup script to Go + netlink would allow us to distinguish between EEXIST and other errors more easily.

But we don't know what errors we want to distinguish. Eg, this patch makes it so we can recover from deleting lbr0, but we don't recover from deleting tun0. Why one and not the other?

OK, here's a specific proposal:

If /run/openshift-sdn/ doesn't exist, then we assume this is boot time, and we have to create everything from scratch (destroying any existing config files and sdn-related network interfaces first on the assumption that they're stale).
If /run/openshift-sdn/ does exist, then we assume this is a restart of a running node, and everything is already mostly set up correctly, and so we only want to fix things up for config changes:
- restart docker if ${mtu} has changed
- change lbr0/tun0 IP address if ${local_subnet_gateway} has changed
- change tun0 cluster network route and update /etc/openshift-sdn/config.env if ${cluster_network_cidr} has changed
- flush and recreate OVS flows if ${multitenant} has changed

Note that the "destroy everything" part of the first rule means that if you do mess up a running system by deleting lbr0 or whatever, you can do rm -rf /run/openshift-sdn; systemctl restart openshift-node to forcibly fix everything.

dcbw commented 8 years ago

I meant more that we could simply run all operations, or at least many of them in a large class, each time at startup. Doing that with the existing bash script means adding a lot of || true and pre-checks so we can distinguish between EEXIST and real errors. Same if we had to look at exit codes.

But doing it with netlink means we get the actual error code and we have a chance of figuring it out. Using any specific config as a marker of whether to set other parts up isn't really robust since they are all mostly independent. So if we want to really fix this, I think we just need to attempt to run almost all config setup every time, and selectively not run specific things that we don't need to. Netlink and useful error codes help that. Thoughts? I'm happy to work on that part.

danwinship commented 8 years ago

But is that actually useful? Is it worth adding all the extra checking-if-already-correct code, to handle cases that aren't even supposed to happen?

(And there's no plausible way we could non-destructively validate the OpenFlow rules...)

(btw, not arguing against netlink, just saying I don't think it's relevant here)

dcbw commented 8 years ago

Yeah, I don't think the OpenFlow rules are that worth checking individually. But the rest of the stuff probably is since they are easier to touch from outside, and we've run into this every few weeks. Besides OF rules we have:

1) ensure /run/openshift-sdn/docker-network has the right content 2) ensure docker is started with the right configuration (from /run/openshift-sdn/docker-network) 3) ensure lbr0 is created 4) ensure lbr0 has the right IP address 5) delete the subnet route on lbr0 6) ensure net.bridge.bridge-nf-call-iptables=0 is set 7) ensure br0 is created and has the right protocol 8) ensure vxlan0 is added to br0 and has right port number 9) ensure tun0 is added to br0 and has the right port number 10) ensure vlinuxbr/vovsbr are created, and that their txqueuelen=0 11) ensure vlinuxbr is added to lbr0 12) ensure vovsbr is added to br0 with the right port number 13) ensure tun0 has the right IP address 14) ensure there's a route for the cluster network through tun0 15) ensure docker0 doesn't exist 16) ensure IP forwarding is on

Any one of those could be wrong at startup. Some are less likely than others, but we've already made assumptions about what shouldn't change, and those have been wrong. Your proposal sounds fine to me, but I think it's too coarse-grained still.

If the argument that we should check these in a finer-grained manner sounds plausible to you, I'm further arguing that we should do that with netlink to make it even more robust, since then we don't have to deal with screenscraping, parsing errors, mis-types of the bash, etc, and we can use the netlink errors to more accurately determine when to re-configure and when not to.

danwinship commented 8 years ago

anyone else have opinions?

pravisankar commented 8 years ago

My 2 cents... User could do crazy things and can mess up the system. Every time trying to ensure setup is right during startup seems like an overkill. This will lead to slow bootup time due to additional checks and still user can fiddle after that. I like the simple approach where we do minimal checks that are essential and give the option to the user to do reset in case of problems. Approach suggested in https://github.com/openshift/openshift-sdn/pull/216#issuecomment-161016283 seems reasonable to me.

pravisankar commented 8 years ago

Can we close this PR in favor of https://github.com/openshift/openshift-sdn/pull/220 ? Later on, pkg/ipcmd and pkg/ovs can use direct libnfnetlink/libovsdb APIs instead of making cmdline calls.

dcbw commented 8 years ago

Yeah.

openshift / openshift-sdn

Fix setup when lbr0 doesn't exist but /run/openshift-sdn/docker-network is correct #216