Open bboreham opened 4 years ago
Read weaveworks/weave#2911 and saw the weave reset command.
Note that weave reset only works with Docker. Currently wksctl is installing Docker, but should probably move to containerd or something else. It would be nice not to add a new dependency when trying to solve this problem.
In other words, we should create something similar to weave reset but with a smaller footprint, more tightly aimed at the Weave-Net-on-Kubernetes case.
Got it. Thank you, Bryan.
I'm reading the weave reset logic that we should replicate.
In other words, we should create something similar to weave reset but with smaller footprint more tightly aimed at the Weave-Net-on-Kubernetes case.
What are you expecting to see here? I'm guessing it's a new Kubernetes-specific command like weave kube-reset?
I'll tentatively call this command weave kube-reset.
What I should do is to add this command before wksctl installing weave-net addon.
How do I validate that this command works correctly? By checking that the weave bridge gets deleted?
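One way to check this could be to list whatever Weave-looking devices are still present after the reset. This is only a rough sketch: leftover_weave_devices is a hypothetical helper, not part of the weave script, and the device names it matches (weave, datapath, vxlan-6784, vethwe*) are an assumption about what Weave Net creates on a node.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip -o link show` output on stdin and prints the
# names of devices that look like Weave Net leftovers.
# The list of names in the pattern is an assumption, adjust as needed.
leftover_weave_devices() {
    # Field 2 is the device name; strip any "@peer" suffix on veth devices.
    awk -F': ' '{ sub(/@.*/, "", $2); print $2 }' |
        grep -E '^(weave|datapath|vxlan-6784|vethwe.*)$' || true
}

# Usage on a node (prints nothing if the reset removed everything):
#   ip -o link show | leftover_weave_devices
```

An empty result would suggest the bridge and related devices are gone; it says nothing about iptables rules or files on disk.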
What I should do is to add this command before wksctl installing weave-net addon.
Note the addon is installed once for the cluster, whereas we want to do this 'reset' action for every node, even if the cluster has been running for a month.
Remove weave container (might be specific to Docker?)
Yes, exactly, any 'weave container' would be managed by Kubernetes.
rm -f $HOST_ROOT/var/lib/weave/weave-netdata.db
rm -f $HOST_ROOT/var/lib/weave/weavedata.db
I think one of these is ancient.
destroy bridge
And other devices: datapath, etc.
ip link del for all interface name = v${CONTAINER_NAME}pl
I guess if the code is there already. They should disappear when the owning containers disappear.
I would probably also expect it to delete the CNI config and binaries.
Maybe remove iptables rules?
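For the CNI config and binaries, something along these lines might do. It is a sketch under assumptions: remove_weave_cni is a hypothetical name, and the paths (/etc/cni/net.d/10-weave.conf, /etc/cni/net.d/10-weave.conflist, and weave-net, weave-ipam, weave-plugin-* under /opt/cni/bin) are my guess at where the Weave Net addon puts its files; they should be checked against a real node. It does not touch iptables.

```shell
#!/bin/sh
# Hypothetical helper: removes Weave Net CNI config and plugin binaries.
# $1 is an optional host root prefix (like $HOST_ROOT); empty means "/".
# All file names below are assumptions, not taken from the weave script.
remove_weave_cni() {
    root="${1:-}"
    # CNI network configuration (old and new file names)
    rm -f "$root"/etc/cni/net.d/10-weave.conf \
          "$root"/etc/cni/net.d/10-weave.conflist
    # CNI plugin binaries; an unmatched glob is harmless with rm -f
    rm -f "$root"/opt/cni/bin/weave-net \
          "$root"/opt/cni/bin/weave-ipam \
          "$root"/opt/cni/bin/weave-plugin-*
}

# Usage: remove_weave_cni "$HOST_ROOT"
```

Other plugins' files in the same directories are left alone, since only the Weave-specific names are listed.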
Roughly, I think the following code might work:
    kube-reset)
        rm -f $HOST_ROOT/var/lib/weave/weave-netdata.db >/dev/null 2>&1 || true
        rm -f $HOST_ROOT/var/lib/weave/weavedata.db >/dev/null 2>&1 || true
        destroy_bridge
        for LOCAL_IFNAME in $(ip link show | grep v${CONTAINER_IFNAME}pl | cut -d ' ' -f 2 | tr -d ':') ; do
            ip link del ${LOCAL_IFNAME%@*} >/dev/null 2>&1 || true
        done
        # require ALL_CIDRS
        collect_cidr_args "$@"
        shift $CIDR_ARG_COUNT
        for CIDR in $ALL_CIDRS ; do
            if ip addr show dev $BRIDGE | grep -qF $CIDR ; then
                ip addr del dev $BRIDGE $CIDR
                delete_iptables_rule nat WEAVE -d $CIDR ! -s $CIDR -j MASQUERADE
                delete_iptables_rule nat WEAVE -s $CIDR ! -d $CIDR -j MASQUERADE
                delete_iptables_rule filter WEAVE-EXPOSE -d $CIDR -j ACCEPT
            fi
        done
        ;;
This new weave kube-reset command might contain delete_iptables_rule for WEAVE.
wdyt?
The current problem is that I need to obtain the CIDR used by the current installation. I'm not sure what the best way to obtain that CIDR is without calling weave, as we cannot expect the weave-net binary to be running.
The kube-reset code above still needs tweaking, as I don't yet understand all the variables in it. Some might be specific to Docker, for example ${CONTAINER_NAME}.
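On the CIDR question, one option that does not need the weave binary might be to read the addresses straight off the bridge with ip. A minimal sketch, assuming the bridge still exists and is named weave; bridge_cidrs is a hypothetical helper, and what it prints is the bridge's own address with its prefix length (the form ip addr del takes), not necessarily the full pod range.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip -o -4 addr show dev <bridge>` output on stdin
# and prints each IPv4 address with its prefix length, one per line.
bridge_cidrs() {
    # In one-line (-o) output the 3rd field is "inet" and the 4th is addr/len.
    awk '$3 == "inet" { print $4 }'
}

# Usage on a node (bridge name "weave" is an assumption):
#   ALL_CIDRS=$(ip -o -4 addr show dev weave | bridge_cidrs)
```

This would have to run before the bridge is destroyed, otherwise there is nothing left to read.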
Weave Net uses a Linux bridge device, which will get an IP address assigned from the pod IP range. If you do something like remove a node from one cluster and add it to another, the bridge may retain an IP address, and that address could now duplicate the IP of a pod or another bridge.
This will cause weird failures, as ARP resolves the IP to one device or the other arbitrarily.
Maybe we could have a command to clear down Weave Net on the node at install time, a bit like kubeadm reset? See also https://github.com/weaveworks/weave/issues/2911