cni-migration is a CLI tool for migrating a Kubernetes cluster's CNI solution from Flannel (Canal) to Cilium. The tool works by running both CNIs at the same time using multus-cni. All pods are updated to attach a network interface from each CNI, and then each node is migrated to run only Cilium. This ensures that all pods can communicate on both networks at all times during the migration.
The following are the steps taken to migrate the CNI. During and after each step, inter-pod communication is regularly tested using knet-stress, which sends an HTTP request to all other knet-stress instances on all nodes. This proves bi-directional network connectivity across the cluster.
All nodes are labelled with node-role.kubernetes/canal-cilium=true, and the canal DaemonSet is patched to have a node selector on this label. Nodes are also labelled with node-role.kubernetes/cni-priority-canal=true.
The first Cilium DaemonSet runs on nodes labelled node-role.kubernetes/cilium-canal=true and writes its CNI config to 99-cilium.conf. This then runs on all nodes.
The second Cilium DaemonSet runs on nodes labelled node-role.kubernetes/cilium=true and writes its CNI config to 00-cilium.conf. This will not run until a node is being migrated.
The first Multus DaemonSet runs on nodes labelled node-role.kubernetes/cni-priority-canal=true. This has a static config that uses the Flannel CNI config for the main Pod IP network interface, with Cilium attached as an extra network interface. The resulting CNI config is written to 00-multus.conflist. This CNI config will be chosen by the Kubelet until the node has been migrated.
The second Multus DaemonSet runs on nodes labelled node-role.kubernetes/cni-priority-cilium=true. This Multus is the same as the previous one, except that it swaps the primary Pod IP to that of Cilium rather than Flannel.
Each node is then labelled with node-role.kubernetes/cni-priority-cilium=true. This changes the priority of the CNI on each node to Cilium, so that each Pod IP is in Cilium's range.
Each node is then migrated in turn, moving it off the first Cilium DaemonSet (node-role.kubernetes/cilium-canal=true). The taint added uses the label node-role.kubernetes/cilium=true, which terminates the first Cilium DaemonSet and lets the second replace it. This second Cilium DaemonSet writes its CNI config to 00-cilium.conf, which makes it the first CNI config to be selected and used by the Kubelet, so the node now uses only the Cilium CNI rather than Multus (Cilium and Canal).
Finally, the node has the label node-role.kubernetes/migrated=true added, which signals that this node has been migrated.
The cluster should now be fully migrated from Canal to Cilium CNI.
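The file names used in the steps above matter because the first CNI config file in lexical order is the one that gets used. The following Python sketch illustrates that ordering only. The function name `selected_cni_config` is hypothetical, and the set of file extensions considered is an assumption rather than something stated in this document.

```python
def selected_cni_config(filenames):
    """Return the CNI config file name that sorts first, or None.

    Assumption: candidate files are ordered by name and the first one wins.
    """
    candidates = sorted(
        name for name in filenames
        if name.endswith((".conf", ".conflist", ".json"))
    )
    return candidates[0] if candidates else None


# Before a node is migrated, the Multus config sorts ahead of the
# config written by the first Cilium DaemonSet.
before = selected_cni_config(["99-cilium.conf", "00-multus.conflist"])

# Once the second Cilium DaemonSet writes 00-cilium.conf, that file
# sorts ahead of 00-multus.conflist ("c" comes before "m").
after = selected_cni_config(
    ["99-cilium.conf", "00-multus.conflist", "00-cilium.conf"]
)

print(before)  # 00-multus.conflist
print(after)   # 00-cilium.conf
```

This is why the second Cilium DaemonSet can take over a node simply by writing its file: no existing config has to be deleted for the selection to change.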
The following requirements apply in order to run the migration.
The cni-migration tool takes an input configuration file (default --config config.yaml) that holds options for the migration.
The label keys, and the single value shared by all of them, that are used to signal each step:
canal-cilium: node-role.kubernetes.io/canal-cilium
cni-priority-canal: node-role.kubernetes.io/priority-canal
cni-priority-cilium: node-role.kubernetes.io/priority-cilium
rolled: node-role.kubernetes.io/rolled
cilium: node-role.kubernetes.io/cilium
migrated: node-role.kubernetes.io/migrated
value: "true" # used as the value to each label key
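As a rough illustration of how the shared value combines with each key, the sketch below turns the options above into `key=value` node labels. It assumes the options have already been parsed into a Python dict; the helper name `node_labels` is hypothetical and is not part of the tool.

```python
# The label options from the configuration above, as a parsed mapping.
labels_config = {
    "canal-cilium": "node-role.kubernetes.io/canal-cilium",
    "cni-priority-canal": "node-role.kubernetes.io/priority-canal",
    "cni-priority-cilium": "node-role.kubernetes.io/priority-cilium",
    "rolled": "node-role.kubernetes.io/rolled",
    "cilium": "node-role.kubernetes.io/cilium",
    "migrated": "node-role.kubernetes.io/migrated",
    "value": "true",
}


def node_labels(config):
    """Map each step name to its full 'key=value' node label."""
    value = config["value"]
    return {
        step: f"{key}={value}"
        for step, key in config.items()
        if step != "value"  # "value" is the shared value, not a label key
    }


resolved = node_labels(labels_config)
print(resolved["migrated"])  # node-role.kubernetes.io/migrated=true
```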
The file paths for each manifest bundle:
cilium: ./resources/cilium.yaml
multus: ./resources/multus.yaml
knet-stress: ./resources/knet-stress.yaml
List of resources that must exist before beginning the migration.
daemonsets:
  knet-stress:
    - knet-stress
    - knet-stress-2
deployments:
statefulsets:
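A preflight check over this structure could be sketched as follows. This is not the tool's implementation. It assumes the second-level keys are namespaces, that empty kinds parse as `None`, and that the set of existing resources has been gathered elsewhere. The function name `missing_resources` is hypothetical.

```python
def missing_resources(required, existing):
    """Return the required (kind, namespace, name) tuples not in `existing`.

    required: {kind: {namespace: [names]}}; a kind with no entries may be None.
    existing: set of (kind, namespace, name) tuples found in the cluster.
    """
    missing = []
    for kind, namespaces in required.items():
        for namespace, names in (namespaces or {}).items():
            for name in names or []:
                if (kind, namespace, name) not in existing:
                    missing.append((kind, namespace, name))
    return missing


required = {
    "daemonsets": {"knet-stress": ["knet-stress", "knet-stress-2"]},
    "deployments": None,
    "statefulsets": None,
}

# Hypothetical cluster state in which only one DaemonSet exists so far.
existing = {("daemonsets", "knet-stress", "knet-stress")}

print(missing_resources(required, existing))
# [('daemonsets', 'knet-stress', 'knet-stress-2')]
```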
List of resources which must be ready when checked throughout the migration before continuing:
daemonsets:
  kube-system:
    - canal
    - cilium
    - cilium-migrated
    - kube-multus-canal
    - kube-multus-cilium
    - kube-controller-manager
    - kube-scheduler
  knet-stress:
    - knet-stress
    - knet-stress-2
deployments:
statefulsets:
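What "ready" means is not spelled out here. One plausible reading, sketched below, treats a DaemonSet as ready when its `numberReady` status field equals `desiredNumberScheduled`. Those two field names come from the Kubernetes DaemonSet status; the readiness rule, the function names, and the sample numbers are assumptions for illustration.

```python
def daemonset_ready(status):
    """Return True when every scheduled pod of the DaemonSet is ready.

    `status` is the DaemonSet's status object as a dict. A DaemonSet
    with no pods scheduled (0 desired, 0 ready) counts as ready.
    """
    desired = status.get("desiredNumberScheduled", 0)
    ready = status.get("numberReady", 0)
    return ready == desired


def not_ready(statuses):
    """Return the sorted names of DaemonSets that are not yet ready."""
    return sorted(name for name, s in statuses.items() if not daemonset_ready(s))


# Hypothetical statuses for a three-node cluster before any node is migrated.
statuses = {
    "canal": {"desiredNumberScheduled": 3, "numberReady": 3},
    "cilium": {"desiredNumberScheduled": 3, "numberReady": 2},
    "cilium-migrated": {"desiredNumberScheduled": 0, "numberReady": 0},
}

print(not_ready(statuses))  # ['cilium']
```

Under this reading, a DaemonSet that has no nodes to run on yet does not block the migration from continuing.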
List of resources which will be removed after completing the migration successfully:
daemonsets:
  kube-system:
    - canal
    - cilium
    - kube-multus-canal
    - kube-multus-cilium
  knet-stress:
    - knet-stress
    - knet-stress-2
deployments:
statefulsets: