rancher-sandbox / cluster-api-addon-provider-fleet

Cluster API Add-on Provider for Fleet will auto register child clusters with https://fleet.rancher.io/.
Apache License 2.0

Fleet agent deployment should tolerate `node.kubernetes.io/not-ready` taint #74

Closed Danil-Grigorev closed 4 months ago

Danil-Grigorev commented 4 months ago

The fleet agent is installed in the cluster as soon as the cluster is reachable from the management side. At that point a CNI may not be installed yet, may never be installed, or it may be the fleet agent's responsibility to deliver the CNI bundle to the cluster.

For this to work, the fleet agent should tolerate the NotReady taint set on worker nodes. The taint and the corresponding node condition look like this:

  - effect: NoSchedule
    key: node.kubernetes.io/not-ready
…
  - lastHeartbeatTime: "2024-07-22T07:34:52Z"
    lastTransitionTime: "2024-07-22T07:34:51Z"
    message: 'container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady
      message:Network plugin returns error: cni plugin not initialized'
    reason: KubeletNotReady
    status: "False"
    type: Ready

manno commented 4 months ago

See https://github.com/rancher/fleet/pull/2659

Danil-Grigorev commented 4 months ago

After investigation, it seems that tolerations are not sufficient in this case (they are already present at the moment). What is needed is a way to avoid hitting the NetworkReady=false condition in the kubelet, by configuring the fleet agent to use the hostNetwork setting. This functionality is proposed in https://github.com/rancher/fleet/pull/2659
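
For illustration only, here is a minimal sketch of a Deployment that combines the toleration with `hostNetwork`. This is not the actual fleet agent manifest and not the change made in the linked PR: the names `example-agent` and `example-system` and the image are hypothetical placeholders. Only `hostNetwork`, `tolerations`, and the taint key are taken from the discussion above.

```yaml
# Hypothetical example, not the real fleet agent manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-agent          # placeholder name
  namespace: example-system    # placeholder namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-agent
  template:
    metadata:
      labels:
        app: example-agent
    spec:
      # Use the node's network namespace so the pod does not depend on
      # the CNI plugin being initialized (avoids NetworkReady=false).
      hostNetwork: true
      tolerations:
        # Allow scheduling onto nodes that still carry the not-ready taint.
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoSchedule
      containers:
        - name: agent
          image: example.invalid/agent:latest   # placeholder image
```

The toleration alone only gets the pod scheduled. The kubelet still refuses to start a pod that needs pod networking while the network plugin is not ready, which is why `hostNetwork: true` is the part that matters here.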