metal-stack / gardener-extension-provider-metal

Implementation of the gardener-extension-controller for metal-stack
MIT License

Add tolerations to DaemonSets. #268

Closed by Gerrit91 1 year ago

Gerrit91 commented 1 year ago

Otherwise, when users configure node taints, the pods of these DaemonSets are counted as misscheduled and the shoot's resources are reported as unhealthy.

For example:

...
  Conditions:
    Last Transition Time:  2022-10-05T07:19:56Z
    Last Update Time:      2022-10-05T07:19:56Z
    Message:               DaemonSet "kube-system/node-init" is unhealthy: misscheduled pods found (1)
    Reason:                DaemonSetUnhealthy
    Status:                False
    Type:                  ResourcesHealthy
    Last Transition Time:  2022-09-19T10:41:18Z
    Last Update Time:      2022-09-19T10:41:18Z
    Message:               All resources are applied.
    Reason:                ApplySucceeded
    Status:                True
    Type:                  ResourcesApplied
  Observed Generation:     1
...
majst01 commented 1 year ago

Explanation of the annotation: https://github.com/gardener/gardener/pull/4873

What happens if the ServiceAccount does not exist or is no longer valid?

Gerrit91 commented 1 year ago

We excluded the MetalLB resources from the gardener-resource-manager webhook so that they continue to use static tokens. Otherwise, rolling the speaker DaemonSet breaks the VPN connection for shoots with a single node. The rollout then cannot continue, because the new pod cannot be started while the kube-apiserver cannot reach the gardener-resource-manager webhook. The result is a shoot with both the VPN and MetalLB broken.