Closed schrodit closed 5 years ago
Thanks @schrodit for the analysis and reporting the issue.
Thanks for reporting this issue. It took me a few tries, but I could reproduce the problem. When reducing the timeout of the webhooks, hibernating seems to work as expected. PR #239
Very well analyzed!
Due to compatibility issues of timeoutSeconds
with older k8s versions, we decided to remove nodes
from the resources registered for the webhook for now.
Description
When Karydia is deployed in a Gardener shoot cluster and that cluster is hibernated (worker nodes = 0), Karydia blocks the nodes from waking up.
Karydia creates a validating and a mutating webhook, both of which receive node events. These webhooks time out when the kubelet tries to create a node, because they point to the Karydia deployment in the same cluster, which is not running (there are no worker nodes to run the pod).
The Karydia webhook has its failurePolicy set to Ignore, which should ignore the timeout when Karydia is not ready. But as described in this issue https://github.com/kubernetes/kubernetes/issues/71508#issuecomment-470126655, the HTTP call itself times out before the webhook reports back. Therefore, decreasing the timeout of the webhook could solve the problem.
cc @vpnachev
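To illustrate the timing argument, here is a minimal Python sketch of the interaction between the webhook timeout, the failure policy, and the caller's own deadline. It is a simplified model, not Karydia or Kubernetes code: the function name `admission_outcome`, the sequential-call assumption, and all numbers (30 s and 5 s webhook timeouts, a 45 s client deadline) are hypothetical values chosen only to show the effect.

```python
def admission_outcome(webhook_timeouts, failure_policy, client_timeout, backend_up=False):
    """Model one API request passing through a chain of admission webhooks.

    webhook_timeouts: per-webhook timeout in seconds (e.g. mutating, then validating).
    failure_policy:   "Ignore" or "Fail", applied when a webhook call times out.
    client_timeout:   hypothetical deadline of the caller's HTTP request, in seconds.
    backend_up:       False models the hibernated cluster (webhook pod not running).
    """
    elapsed = 0.0
    for timeout in webhook_timeouts:
        if backend_up:
            # Webhook answers immediately in this simplified model.
            continue
        # The API server waits the full webhook timeout for an unreachable backend.
        elapsed += timeout
        if elapsed >= client_timeout:
            # The caller gives up before the failure policy can take effect.
            return "client timed out"
        if failure_policy == "Fail":
            return "rejected"
        # failure_policy == "Ignore": skip this webhook and continue.
    return "admitted"


# Long webhook timeouts: the caller's deadline expires first, Ignore never helps.
print(admission_outcome([30, 30], "Ignore", client_timeout=45))
# Short webhook timeouts: both webhooks are ignored in time, the node is admitted.
print(admission_outcome([5, 5], "Ignore", client_timeout=45))
```

Under these assumptions, the first call reports a client timeout and the second is admitted, which matches the observation that reducing the webhook timeout lets the wake-up proceed.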
Steps to reproduce
Expected behavior
Gardener wakeup is not blocked