robwittman opened this issue 1 year ago
@senthilrch In 9259975 the service account for the webhook server has been removed. When I `helm upgrade`d from v0.10.0 to (my fork of) v0.10.0 with 9259975 cherry-picked, the webhook server deployment still contained the fields `serviceAccount` and `serviceAccountName` with the former value (used Helm v3.11.2). This happens because Helm patches the existing deployment manifest (unless using `--force`); the patch does not remove the field `serviceAccount` (deprecated, but kept in sync with `serviceAccountName`), and K8s re-populates `serviceAccountName` from `serviceAccount`. Consequently, the new pod could not be started because the service account no longer existed.
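Concretely, after such an upgrade the webhook-server deployment can end up carrying something like the following (a sketch; the service account name here is assumed, not taken from the chart):

```yaml
spec:
  template:
    spec:
      # deprecated field, left behind because Helm's patch does not remove it
      serviceAccount: kube-fledged-webhook-server
      # re-populated by the API server from serviceAccount
      serviceAccountName: kube-fledged-webhook-server
```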
There are two options to fix this:

1. Add the following to the Helm template:

```yaml
# ensure helm upgrade deletes the formerly used fields
serviceAccount: ""
serviceAccountName: ""
```

2. Revert the removal of the service account. In general it's good practice to use a dedicated service account instead of `default`.

I suggest option 2.
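For option 1, the empty fields would sit in the webhook-server deployment template, roughly like this (a sketch; the surrounding template content is abridged and the image tag is assumed):

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      # explicit empty values so helm upgrade clears the old fields
      serviceAccount: ""
      serviceAccountName: ""
      containers:
        - name: webhook-server
          image: senthilrch/kubefledged-webhook-server:v0.10.0  # assumed tag
```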
@senthilrch any updates on this?
Hi there. @senthilrch any updates related to the fix of this issue?
I'm using `zarf` to deploy/re-deploy `kube-fledged` packages, so, in concept, you can do a fresh installation of the helm charts by removing the previous installation with `zarf` and then creating a new installation package again with `zarf`; that should be able to deploy everything related to `kube-fledged` from zero, including the webhook server. I'm having the same issue.

Thanks
I've found a simple workaround for this issue:

Add the following to your `values.yaml` to disable the webhook server and the validating webhook:

```yaml
# Disable webhook server and validation webhook
webhookServer:
  enable: false
validatingWebhook:
  # Specifies whether a validating webhook configuration should be created
  create: false
```

This is probably not the best solution, but from what I've seen in the code, and also in the Makefile's `deploy-using-yaml` target, this is a well-known issue and the validation is probably not 100% required.
```yaml
status:
  completionTime: "2023-08-01T14:59:51Z"
  message: All requested images pulled succesfully to respective nodes
  reason: ImageCacheCreate
  startTime: "2023-08-01T14:59:42Z"
  status: Succeeded
```
When the `kube-fledged` helm chart is redeployed, if the changes don't cause the `webhook-server` component to restart, any `ImageCache` operations start failing.

It looks like this is because the webhook CA bundle is hardcoded in the helm chart, but when the webhook server is started, `init-server` generates a new CA bundle and updates the webhook configuration. When another deployment occurs, the original CA bundle is reapplied, and the webhook requests begin to fail until the webhook component is restarted again to patch the bundle.

Is there a best practice for keeping that CA bundle configured appropriately? Would support for an existing Certificate secret make sense?
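One common pattern for keeping a webhook's CA bundle in sync is to omit the hardcoded `caBundle` and let cert-manager's CA injector patch it from a Certificate resource. This is only a suggestion, not something the chart currently does; all resource, secret, and service names below are hypothetical, and it assumes cert-manager is installed in the cluster:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: kube-fledged-validating-webhook  # hypothetical name
  annotations:
    # cert-manager's cainjector keeps caBundle in sync with the
    # referenced Certificate, so redeploys no longer clobber it
    cert-manager.io/inject-ca-from: kube-fledged/kube-fledged-webhook-cert
webhooks:
  - name: validate-image-cache.kubefledged.io  # hypothetical name
    clientConfig:
      service:
        name: kube-fledged-webhook-server
        namespace: kube-fledged
      # caBundle intentionally omitted; injected by cert-manager
    admissionReviewVersions: ["v1"]
    sideEffects: None
```

With this approach, the chart never carries a stale bundle, and a redeploy simply re-triggers injection instead of overwriting the server's generated CA.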
Steps to reproduce:

1. Install the base helm chart
2. Deploy a simple image cache
3. Update the helm chart with a value that doesn't restart the webhook server

If you were to update the `ImageCache` above, the webhook errors are returned. After restarting the webhook component, they succeed again.