Thank you @belfhi - this took me a full day to find.
After turning on debug logging (which is undocumented, but thankfully the code is readable), it looks like every erroneous rejection is handled twice:
time="2023-05-31T13:03:09Z" level=debug msg="admit result: CREATE /v1, Kind=Namespace test/test user=u-8hjfn allowed=true err=<nil>"
time="2023-05-31T13:03:09Z" level=debug msg="admit result: CREATE /v1, Kind=Namespace test/test user=u-8hjfn allowed=false err=<nil>"
I have the same problem on multiple clusters. Why does rancher-webhook install the latest version automatically? This update broke workflows at my company! I run a command in a pipeline to add a project to each new namespace:
kubectl annotate namespace ${KUBE_NAMESPACE} field.cattle.io/projectId="${KUBE_RANCHER_CLUSTER_ID}:${KUBE_RANCHER_PROJECT_ID}" --overwrite=true
We are also impacted by this issue on multiple clusters. Why does rancher-webhook install the latest version automatically?
We need to keep control over these updates.
rancher version: 2.7.3, kubernetes version: 1.24.10 (v1.24.10-rancher4-1), downstream cluster: rke1
@wirwolf: FYI, I opened a separate issue regarding the automatic update: https://github.com/rancher/webhook/issues/246
This update might have to do with a recent security problem and yesterday's 2.7.4 release of Rancher. Is anyone able to check whether the same bug exists on Rancher 2.7.4 with rancher-webhook 0.3.4?
Hi, yes, I upgraded my Rancher deployment to 2.7.4 this morning and can confirm that it works again! The question remains why the Helm chart was automatically updated to a dysfunctional version in the first place.
I'm sorry that you ran into this.
This appears to be caused by how our agent image is built - unlike the core rancher image, it does not have the webhook version set, so it will pull in/deploy the latest version that it sees available in our charts repo. This coincided with our recent release of v2.7.4 to cause an upgrade for users on newer versions of Rancher.
As far as workarounds go, you can choose to manually downgrade the webhook version if you wish. If you do this, I would recommend updating the cattle-cluster-agent deployment and adding the CATTLE_RANCHER_WEBHOOK_MIN_VERSION env var set to the appropriate version (for 2.7.3 it would be 2.0.3+up0.3.3, for 2.7.2 it would be 2.0.2+up0.3.2). This will prevent the webhook from being upgraded to the 2.7.4 version (or re-upgraded if you previously downgraded it). However, this will not force a current webhook to be downgraded - you'll need to do that manually through helm or the Rancher UI.
Keep in mind that if you apply this workaround you may need to remove this value after an upgrade to a future version containing a fix. If that turns out to be necessary, we will leave an update noting how to "reverse" the effects of this workaround.
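A minimal sketch of one way to set this env var with kubectl (my own example; it assumes a 2.7.3 Rancher and that the agent container is named cluster-register, as in the patch shown later in this thread):
# pin the minimum webhook chart version the agent expects (2.7.3 -> 2.0.3+up0.3.3)
kubectl set env -n cattle-system deployment/cattle-cluster-agent -c cluster-register CATTLE_RANCHER_WEBHOOK_MIN_VERSION=2.0.3+up0.3.3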
I experienced rancher-webhook being upgraded to 0.3.5 in RMS 2.7.4, and this is my workaround:
# pin rancher-webhook to 2.0.4+up0.3.4
kubectl patch -n cattle-system deployment cattle-cluster-agent -p '{"spec":{"template":{"spec":{"containers":[{"name":"cluster-register","env":[{"name":"CATTLE_RANCHER_WEBHOOK_MIN_VERSION", "value":"2.0.4+up0.3.4"}]}]}}}}'
# get the revision of rancher-webhook-2.0.4+up0.3.4
helm history rancher-webhook -n cattle-system
# rollback to 0.3.4
helm rollback rancher-webhook <REVISION-NUMBER> -n cattle-system
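To confirm the rollback took effect (my own addition, not part of the workaround above), check the CHART column of the release listing; it should show rancher-webhook-2.0.4+up0.3.4:
# verify the deployed chart version after the rollback
helm list -n cattle-system --filter rancher-webhook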
The workaround didn't work for me. I'm having the issue where the NetApp Trident operator fails with status message: Failed to install Trident; err: failed to patch Trident installation namespace trident; admission webhook "rancher.cattle.io.namespaces" denied the request: Unauthorized. I was hoping this workaround would work in that case too, it seemed so similar...
I just want to check that I'm not missing something. Why would the CATTLE_RANCHER_WEBHOOK_MIN_VERSION variable prevent the webhook from upgrading to a newer version? There's clearly something I'm not getting here.
EDIT 1: I upgraded to Rancher v2.7.5 instead and that resolved the issue! :)
EDIT 2: No, upgrading to v2.7.5 did not fix it. I looked at the wrong cluster. Damn it.
There are two problems here.
1. Downstream clusters would get automatically upgraded to newer versions of the webhook that did not match the version of cattle-agent. This should be fixed in v2.7.5: https://github.com/rancher/rancher/issues/41730
2. Users are unable to create or update some namespaces. This could be intentional, depending on the action performed on the namespace. We are working on adding more docs to this repo that will give more transparency into what we are actually validating; this can be tracked here: https://github.com/rancher/rancher/issues/41493

Below is the validation that the webhook currently performs on namespaces.
Note: The kube-system namespace, unlike other namespaces, has a failPolicy of ignore on update calls.

- Verifies that the field.cattle.io/projectId annotation value can only be updated by users with the manage-namespaces verb on the project specified in the annotation.
- Validates that users who create or edit a PSA enforcement label on a namespace have the updatepsa verb on projects in management.cattle.io/v3 (a permission-check sketch follows this list). See the upstream docs for more information on the effect of these labels. The following labels are considered relevant for PSA enforcement: pod-security.kubernetes.io/warn-version
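A minimal sketch of checking these two verbs via impersonation; the project ID p-abc123 and the <cluster-id> namespace are placeholders, the user ID u-8hjfn is borrowed from the debug log earlier in this thread, and the commands are assumed to run against the Rancher management (local) cluster, where the projects.management.cattle.io objects live:
# does the user hold manage-namespaces on the project named in field.cattle.io/projectId?
kubectl auth can-i manage-namespaces projects.management.cattle.io/p-abc123 --as=u-8hjfn -n <cluster-id>
# does the user hold updatepsa on that project (needed to create or edit PSA labels)?
kubectl auth can-i updatepsa projects.management.cattle.io/p-abc123 --as=u-8hjfn -n <cluster-id>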
> EDIT 2: No, upgrading to v2.7.5 did not fix it. I looked at the wrong cluster. Damn it.
@lindhe I believe this trident PR should fix your issue https://github.com/NetApp/trident/pull/840/files
@KevinJoiner I believe so too.
@lindhe Didn't even notice.
~@KevinJoiner I hit this problem: flux cannot update a namespace that it previously could. Flux has cluster-admin but is still being denied. I can't add a cluster role to a project, so what's the mechanism to allow roles the ability to set the project annotation on a namespace?~
Of course, after looking at this for a while, I found the problem about 10 seconds after I submitted this comment. Apologies. It was a misconfigured projectId that took a while to find, which made it seem like a permissions issue.
As an update, a script to apply this change (and remove it later on) can be found here: https://github.com/rancherlabs/support-tools/tree/master/adjust-downstream-webhook. Apologies for the late reply.
The upstream Rancher cluster updated the downstream rancher-webhook chart to version rancher-webhook-2.0.4+up0.3.4, which broke the creation of namespaces for non-admin users. The error message was the following:

rancher version: 2.7.3
kubernetes version: 1.24.13

The workaround was to manually set the image tag back to 0.3.3.
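For completeness, a minimal sketch of setting the image tag back directly (my own illustration; it assumes the deployment and container are both named rancher-webhook, that the image is rancher/rancher-webhook, and that tags carry a v prefix; the helm rollback shown earlier in the thread is the cleaner option):
# point the webhook deployment back at the 0.3.3 image
kubectl set image -n cattle-system deployment/rancher-webhook rancher-webhook=rancher/rancher-webhook:v0.3.3
# wait for the rollout to finish
kubectl rollout status -n cattle-system deployment/rancher-webhook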