Open lindhe opened 1 year ago
We're running into the same issue after upgrading from Rancher 2.6.11 to 2.7.5. I can confirm that your workaround fixes the issue.
@lindhe: Thanks for bringing this up and creating the corresponding pull request. I can also confirm that this solves the issue in my cluster.
Does NetApp have a plan to merge this at some point? Applying these workarounds in automation is a bit cumbersome and unclean.
We're still seeing the same issue in Rancher 2.7.9 and Trident 23.10.0. Can we perhaps get an update from NetApp on this issue and the pending PR?
@nheinemans-asml Could you try with v24.10.0? It's apparently resolved there, but I have no idea which PR that was.
@lindhe I tested it with Rancher v2.9.2 and Trident 24.10.0, and it is still an issue. After applying the workaround it succeeds:
kubectl describe torc trident
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Installing 16m trident-operator.netapp.io Installing Trident
Warning Failed 3m45s (x6 over 16m) trident-operator.netapp.io Failed to install Trident; err: failed to patch Trident installation namespace netapp-trident; admission webhook "rancher.cattle.io.namespaces" denied the request: Unauthorized
Normal Installed 27s trident-operator.netapp.io Trident installed
Hi @betweenclouds This should have been fixed in 24.10.0 as part of https://github.com/NetApp/trident/commit/5824103a201cb2f1be13f9435e554ad160c829b3
Can you try setting forceInstallRancherClusterRoles: true in helm/trident-operator/values.yaml?
@sjpeeris Thank you. With forceInstallRancherClusterRoles=true the installation is successful, but only if I create a namespace named trident. Is this expected behavior?
works:
helm install netapp-trident netapp-trident/trident-operator --version 100.2410.0 --create-namespace --namespace trident --set tridentDebug=true --set forceInstallRancherClusterRoles=true
does not work:
helm install netapp-trident netapp-trident/trident-operator --version 100.2410.0 --create-namespace --namespace netapp-trident --set tridentDebug=true --set forceInstallRancherClusterRoles=true
edit:
Namespace is hard-coded here: https://github.com/NetApp/trident/blob/master/helm/trident-operator/templates/clusterrolebinding-rancher.yaml#L13
instead of a variable like here: https://github.com/NetApp/trident/blob/master/helm/trident-operator/templates/clusterrolebinding.yaml#L10
Hi @betweenclouds, you are correct. That namespace shouldn't be hard-coded. We will have this fixed in the next release. Thanks for pointing that out.
Describe the bug
When installing the Trident operator from the Helm chart in a Kubernetes cluster managed by Rancher, the operator fails because it is unable to add the PSA label pod-security.kubernetes.io/enforce: privileged on its installation namespace. This is because Rancher has a special admission webhook in place for setting PSA labels, and permission to pass it must be granted to the ServiceAccount, on top of all the other RBAC rules it needs.
Environment
helm install trident netapp-trident/trident-operator --version 23.04.0 --create-namespace --namespace trident
To Reproduce
helm repo add netapp-trident https://netapp.github.io/trident-helm-chart
helm install trident netapp-trident/trident-operator --version 23.04.0 --create-namespace --namespace trident
Check the status of the installed CRDs, the trident TridentOrchestrator object and the pods deployed:
Expected behavior
I expect it to deploy as it should and not crash. Here's an example of what it looks like when deploying successfully:
Additional context
This was already reported to Rancher's GitHub page as issue #41191. People (understandably) thought that this was a bug in Rancher, while it's more of a documentation issue on their part (in my opinion).
There's also some information available in the operator's pod logs. I don't have them easily available right now, but it basically amounts to the same message as the one displayed by the TridentOrchestrator object anyway; it fails to patch the trident namespace because the Rancher admission webhook rancher.cattle.io.namespaces denied the request (Unauthorized).
Work-around
Inspired by this comment from the issue reported to Rancher's GitHub page, applying the following manifest and then restarting the operator fixes the issue:
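A minimal sketch of what such a manifest might look like, not necessarily the exact one from the linked comment. It assumes that the missing permission is Rancher's updatepsa verb on projects in the management.cattle.io API group, and that the operator runs as a ServiceAccount named trident-operator in the trident namespace. The object names and the file name are hypothetical.

```shell
# Write the manifest to a local file so it can be reviewed before applying.
# File name and metadata.name values are hypothetical.
cat > trident-rancher-psa.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: trident-operator-psa
rules:
  # Assumption: this is the permission checked by the
  # rancher.cattle.io.namespaces admission webhook.
  - apiGroups: ["management.cattle.io"]
    resources: ["projects"]
    verbs: ["updatepsa"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: trident-operator-psa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: trident-operator-psa
subjects:
  # Assumption: ServiceAccount name and namespace of the operator;
  # adjust both if the chart was installed into another namespace.
  - kind: ServiceAccount
    name: trident-operator
    namespace: trident
EOF
```

The file could then be applied with kubectl apply -f trident-rancher-psa.yaml, and the operator restarted with something like kubectl --namespace trident rollout restart deployment trident-operator (the deployment name is likewise an assumption).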