Closed kky-fury closed 2 months ago
Hello 👋
Could you please verify if you have a ClusterRole similar to this one: https://github.com/FoundationDB/fdb-kubernetes-operator/blob/main/config/samples/deployment.yaml#L6-L19 for your operator deployment? The error that you copied says that the operator is not allowed to list nodes (and therefore cannot check the taints). If the ClusterRole exists, you have to make sure that there is a ClusterRoleBinding for your service account, similar to: https://github.com/FoundationDB/fdb-kubernetes-operator/blob/main/config/samples/deployment.yaml#L141-L152
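For readers who cannot open the links, a minimal sketch of what such a pair of objects might look like follows. This is not a copy of the sample manifest: the object names, the service account name, and the namespace are hypothetical placeholders, and the verb list (get, list, watch) is an assumption about what the operator needs to read node taints.

```yaml
# Sketch only: every name and the namespace below are hypothetical placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fdb-operator-node-reader
rules:
  # Nodes are cluster-scoped, so a namespaced Role cannot grant access to them.
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]  # assumed verb set
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fdb-operator-node-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fdb-operator-node-reader
subjects:
  # Must match the service account the operator Deployment actually runs as.
  - kind: ServiceAccount
    name: fdb-kubernetes-operator-controller-manager
    namespace: default
```

After applying something like this, `kubectl auth can-i list nodes --as=system:serviceaccount:<namespace>:<service-account>` should answer `yes` if the binding points at the right service account.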
Hello,
Thank you for your reply. Yes, we did not have that before but added it to make it work.
Is there any plan to add it to the official helm chart?
We don't maintain the helm-charts actively as they were contributed by the community (see: https://github.com/FoundationDB/fdb-kubernetes-operator/blob/main/README.md#using-helm). If you have the time to add it to the helm-charts and open a PR, that would be appreciated :)
I created one; please take a look at #2093.
What happened?
We were experimenting with using taintReplacementOptions for rotating the pods of our FDB cluster onto new nodes while upgrading our Kubernetes version. However, after applying the taints onto the nodes, the taints were not detected by the operator.
The operator logs showed the following error:
What did you expect to happen?
The taints on the nodes to be detected and the fdb-operator to automatically delete and reschedule the coordinator, log, stateless, and storage pods onto new nodes.
How can we reproduce it (as minimally and precisely as possible)?
Tainting the nodes running the FDB cluster pods with something similar to below:
Patching the FDB cluster spec with something like below:
Anything else we need to know?
We added the required permissions to the RBAC role for the resource nodes, and it fixed the issue.
Changes
We would like to merge to main if these changes are acceptable.
FDB Kubernetes operator
FDB-operator version: 1.33.0
Kubernetes version
K8s version: 1.27.12
Cloud provider
AWS, EKS