Closed: crazytaxii closed this issue 9 months ago
@crazytaxii Thanks for raising this issue. It seems that the RBAC settings for the nodelifecycle controller have been missed. Would you like to make a pull request to fix it?
/assign @crazytaxii
It has been fixed in #1884.
The full system:controller:node-controller ClusterRole for kube-controller-manager in a Kubernetes v1.27.2 cluster is:
# ...
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - delete
  - get
  - list
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - delete
  - list
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - create
  - get
  - list
  - update
- apiGroups:
  - ""
  - events.k8s.io
  resources:
  - events
  verbs:
  - create
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
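For reference, the same listing (trimmed to the rules section above) can be dumped from a running cluster with:

kubectl get clusterrole system:controller:node-controller -o yaml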
Compare this with the ClusterRole of yurt-manager (v1.4):
# ...
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  # - delete # missing one
  - get
  - list
  - patch
  - update
  - watch # extra one
# - apiGroups: # missing one
#   - ""
#   resources:
#   - nodes/status
#   verbs:
#   - patch
#   - update
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - create # extra one
  - delete
  - get
  - list
  - patch # extra one
  - update # extra one
  - watch # extra one
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  # - patch # missing one
  - update
# - apiGroups: # missing one
#   - networking.k8s.io
#   resources:
#   - clustercidrs
#   verbs:
#   - create
#   - get
#   - list
#   - update
# - apiGroups: # missing one
#   - ""
#   - events.k8s.io
#   resources:
#   - events
#   verbs:
#   - create
#   - patch
#   - update
# ...
But the node lifecycle controller in yurt-manager certainly differs a lot from the one in kube-controller-manager v1.27.2.
@crazytaxii Except for the networking.k8s.io/clustercidrs resource, the other missing RBAC settings should be added to yurt-manager, because networking.k8s.io/clustercidrs is used by the node ipam controller in kube-controller-manager and is not needed by the nodelifecycle controller.
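Concretely, the rules to append to yurt-manager's ClusterRole would look roughly like the sketch below: everything flagged as missing in the comparison above, minus clustercidrs. This is only a sketch; the actual change is in #1884 and may be structured differently. RBAC rules are additive, so these can sit alongside the existing rules:

- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - delete # the existing nodes rule lacks delete
- apiGroups:
  - ""
  resources:
  - nodes/status # the status subresource is granted separately from nodes
  verbs:
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - patch # the existing pods/status rule only grants update
- apiGroups:
  - ""
  - events.k8s.io
  resources:
  - events
  verbs:
  - create
  - patch
  - update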
What happened: A node always stays in Ready status after the kubelet on it is stopped, even after the node itself is shut down. Because of this bug, the Pods on it cannot be migrated to other nodes.
What you expected to happen: The abnormal node should be updated to NotReady status.
How to reproduce it (as minimally and precisely as possible): Stop the kubelet on a node.
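For example, assuming the kubelet runs as a systemd service on the node:

systemctl stop kubelet            # run on the affected node
kubectl get node <node-name> -w   # watch from a client machine; STATUS stays Ready instead of turning NotReady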
Anything else we need to know?: Error log in yurt-manager's node lifecycle controller:
nodes/status is a subresource; it should also be added to the ClusterRole of yurt-manager.
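In RBAC, a subresource is granted by naming it explicitly as resource/subresource, so a grant matching the kube-controller-manager rule above would look like:

- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
  - update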
Environment:
Kubernetes version (use kubectl version): v1.27.2
/kind bug