vmware / cloud-director-named-disk-csi-driver

Container Storage Interface (CSI) driver for VMware Cloud Director

[BUG] in deployment Manifests? #275

Open 0hlov3 opened 6 months ago

0hlov3 commented 6 months ago

Describe the bug

Hi there,

I deployed the CSI driver with the manifest manifests/csi-node.yaml, and afterwards I got a lot of failures in my cluster:

reflector.go:324] k8s.io/client-go/informers/factory.go:134: failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "persistentvolumes" in API group "" at the cluster scope
reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "pods" in API group "" at the cluster scope

I think the problem is that the default ClusterRole for csi-node only contains:

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-nodeplugin-role
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

I would fix this myself, but I don't know what the csi-node pods really need. :(

Would you like to take a look, please?
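
From the forbidden errors above, my best guess is that the role is missing at least read access to persistentvolumes, persistentvolumeclaims and pods. A rough, untested sketch of an extended role; the added rules are only my assumption from the log lines:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-nodeplugin-role
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  # Assumed additions, derived from the "forbidden" log lines above;
  # the exact verbs should be checked against the csi-resizer sidecar's RBAC.
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]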

Reproduction steps

  1. kubectl apply -f https://raw.githubusercontent.com/vmware/cloud-director-named-disk-csi-driver/1.6.0/manifests/csi-node.yaml

Expected behavior

I expect that when I install or update the CSI driver, everything works.

Additional context

No response

vitality411 commented 6 months ago

I have the same issue. As a workaround, I extended the ClusterRoleBinding csi-resizer-binding in the csi-controller.yaml manifest with the csi-vcd-node-sa ServiceAccount:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-resizer-binding
subjects:
  - kind: ServiceAccount
    name: csi-vcd-controller-sa
    namespace: kube-system
  - kind: ServiceAccount
    name: csi-vcd-node-sa
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: csi-resizer-role
  apiGroup: rbac.authorization.k8s.io
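
Whether the extended binding actually grants the access can be checked by impersonating the ServiceAccount (plain kubectl, using only the names from the manifest above):

kubectl auth can-i list persistentvolumes --as=system:serviceaccount:kube-system:csi-vcd-node-sa
kubectl auth can-i list pods --as=system:serviceaccount:kube-system:csi-vcd-node-sa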

After that, the csi-resizer was able to list the resources:

csi-vcd-nodeplugin-2tfhc csi-resizer I0502 13:43:08.475791       1 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134
csi-vcd-nodeplugin-2tfhc csi-resizer I0502 13:43:09.879953       1 reflector.go:255] Listing and watching *v1.PersistentVolumeClaim from k8s.io/client-go/informers/factory.go:134
csi-vcd-nodeplugin-2tfhc csi-resizer I0502 13:43:11.936945       1 reflector.go:255] Listing and watching *v1.PersistentVolume from k8s.io/client-go/informers/factory.go:134
arunmk commented 6 months ago

@0hlov3 @vitality411 could you delete and recreate the resources? For example:

kubectl delete -f https://raw.githubusercontent.com/vmware/cloud-director-named-disk-csi-driver/1.6.0/manifests/csi-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/vmware/cloud-director-named-disk-csi-driver/1.6.0/manifests/csi-controller.yaml

That should delete the old role and create a new one.

0hlov3 commented 6 months ago

@0hlov3 @vitality411 could you delete and recreate the resources? For example:

kubectl delete -f https://raw.githubusercontent.com/vmware/cloud-director-named-disk-csi-driver/1.6.0/manifests/csi-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/vmware/cloud-director-named-disk-csi-driver/1.6.0/manifests/csi-controller.yaml

That should delete the old role and create a new one.

Okay, but it seems that the csi-resizer is also part of https://github.com/vmware/cloud-director-named-disk-csi-driver/blob/1.6.0/manifests/csi-node.yaml, and the csi-node pods are not using the roles from csi-controller; they are using this role:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-nodeplugin-role
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
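
To see which role the node ServiceAccount is actually bound to, something like this should work (assuming the default object names from the manifests):

kubectl get clusterrolebinding csi-nodeplugin-binding -o yaml
kubectl describe clusterrole csi-nodeplugin-role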

But I will go ahead and give it a try later.

vitality411 commented 6 months ago

@arunmk That does not work for me:

csi-vcd-nodeplugin-7s4tr csi-resizer I0503 05:28:11.488060       1 reflector.go:255] Listing and watching *v1.PersistentVolumeClaim from k8s.io/client-go/informers/factory.go:134
csi-vcd-nodeplugin-7s4tr csi-resizer W0503 05:28:11.490523       1 reflector.go:324] k8s.io/client-go/informers/factory.go:134: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
csi-vcd-nodeplugin-7s4tr csi-resizer E0503 05:28:11.490551       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
arunmk commented 6 months ago

@vitality411 The same should be done with the node manifest as well: a delete and reapply. Did you hit the error after doing that?

vitality411 commented 6 months ago

@arunmk Yes, same issue:

csi-vcd-nodeplugin-24z9g csi-resizer I0506 05:14:33.490748       1 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134
csi-vcd-nodeplugin-24z9g csi-resizer W0506 05:14:33.491907       1 reflector.go:324] k8s.io/client-go/informers/factory.go:134: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "pods" in API group "" at the cluster scope
csi-vcd-nodeplugin-24z9g csi-resizer E0506 05:14:33.491927       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:csi-vcd-node-sa" cannot list resource "pods" in API group "" at the cluster scope

IMHO the issue is obvious: the csi-vcd-nodeplugin DaemonSet uses the csi-vcd-node-sa ServiceAccount, and the ClusterRoleBinding csi-nodeplugin-binding binds that ServiceAccount to the ClusterRole csi-nodeplugin-role, which only allows access to the events resource.
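
That chain can be verified directly (assuming the default names and the kube-system namespace):

kubectl -n kube-system get daemonset csi-vcd-nodeplugin -o jsonpath='{.spec.template.spec.serviceAccountName}'
kubectl get clusterrolebinding csi-nodeplugin-binding -o jsonpath='{.roleRef.name}'

So either csi-nodeplugin-role needs the additional rules, or the csi-resizer sidecar does not belong in the node DaemonSet.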