kubernetes-sigs / alibaba-cloud-csi-driver

CSI Plugin for Kubernetes, Support Alibaba Cloud EBS/NAS/OSS/CPFS
Apache License 2.0
K8s master node error: failed to create newCsiDriverClient: driver name nasplugin.csi.alibabacloud.com not found in the list of registered CSI drivers #1052

Closed fanchibaoLiu closed 5 months ago

fanchibaoLiu commented 5 months ago

After installing the CSI plugin in a Kubernetes cluster built with kubeadm, the master node reports an error: failed to create newCsiDriverClient: driver name nasplugin.csi.alibabacloud.com not found in the list of registered CSI drivers

The worker nodes have no problem and can mount volumes normally.

Events: Warning FailedMount 46s (x58 over 102m) kubelet MountVolume.MountDevice failed for volume "nas-176f411d-a710-4954-b064-e3c2add7817e" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name nasplugin.csi.alibabacloud.com not found in the list of registered CSI drivers

# kubectl get csinodes.storage.k8s.io
NAME                    DRIVERS   AGE
kubernetes-master-001   0         30d
kubernetes-node-001     1         30d
kubernetes-node-002     1         30d
kubernetes-node-003     1         30d
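
To pick out the affected nodes without reading the table by eye, the listing above can be filtered for rows whose DRIVERS count is 0. This is only a minimal sketch, assuming the default three-column table output of `kubectl get csinodes`; the function name `nodes_without_drivers` is made up for illustration.

```shell
# Hypothetical helper: read `kubectl get csinodes` table output on stdin
# and print the name of every node that has no CSI driver registered.
nodes_without_drivers() {
  # NR > 1 skips the header row; column 2 is the DRIVERS count.
  awk 'NR > 1 && $2 == 0 { print $1 }'
}

# Typical use on a machine with cluster access:
#   kubectl get csinodes.storage.k8s.io | nodes_without_drivers
```

A node that shows up here has a kubelet that has not accepted any CSI driver registration, whether or not a csi-plugin pod is running on it.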

Environment:

Can you help me take a look at this issue? Are there any special restrictions on the master node?

huww98 commented 5 months ago

This is too old, and we do not provide support for personal blog posts. Please refer to https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/docs/install.md

fanchibaoLiu commented 5 months ago

Thank you. But if I uninstall the old version and then install the new version, will it affect the volumes that were previously created and mounted?

If it has no impact on the data and does not affect the pods that currently have NAS volumes mounted, then I think this can be done.

huww98 commented 5 months ago

Running Pods are usually not affected.

The events you posted seem to indicate that the csi-plugin is not running on your master node. Please check its log.

fanchibaoLiu commented 5 months ago

pod:

[root@kubernetes-master-001 ~]# kubectl get pod -o wide -n kube-system | grep csi 
csi-plugin-5tvvd                                2/2     Running   0               31d     kubernetes-node-003     <none>           <none>
csi-plugin-5xwnj                                2/2     Running   0               31d     kubernetes-node-002     <none>           <none>
csi-plugin-jghj6                                2/2     Running   0               22h     kubernetes-master-001   <none>           <none>
csi-plugin-qlkg6                                2/2     Running   0               31d     kubernetes-node-001     <none>           <none>
csi-provisioner-65b5c6bc84-zk6bc                2/2     Running   0               30d     kubernetes-master-001   <none>           <none>

master csi logs:

[root@kubernetes-master-001 ~]# kubectl logs -f -n kube-system csi-plugin-jghj6 -c csi-plugin
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    12  100    12    0     0    824      0 --:--:-- --:--:-- --:--:--   857
Running nas plugin....
time="2024-05-07T11:48:01+08:00" level=info msg="Multi CSI Driver Name: nas, nodeID: kubernetes-master-001, endPoints: unix://var/lib/kubelet/csi-plugins/driverplugin.csi.alibabacloud.com-replace/csi.sock"
time="2024-05-07T11:48:01+08:00" level=info msg="CSI Driver Branch: 'master', Version: 'v1.18.8.47-906bd535-aliyun', Build time: '2021-05-13-20:56:55'\n"
time="2024-05-07T11:48:01+08:00" level=info msg="Create Stroage Path: /var/lib/kubelet/csi-plugins/nasplugin.csi.alibabacloud.com/controller"
time="2024-05-07T11:48:01+08:00" level=info msg="Create Stroage Path: /var/lib/kubelet/csi-plugins/nasplugin.csi.alibabacloud.com/node"
time="2024-05-07T11:48:01+08:00" level=info msg="CSI is running status."
time="2024-05-07T11:48:01+08:00" level=info msg="Metric listening on address: /healthz"
time="2024-05-07T11:48:01+08:00" level=info msg="Driver: nasplugin.csi.alibabacloud.com version: 1.0.0"
I0507 11:48:01.862318   20214 driver.go:93] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0507 11:48:01.862327   20214 driver.go:81] Enabling controller service capability: CREATE_DELETE_VOLUME
I0507 11:48:01.862332   20214 driver.go:81] Enabling controller service capability: PUBLISH_UNPUBLISH_VOLUME
I0507 11:48:01.862336   20214 driver.go:81] Enabling controller service capability: EXPAND_VOLUME
W0507 11:48:01.862346   20214 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2024-05-07T11:48:01+08:00" level=info msg="Metric listening on address: /metrics"
time="2024-05-07T11:48:01+08:00" level=info msg="Not found configmap named as csi-plugin under kube-system, with error: configmaps \"csi-plugin\" not found"
time="2024-05-07T11:48:01+08:00" level=info msg="Describe node kubernetes-master-001 and set RunTimeClass to runc"
time="2024-05-07T11:48:01+08:00" level=info msg="Node InternalIP is: 10.xxx.xxx.92"
time="2024-05-07T11:48:01+08:00" level=info msg="NAS Global Config: { false false false false runc kubernetes-master-001 10.xxx.xxx.92  false 0xc0004cc2c0 <nil>}"
time="2024-05-07T11:48:01+08:00" level=error msg="GetSTSToken: request roleInfo with error: parse \"http://100.100.100.200/latest/meta-data/ram/security-credentials/<?xml version=\\\"1.0\\\" encoding=\\\"iso-8859-1\\\"?>\\n<!DOCTYPE html PUBLIC \\\"-/W3C/DTD XHTML 1.0 Transitional/EN\\\"\\n         \\\"http:/www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\\\">\\n<html xmlns=\\\"http:/www.w3.org/1999/xhtml\\\" xml:lang=\\\"en\\\" lang=\\\"en\\\">\\n <head>\\n  <title>404 - Not Found</title>\\n </head>\\n <body>\\n  <h1>404 - Not Found</h1>\\n </body>\\n</html>\\n\": net/url: invalid control character in URL"
time="2024-05-07T11:48:01+08:00" level=info msg="Get AK: use STS"
time="2024-05-07T11:48:01+08:00" level=info msg="NewControllerServer: current provisioenr nas limit is 2"
W0507 11:48:01.912569   20214 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0507 11:48:01.913267   20214 server.go:108] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/kubelet/csi-plugins/nasplugin.csi.alibabacloud.com/csi.sock", Net:"unix"}

When creating a new mount volume, there are no new logs; only the pod event shows the mount failure. The same logs currently appear on the other nodes as well.
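
One way to see which socket the plugin is actually serving is to pull the listen address out of its log. The snippet below is a small sketch that assumes the "Listening for connections on address" line format shown in the log above; `csi_socket_from_log` is a hypothetical helper name.

```shell
# Hypothetical helper: read csi-plugin log lines on stdin and print the
# unix socket path from the "Listening for connections" line, if any.
csi_socket_from_log() {
  sed -n 's/.*Listening for connections on address:.*Name:"\([^"]*\)".*/\1/p'
}

# Typical use, with the pod name taken from the listing in this thread:
#   kubectl logs -n kube-system csi-plugin-jghj6 -c csi-plugin | csi_socket_from_log
```

If the printed path does not sit under the directory that kubelet uses as its root on that node, kubelet has no way to reach the socket.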

huww98 commented 5 months ago

> When creating a new mount volume, there are no new logs, only the pod event shows mount failure.

That may indicate that your kubelet is not using /var/lib/kubelet as its root directory. You can check the log of the registrar container.
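
A quick way to test this hypothesis is to look at how kubelet was started on the node. The sketch below assumes the root directory is set with the `--root-dir` flag on the kubelet command line (kubelet falls back to /var/lib/kubelet when the flag is absent) and that paths contain no spaces or glob characters; `kubelet_root_dir` is a hypothetical helper name.

```shell
# Hypothetical helper: print the kubelet root directory implied by a
# kubelet command line passed as a single string.
kubelet_root_dir() {
  root=/var/lib/kubelet   # kubelet's default when --root-dir is not given
  prev=
  # Unquoted on purpose: split the command line into one word per argument.
  for arg in $1; do
    if [ "$prev" = "--root-dir" ]; then
      root=$arg            # form: --root-dir /path
    fi
    case $arg in
      --root-dir=*) root=${arg#--root-dir=} ;;   # form: --root-dir=/path
    esac
    prev=$arg
  done
  # Drop one trailing slash so /data/kubelet/ and /data/kubelet compare equal.
  case $root in
    /) ;;
    */) root=${root%/} ;;
  esac
  printf '%s\n' "$root"
}

# On a node, something like this feeds it the live command line
# (/proc/<pid>/cmdline is NUL-separated, hence the tr):
#   kubelet_root_dir "$(tr '\0' ' ' < "/proc/$(pidof kubelet)/cmdline")"
```

Running it on the master and on a worker node and comparing the two results shows whether the nodes disagree about the root directory.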

fanchibaoLiu commented 5 months ago

Yes, I have customized and specified the path for kubelet here: /data/kubelet/

The pod that mounts the volume stays in the ContainerCreating state:

grafana-75f94d9d88-ndfp5                           0/1     ContainerCreating   0          2m29s

# kubectl logs -f -n loki grafana-75f94d9d88-ndfp5 
Error from server (BadRequest): container "grafana" in pod "grafana-75f94d9d88-ndfp5" is waiting to start: ContainerCreating

# describe
Events:
  Type     Reason       Age                 From               Message
  ----     ------       ----                ----               -------
  Normal   Scheduled    110s                default-scheduler  Successfully assigned loki/grafana-75f94d9d88-ndfp5 to kubernetes-master-001
  Warning  FailedMount  47s (x8 over 110s)  kubelet            MountVolume.MountDevice failed for volume "nas-1ca03f15-eadb-4908-b6a4-15460bb6480c" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name nasplugin.csi.alibabacloud.com not found in the list of registered CSI drivers

fanchibaoLiu commented 5 months ago

I checked again: the kubelet root directory is /data/kubelet on the master and /var/lib/kubelet on the worker nodes, while the csi-plugin mounts /var/lib/kubelet.

That mismatch is not right. The problem should be right here.
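
The comparison described above can be written down as a tiny check. This is a sketch under the assumption of the usual layout, in which kubelet watches `<root>/plugins_registry` for registration sockets and the registrar sidecar writes its socket under the host path mounted into the csi-plugin pod; `check_csi_root` is a hypothetical helper name.

```shell
# Hypothetical helper: compare the kubelet root dir on a node ($1) with the
# host path that the csi-plugin pod mounts ($2).
check_csi_root() {
  kubelet_root=${1%/}   # ignore a single trailing slash
  plugin_root=${2%/}
  if [ "$kubelet_root" = "$plugin_root" ]; then
    echo "OK: both use $kubelet_root"
    return 0
  fi
  echo "MISMATCH: kubelet watches $kubelet_root/plugins_registry, plugin uses $plugin_root/plugins_registry"
  return 1
}

# The two situations from this thread:
#   check_csi_root /data/kubelet /var/lib/kubelet      (master)
#   check_csi_root /var/lib/kubelet /var/lib/kubelet   (worker nodes)
```

When the check reports a mismatch, either the kubelet root directory or the host paths in the csi-plugin manifest need to change so that both sides agree.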

fanchibaoLiu commented 5 months ago

Indeed. I have changed all the kubelet root directories to /var/lib/kubelet.

Mounting again succeeded! Thank you very much for the pointer.