kubernetes-sigs / vsphere-csi-driver

vSphere storage Container Storage Interface (CSI) plugin
https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html
Apache License 2.0

CSINode not created #124

Closed mattrcampbell closed 3 years ago

mattrcampbell commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened: I am following the instructions provided at https://cloud-provider-vsphere.sigs.k8s.io/tutorials/kubernetes-on-vsphere-with-kubeadm.html in a new, clean Kubernetes v1.16.3 cluster running on RancherOS, managed with the Rancher UI v2.3.3.

After correcting the tolerations on the pods so they run with the Rancher-applied taints, the workloads vsphere-cloud-controller-manager, vsphere-csi-controller, and vsphere-csi-node all start and appear healthy.

All checks pass except one:

$ kubectl get CSINode

returns:

No resources found in default namespace.

What you expected to happen:

I would expect it to return one node for the one worker node configured in the cluster that has the vsphere-csi-node pod deployed.
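
For reference, a minimal way to check whether a driver has registered on a node is to dump the CSINode object and look for an entry under spec.drivers (the node name below is a placeholder):

kubectl get csinode <node-name> -o yaml
# On a healthy node this shows something like:
# spec:
#   drivers:
#   - name: csi.vsphere.vmware.com
#     nodeID: <node-name>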

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

xing-yang commented 4 years ago

It seems that you may have skipped some of the early setup steps. In the section "Setting up VMs and Guest OS", there are some important steps such as setting disk.EnableUUID=1 on all nodes, upgrading the VM hardware version, disabling swap, etc. Have you done those steps?
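
For example, disk.enableUUID can be set per VM with govc and swap disabled on each guest; a sketch only, with a placeholder inventory path, assuming govc is configured against the vCenter:

# Expose consistent disk UUIDs to the guest OS (typically done with the VM powered off)
govc vm.change -vm '/Datacenter/vm/k8s-worker-1' -e="disk.enableUUID=1"

# Disable swap on the node (and remove any swap entries from /etc/fstab)
sudo swapoff -a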

mattrcampbell commented 4 years ago

Yes, I have. What I meant by "starting at that line" is that that is where I started following the article verbatim. Prior to that I completed all the prerequisites, adapted for the specific variances required by RancherOS. disk.EnableUUID=1 is enabled on all nodes and they are all running hardware version 15.

xing-yang commented 4 years ago

Can you provide logs from kubelet and vSphere CSI driver?

mattrcampbell commented 4 years ago

When you say vSphere CSI driver logs, what do you mean? Is that from a particular pod? Also, my kubelet logs are massive; is there any way to pull the specific information that would be helpful for this issue, or is it typical to just dump the whole thing in?

xing-yang commented 4 years ago

For vSphere CSI driver, find the names of the controller and node pods:

kubectl get pod --namespace=kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
vsphere-csi-controller-bc8bb7599-crvjh   6/6     Running   2          30h
vsphere-csi-node-mjljq                   3/3     Running   0          30h

Get logs from the driver containers:

kubectl logs vsphere-csi-controller-bc8bb7599-crvjh --namespace=kube-system vsphere-csi-controller
kubectl logs vsphere-csi-controller-bc8bb7599-crvjh --namespace=kube-system vsphere-syncer
kubectl logs vsphere-csi-node-mjljq --namespace=kube-system vsphere-csi-node

Also from node-driver-registrar:

kubectl logs vsphere-csi-node-mjljq --namespace=kube-system node-driver-registrar
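
If a container has restarted (as the RESTARTS column above shows for the controller pod), the previous instance's logs can also help; a sketch using the same example pod name:

kubectl logs vsphere-csi-controller-bc8bb7599-crvjh --namespace=kube-system vsphere-csi-controller --previous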

xing-yang commented 4 years ago

For the kubelet logs, search for the following strings first: "CSINode", "CSI driver", "node updating node".
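
A sketch of that search, assuming kubelet logs are available via journalctl; on RancherOS/RKE the kubelet runs as a Docker container (typically named kubelet), so docker logs on that container is the rough equivalent:

journalctl -u kubelet --no-pager | grep -iE 'CSINode|CSI driver|node updating node'
docker logs kubelet 2>&1 | grep -iE 'CSINode|CSI driver|node updating node'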

mattrcampbell commented 4 years ago

I have attached the vSphere logs. As for those particular strings, none of them appear in the kubelet logs. Searching for lower-case 'csi' shows the typical workload/volume startup entries, but that is all.

Thanks for your help on this!

controller.log node.log registrar.log syncer.log

xing-yang commented 4 years ago

In my node driver registrar logs, I have the following

I0114 15:02:19.522797       1 main.go:137] CSI driver name: "csi.vsphere.vmware.com"
I0114 15:02:19.606274       1 node_register.go:58] Starting Registration Server at: /registration/csi.vsphere.vmware.com-reg.sock
I0114 15:02:19.607290       1 node_register.go:67] Registration Server started at: /registration/csi.vsphere.vmware.com-reg.sock
I0114 15:02:20.172374       1 main.go:77] Received GetInfo call: &InfoRequest{}
I0114 15:02:21.168849       1 main.go:77] Received GetInfo call: &InfoRequest{}
I0114 15:02:21.202683       1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

In your logs, the output ends at "Registration Server started at...". It looks like the node registration didn't complete successfully.

Can you enable debug logging for the node CSI driver so we can get some debug messages in node.log? Also, can you upload the kubelet logs?

mattrcampbell commented 4 years ago

When you say "enable debug", do you simply mean increase the -v value in the container args? If so, it did not change much:

I0116 19:21:28.088523 1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0116 19:21:28.088537 1 connection.go:151] Connecting to unix:///csi/csi.sock
I0116 19:21:29.089057 1 main.go:127] Calling CSI driver to discover driver name
I0116 19:21:29.089077 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0116 19:21:29.089083 1 connection.go:181] GRPC request: {}
I0116 19:21:29.090863 1 connection.go:183] GRPC response: {"name":"csi.vsphere.vmware.com","vendor_version":"${VERSION}"}
I0116 19:21:29.091279 1 connection.go:184] GRPC error: <nil>
I0116 19:21:29.091326 1 main.go:137] CSI driver name: "csi.vsphere.vmware.com"
I0116 19:21:29.091400 1 node_register.go:54] Starting Registration Server at: /registration/csi.vsphere.vmware.com-reg.sock
I0116 19:21:29.091476 1 node_register.go:61] Registration Server started at: /registration/csi.vsphere.vmware.com-reg.sock

Kubelet logs attached. I did sanitize IP and DNS stuff....

kubelet-worker.log kubelet-master.log

xing-yang commented 4 years ago

Yes, I meant changing the -v value to 4 for the vSphere CSI driver and collecting the logs from the vsphere-csi-node container:

kubectl logs vsphere-csi-node-mjljq --namespace=kube-system vsphere-csi-node

mattrcampbell commented 4 years ago

Ok, that is what I did, the log is the same.

xing-yang commented 4 years ago

The logs you showed earlier are from node-driver-registrar though, not vsphere-csi-node. Can you double-check?

mattrcampbell commented 4 years ago

Yes, that log did not change:

I0116 19:21:28.901802 1 service.go:88] configured: csi.vsphere.vmware.com with map[mode:node]
time="2020-01-16T19:21:28Z" level=info msg="identity service registered"
time="2020-01-16T19:21:28Z" level=info msg="node service registered"
time="2020-01-16T19:21:28Z" level=info msg=serving endpoint="unix:///csi/csi.sock"

xing-yang commented 4 years ago

Can you provide the deployment yaml file for vSphere driver?

mattrcampbell commented 4 years ago

> kubectl get csidriver.storage.k8s.io/csi.vsphere.vmware.com -o yaml
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"storage.k8s.io/v1beta1","kind":"CSIDriver","metadata":{"annotations":{},"name":"csi.vsphere.vmware.com"},"spec":{"attachRequired":true,"podInfoOnMount":false}}
  creationTimestamp: "2020-01-15T17:31:42Z"
  name: csi.vsphere.vmware.com
  resourceVersion: "2170"
  selfLink: /apis/storage.k8s.io/v1beta1/csidrivers/csi.vsphere.vmware.com
  uid: b0c0f1c3-885d-4664-8a91-5b12c1b7054b
spec:
  attachRequired: true
  podInfoOnMount: false
  volumeLifecycleModes:
  - Persistent

xing-yang commented 4 years ago

I'm referring to the yaml files you used to start the vSphere CSI driver, i.e., the yaml file where you modified the -v setting. The example yaml files are here: https://github.com/kubernetes-sigs/vsphere-csi-driver/tree/master/manifests/1.14

Also, after you changed the debug setting, did you restart the vSphere CSI driver? I wonder why the logging level was not changed after the debug setting change.
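
One way to make that change is to edit the node DaemonSet and bump the driver's verbosity; a condensed sketch of the relevant args (apply with kubectl -n kube-system edit daemonset vsphere-csi-node or by editing the yaml and re-applying):

containers:
  - name: vsphere-csi-node
    image: gcr.io/cloud-provider-vsphere/csi/release/driver:v1.0.1
    args:
      - "--v=4"   # raise driver log verbosity for debugging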

mattrcampbell commented 4 years ago

I used the exact YAML provided in the instructions:

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/1.14/deploy/vsphere-csi-controller-ss.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/master/manifests/1.14/deploy/vsphere-csi-node-ds.yaml

The only changes I made post-deploy were to adjust the tolerations for the Rancher taints and to change the -v value as requested. Yes, I redeployed the pods after editing the yaml.

xing-yang commented 4 years ago

Did you delete the vSphere CSI driver pods using "kubectl delete -f ..." and then create them again with -v set to 4? The logging level is still Info ("level=info") as shown in the logs. Make sure you are changing the debug level for the CSI driver container.
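
For completeness, a minimal redeploy cycle, assuming locally saved copies of the manifests with the -v change applied:

kubectl delete -f vsphere-csi-node-ds.yaml
kubectl apply -f vsphere-csi-node-ds.yaml
# or, on Kubernetes 1.15+, restart in place after editing the DaemonSet:
kubectl -n kube-system rollout restart daemonset vsphere-csi-node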

mattrcampbell commented 4 years ago

Yes, and I just did it again to make sure. The node pod appears to be running with --v=5 according to kubectl describe, yet the log still says level=info...

$ kubectl -n kube-system describe daemonset vsphere-csi-node
Name:           vsphere-csi-node
Selector:       app=vsphere-csi-node
Node-Selector:  <none>
Labels:         <none>
Annotations:    deprecated.daemonset.template.generation: 1
Desired Number of Nodes Scheduled: 1
Current Number of Nodes Scheduled: 1
Number of Nodes Scheduled with Up-to-date Pods: 1
Number of Nodes Scheduled with Available Pods: 1
Number of Nodes Misscheduled: 0
Pods Status:  1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=vsphere-csi-node
           role=vsphere-csi
  Containers:
   node-driver-registrar:
    Image:      quay.io/k8scsi/csi-node-driver-registrar:v1.1.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --v=5
      --csi-address=$(ADDRESS)
      --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
    Environment:
      ADDRESS:               /csi/csi.sock
      DRIVER_REG_SOCK_PATH:  /var/lib/kubelet/plugins_registry/csi.vsphere.vmware.com/csi.sock
    Mounts:
      /csi from plugin-dir (rw)
      /registration from registration-dir (rw)
   vsphere-csi-node:
    Image:      gcr.io/cloud-provider-vsphere/csi/release/driver:v1.0.1
    Port:       9808/TCP
    Host Port:  0/TCP
    Args:
      --v=5
    Liveness:  http-get http://:healthz/healthz delay=10s timeout=3s period=5s #success=1 #failure=3
    Environment:
      NODE_NAME:                   (v1:spec.nodeName)
      CSI_ENDPOINT:               unix:///csi/csi.sock
      X_CSI_MODE:                 node
      X_CSI_SPEC_REQ_VALIDATION:  false
      VSPHERE_CSI_CONFIG:         /etc/cloud/csi-vsphere.conf
    Mounts:
      /csi from plugin-dir (rw)
      /dev from device-dir (rw)
      /etc/cloud from vsphere-config-volume (ro)
      /var/lib/kubelet from pods-mount-dir (rw)
   liveness-probe:
    Image:      quay.io/k8scsi/livenessprobe:v1.1.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --csi-address=$(ADDRESS)
    Environment:
      ADDRESS:  /csi/csi.sock
    Mounts:
      /csi from plugin-dir (rw)
  Volumes:
   vsphere-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vsphere-config-secret
    Optional:    false
   registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry
    HostPathType:  DirectoryOrCreate
   plugin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry/csi.vsphere.vmware.com
    HostPathType:  DirectoryOrCreate
   pods-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet
    HostPathType:  Directory
   device-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  
Events:
  Type    Reason            Age   From                  Message
  ----    ------            ----  ----                  -------
  Normal  SuccessfulCreate  4m    daemonset-controller  Created pod: vsphere-csi-node-h2f7v

$ kubectl logs vsphere-csi-node-h2f7v --namespace=kube-system vsphere-csi-node
I0117 21:29:20.119600       1 service.go:88] configured: csi.vsphere.vmware.com with map[mode:node]
time="2020-01-17T21:29:20Z" level=info msg="identity service registered"
time="2020-01-17T21:29:20Z" level=info msg="node service registered"
time="2020-01-17T21:29:20Z" level=info msg=serving endpoint="unix:///csi/csi.sock"

mattrcampbell commented 4 years ago

Interesting. Upgrading Rancher to v2.3.4 and Kubernetes to 1.17.0 results in the CSINodes being created. However, the Drivers section is empty:

Name:         XXX
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  storage.k8s.io/v1
Kind:         CSINode
Metadata:
  Creation Timestamp:  2020-01-22T14:41:32Z
  Owner References:
    API Version:     v1
    Kind:            Node
    Name:            XXX
    UID:             c3dbd1aa-e3f2-4655-8273-aaaada208a5e
  Resource Version:  1175261
  Self Link:         /apis/storage.k8s.io/v1/csinodes/XXX
  UID:               a401f00d-9ea2-48ba-8065-247ed25d021e
Spec:
  Drivers:  <nil>
Events:     <none>

There is now more info in the node-driver-registrar logs, but still no errors:

I0122 17:33:52.653438 1 main.go:110] Version: v1.1.0-0-g80a94421
I0122 17:33:52.653485 1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0122 17:33:52.653500 1 connection.go:151] Connecting to unix:///csi/csi.sock
I0122 17:33:54.698479 1 main.go:127] Calling CSI driver to discover driver name
I0122 17:33:54.698516 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0122 17:33:54.698521 1 connection.go:181] GRPC request: {}
I0122 17:33:54.714273 1 connection.go:183] GRPC response: {"name":"csi.vsphere.vmware.com","vendor_version":"${VERSION}"}
I0122 17:33:54.714659 1 connection.go:184] GRPC error: <nil>
I0122 17:33:54.714665 1 main.go:137] CSI driver name: "csi.vsphere.vmware.com"
I0122 17:33:54.721681 1 node_register.go:54] Starting Registration Server at: /registration/csi.vsphere.vmware.com-reg.sock
I0122 17:33:54.721800 1 node_register.go:61] Registration Server started at: /registration/csi.vsphere.vmware.com-reg.sock

xing-yang commented 4 years ago

Is this on the Kubernetes master node or worker node?

mattrcampbell commented 4 years ago

Both nodes return "nil" in the Drivers: section, worker AND master.

mattrcampbell commented 4 years ago

So I built up a cluster using the full instructions from my original ticket above and found an interesting difference between the kubeadm cluster and the RKE cluster. The RKE cluster does not deploy kubelet, etcd, and kube-apiserver as pods that are returned by a "kubectl get pods" command; they can be found with a "docker ps" on the node. No idea if this makes any significant difference or not....

deejay104 commented 4 years ago

Hello,

I am facing the same issue. For testing I have deployed a fresh new Rancher cluster. Following the procedure, everything went well and I got the CSI driver registered (kubectl describe csinode was showing the driver in the spec). Unfortunately, a few hours later it disappeared and I am not able to get it added back. kubectl get csinode is still showing the node of the cluster.

I have this issue on all of our clusters.

Thanks if someone can help

mattrcampbell commented 4 years ago

I have moved a little closer again. RKE starts kubelet with a different root:

--root-dir=/opt/rke/var/lib/kubelet
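
As an aside, one way to confirm the kubelet root directory on an RKE node (assuming the RKE-managed kubelet container is named kubelet):

docker inspect kubelet | grep -- --root-dir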

So I updated the volumes in vsphere-csi-node-ds.yaml to:

      volumes:
        - name: vsphere-config-volume
          secret:
            secretName: vsphere-config-secret
        - name: registration-dir
          hostPath:
            path: /opt/rke/var/lib/kubelet/plugins_registry
            type: DirectoryOrCreate
        - name: plugin-dir
          hostPath:
            path: /opt/rke/var/lib/kubelet/plugins_registry/csi.vsphere.vmware.com
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: /opt/rke/var/lib/kubelet
            type: Directory
        - name: device-dir
          hostPath:
            path: /dev

Now the driver registers in the CSINode:

Name:         XXX
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  storage.k8s.io/v1
Kind:         CSINode
Metadata:
  Creation Timestamp:  2020-01-28T16:27:38Z
  Owner References:
    API Version:     v1
    Kind:            Node
    Name:            XXX
    UID:             3d2994a7-27f4-4061-b928-64b9959f11de
  Resource Version:  44476
  Self Link:         /apis/storage.k8s.io/v1/csinodes/clawsdev2b
  UID:               d7a45ee6-1798-40d2-9b7f-e8a267a0365c
Spec:
  Drivers:
    Name:           csi.vsphere.vmware.com
    Node ID:        XXX
    Topology Keys:  <nil>
Events:             <none>

However, the attacher is failing:

E0128 18:09:27.172799   29585 csi_attacher.go:270] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = FailedPrecondition desc = target: /opt/rke/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-1be01c41-8ca6-4824-958c-562f7fb25176/globalmount not pre-created

xing-yang commented 4 years ago

Do you see a directory "/opt/rke/var/lib/kubelet/plugins" created?

xing-yang commented 4 years ago

Can you take a look of the controller manager logs, search for “attacher.MountDevice failed to create dir” or “created target path successfully”? It should tell you whether the path is created successfully or not.
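
A sketch of that search; the pod name is a placeholder, and on RKE the kube-controller-manager runs as a Docker container, so docker logs kube-controller-manager would be the rough equivalent:

kubectl -n kube-system logs <kube-controller-manager-pod> | grep -E 'attacher.MountDevice failed to create dir|created target path successfully'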

deejay104 commented 4 years ago

I am not as lucky as you: if I change the path, the pods do not start, saying that they could not register. Reverting back to /var/lib/kubelet, they start correctly, but the CSINode driver entry is still empty. (FYI, when I change it, /opt/rke/var/lib/kubelet/plugins_registry is populated with the csi.sock file.)

mattrcampbell commented 4 years ago

Yes, the /opt/rke/var/lib/kubelet/plugins directory is getting created for me. The Cloud Controller Manager logs do not contain either of those strings:

$ kubectl -n kube-system logs vsphere-cloud-controller-manager-z4gvr vsphere-cloud-controller-manager | egrep "attacher|created"

deejay104 commented 4 years ago

Well... After giving it another try by replacing all /var/lib/kubelet paths with /opt/rke/var/lib/kubelet, it worked! The driver is now added to the CSINode and volumes are mounting to pods. I have also applied the updated yaml to the other clusters and it worked there as well. :-)

mattrcampbell commented 4 years ago

> Well... After giving it another try by replacing all /var/lib/kubelet paths with /opt/rke/var/lib/kubelet, it worked! The driver is now added to the CSINode and volumes are mounting to pods. I have also applied the updated yaml to the other clusters and it worked there as well. :-)

When you say "all" do you literally mean every instance of in both vsphere-csi-controller-ss.yaml and vsphere-csi-node-ds.yaml, including in the "env" variables and the local volumeMounts:mountPath?

deejay104 commented 4 years ago

"all" means all entry in vsphere-csi-node-ds.yaml including env, command and volumes.

mattrcampbell commented 4 years ago

"all" means all entry in vsphere-csi-node-ds.yaml including env, command and volumes.

Yup, I found I had missed one (the csi-node volumeMount for pods-mount-dir), and after adding that it appears to be working. I will do some testing and close the issue if that is indeed the case.
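
To summarize, on RKE every kubelet path in the node DaemonSet has to move under /opt/rke: not only the hostPath volumes shown above, but also the registration socket env and the in-container mount of the pods directory. A condensed sketch of those fields (names follow the example manifests; exact layout may differ by version):

env:
  - name: DRIVER_REG_SOCK_PATH
    value: /opt/rke/var/lib/kubelet/plugins_registry/csi.vsphere.vmware.com/csi.sock
volumeMounts:
  - name: pods-mount-dir
    mountPath: /opt/rke/var/lib/kubelet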

deejay104 commented 4 years ago

On my side, I just did a few tests and volumes are creating and attaching correctly. I can also see them in the vCenter. Thanks for your help.

chethanv28 commented 4 years ago

/assign @chethanv28

mattrcampbell commented 4 years ago

I can confirm that the issue appears to be the path. Rancher uses /opt/rke/var/lib/kubelet. Once that was fixed everything seems to be working great. This issue can be closed, though a note in the documentation might be handy. :)

xing-yang commented 4 years ago

Thanks @mattrcampbell! I changed the label to documentation.

misterikkit commented 4 years ago

I believe kubelet is responsible for creating CSINode objects.

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/nodeinfomanager/nodeinfomanager.go

bkcsfi commented 4 years ago

Hi,

I am attempting to use the vSphere CSI and CPI on a single-node k3s v1.17.4 cluster.

Everything works up to the point of attaching the PV to the node. The PVC and PV are created.

It looks like the manifests were changed to use -v 4 already. Can you please suggest where I can look to see why CSINode doesn't seem to be working?

I did have to explicitly pass --provider-id to kubelet; I'm not sure if that matters.
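
A quick way to confirm what provider ID the node ended up with (it also appears in the node description further down):

kubectl get node dev01 -o jsonpath='{.spec.providerID}'
# expected form: vsphere://<vm-uuid>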

root@dev01:/# kc get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
testclaim   Bound    pvc-5077a6f5-ca41-4b8b-8ee0-b886ac9f23f1   100Mi      RWO            volk1-sc       179m
root@dev01:/# kc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-5077a6f5-ca41-4b8b-8ee0-b886ac9f23f1   100Mi      RWO            Delete           Bound    default/testclaim   volk1-sc                179m

But attachment fails

csi-attacher I0413 00:48:40.762698       1 csi_handler.go:97] Error processing "csi-fdcaaa11c136b7b5f10a6ab98ef574bf6a635cd6f404b3faaba7fa4c391fac17": failed to attach: node "dev01" has no NodeID annotation 
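
The external-attacher reads the node ID from an annotation that kubelet writes during CSI driver registration (the key used by kubelet's nodeinfomanager is csi.volume.kubernetes.io/nodeid); a sketch of how to check whether it is present, using the same node name as above:

kubectl get node dev01 -o yaml | grep -A 2 'csi.volume.kubernetes.io/nodeid'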

csidriver info

root@dev01:/# kc describe csidrivers
Name:         csi.vsphere.vmware.com
Namespace:    
Labels:       <none>
Annotations:  API Version:  storage.k8s.io/v1beta1
Kind:         CSIDriver
Metadata:
  Creation Timestamp:  2020-04-12T21:48:24Z
  Resource Version:    3448
  Self Link:           /apis/storage.k8s.io/v1beta1/csidrivers/csi.vsphere.vmware.com
  UID:                 16b3d59a-1451-40d4-9d4f-bd35f40d2622
Spec:
  Attach Required:    true
  Pod Info On Mount:  false
  Volume Lifecycle Modes:
    Persistent
Events:  <none>

but the CSINode looks incomplete

root@dev01:/# kc describe csinodes
Name:               dev01
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Sun, 12 Apr 2020 17:02:29 -0400
Spec:
Events:  <none>

Similar to the OP, the csi-node-driver-registrar seems to start, but does nothing after that:

node-driver-registrar I0412 22:36:32.904075       1 main.go:110] Version: v1.1.0-0-g80a94421                                                             
node-driver-registrar I0412 22:36:32.904146       1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"                              
node-driver-registrar I0412 22:36:32.904163       1 connection.go:151] Connecting to unix:///csi/csi.sock                                                
node-driver-registrar I0412 22:36:32.904930       1 main.go:127] Calling CSI driver to discover driver name                                              
node-driver-registrar I0412 22:36:32.904962       1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo                                         
node-driver-registrar I0412 22:36:32.904970       1 connection.go:181] GRPC request: {}                                                                  
node-driver-registrar I0412 22:36:32.907007       1 connection.go:183] GRPC response: {"name":"csi.vsphere.vmware.com","vendor_version":"${VERSION}"}    
node-driver-registrar I0412 22:36:32.907945       1 connection.go:184] GRPC error: <nil>                                                                 
node-driver-registrar I0412 22:36:32.907954       1 main.go:137] CSI driver name: "csi.vsphere.vmware.com"                                               
node-driver-registrar I0412 22:36:32.908017       1 node_register.go:54] Starting Registration Server at: /registration/csi.vsphere.vmware.com-reg.sock  
node-driver-registrar I0412 22:36:32.908116       1 node_register.go:61] Registration Server started at: /registration/csi.vsphere.vmware.com-reg.sock                                                                                                                                                        

It seems like there's a lot of outdated documentation in many places for vSphere CSI and CPI setup.

I am following docs from https://github.com/kubernetes/cloud-provider-vsphere/blob/master/docs/book/README.md and manifests from https://github.com/kubernetes-sigs/vsphere-csi-driver/tree/master/manifests/vsphere-67u3/vanilla

pods

NAME                                      READY   STATUS      RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
vsphere-cloud-controller-manager-qfcg2    1/1     Running     0          4h24m   10.1.250.63   dev01   <none>           <none>
local-path-provisioner-58fb86bdfd-5psqk   1/1     Running     0          4h31m   10.42.0.3     dev01   <none>           <none>
metrics-server-6d684c7b5-jzf5v            1/1     Running     0          4h31m   10.42.0.5     dev01   <none>           <none>
helm-install-traefik-w2gs8                0/1     Completed   0          4h31m   10.42.0.2     dev01   <none>           <none>
svclb-traefik-w7vxw                       2/2     Running     0          4h24m   10.42.0.7     dev01   <none>           <none>
coredns-6c6bb68b64-d2txg                  1/1     Running     0          4h31m   10.42.0.4     dev01   <none>           <none>
traefik-7b8b884c8-9rnrp                   1/1     Running     0          4h24m   10.42.0.6     dev01   <none>           <none>
vsphere-csi-controller-0                  5/5     Running     0          3h45m   10.42.0.8     dev01   <none>           <none>
vsphere-csi-node-kzgnq                    3/3     Running     1          3h45m   10.42.0.9     dev01   <none>           <none>

node

root@dev01:/# kc describe node                                                                                                                                                      [34/16343]
Name:               dev01
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=vsphere-vm.cpu-8.mem-16gb.os-ubuntu
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=dev01
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=true
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"92:22:de:3d:6c:39"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.1.250.63
                    k3s.io/node-args:
                      ["server","--data-dir","/opt/k3s","--docker","--disable-cloud-controller","--kubelet-arg","cloud-provider=external","--kubelet-arg","provi...
                    k3s.io/node-config-hash: R2E7DPCB4GFB47UUF5B4OH4Q5YQTKFD4UCFOZYLB7RDSG4UJWYXQ====
                    k3s.io/node-env: {"K3S_DATA_DIR":"/opt/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 12 Apr 2020 17:02:29 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  dev01
  AcquireTime:     <unset>
  RenewTime:       Sun, 12 Apr 2020 21:34:58 -0400
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Sun, 12 Apr 2020 17:02:41 -0400   Sun, 12 Apr 2020 17:02:41 -0400   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Sun, 12 Apr 2020 21:31:22 -0400   Sun, 12 Apr 2020 17:02:28 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Sun, 12 Apr 2020 21:31:22 -0400   Sun, 12 Apr 2020 17:02:28 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Sun, 12 Apr 2020 21:31:22 -0400   Sun, 12 Apr 2020 17:02:28 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Sun, 12 Apr 2020 21:31:22 -0400   Sun, 12 Apr 2020 17:02:39 -0400   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  Hostname:    dev01
  ExternalIP:  10.1.250.63
  InternalIP:  10.1.250.63
Capacity:
  cpu:                8
  ephemeral-storage:  7199560Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16424924Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  7003731963
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16424924Ki
  pods:               110
System Info:
  Machine ID:                 bfdf980d8e1b41239022cc7b06d8716a
  System UUID:                00833F42-B602-40CD-C014-0666F4C462D3
  Boot ID:                    23577ad1-a7d8-453f-b208-2e8245e746f3
  Kernel Version:             4.15.0-96-generic
  OS Image:                   Ubuntu 18.04.4 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.8
  Kubelet Version:            v1.17.4+k3s1
  Kube-Proxy Version:         v1.17.4+k3s1
PodCIDR:                      10.42.0.0/24
PodCIDRs:                     10.42.0.0/24
ProviderID:                   vsphere://423f8300-02b6-cd40-c014-0666f4c462d3
Non-terminated Pods:          (9 in total)
  Namespace                   Name                                       CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                       ------------  ----------  ---------------  -------------  ---
  kube-system                 vsphere-cloud-controller-manager-qfcg2     200m (2%)     0 (0%)      0 (0%)           0 (0%)         4h26m
  kube-system                 local-path-provisioner-58fb86bdfd-5psqk    0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h32m
  kube-system                 metrics-server-6d684c7b5-jzf5v             0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h32m
  kube-system                 svclb-traefik-w7vxw                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h25m
  kube-system                 coredns-6c6bb68b64-d2txg                   100m (1%)     0 (0%)      70Mi (0%)        170Mi (1%)     4h32m
  kube-system                 traefik-7b8b884c8-9rnrp                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h25m
  kube-system                 vsphere-csi-controller-0                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h46m
  default                     busybox                                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h9m
  kube-system                 vsphere-csi-node-kzgnq                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h46m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                300m (3%)  0 (0%)
  memory             70Mi (0%)  170Mi (1%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:              <none>

shalini-b commented 4 years ago

/assign @xing-yang

benjaminguttmann-avtq commented 4 years ago

Hi there, I am facing a similar issue to the one described here, where the CSINode is not getting created:

kubectl get CSINode
No resources found.

Everything else looks fine so far. I am wondering if there is a minimum Kubernetes version required for this to work. We are currently using v1.15.10. Does anyone know if it only works with v1.17.0 or higher?

In addition, the log level does not change, as was already observed earlier in this thread. @xing-yang

benjaminguttmann-avtq commented 4 years ago

Just for documentation purposes: the issue was also caused by wrong paths, which means adjusting the paths will fix the issue of CSINodes not getting created.

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 3 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 3 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/124#issuecomment-715323422):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.