answear / kube-vcloud-flexvolume

VMware Cloud Director flexVolume driver for Kubernetes

Timeout is_disk_connected = wait_for_connected_disk(600) #16

Open s4kro opened 5 years ago

s4kro commented 5 years ago

Greetings!

I have a problem with attach.py and wait_for_connected_disk. It looks like the monitor created with pyudev.Monitor.from_netlink never sees the udev event for the VM, so the call times out here:

context = pyudev.Context()
monitor = pyudev.Monitor.from_netlink(context)
monitor.filter_by(subsystem='block', device_type='disk')

Can you also show me what a typical output looks like here, e.g. via print(result)?

result = []
for device in iter(partial(monitor.poll, timeout), None):
    if device.action == 'add':
        result = [device.device_node, 'connected']
        break
    elif device.action == 'remove':
        result = [device.device_node, 'disconnected']
        break
return result
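For reference, the shape of result can be seen with a minimal stand-in for pyudev's monitor. FakeMonitor and FakeDevice below are hypothetical names used only for illustration; the loop itself is the one from attach.py, where pyudev's Monitor.poll() returns a device object, or None on timeout:

```python
from functools import partial

class FakeDevice:
    # Mimics the two attributes of pyudev.Device that the loop reads
    def __init__(self, action, device_node):
        self.action = action
        self.device_node = device_node

class FakeMonitor:
    # Mimics pyudev.Monitor.poll(): returns the next event, or None when
    # there are no more events (which is what a timeout looks like)
    def __init__(self, events):
        self._events = iter(events)
    def poll(self, timeout=None):
        return next(self._events, None)

def wait_for_connected_disk(monitor, timeout):
    result = []
    # iter(partial(...), None) keeps calling poll(timeout) until it returns None
    for device in iter(partial(monitor.poll, timeout), None):
        if device.action == 'add':
            result = [device.device_node, 'connected']
            break
        elif device.action == 'remove':
            result = [device.device_node, 'disconnected']
            break
    return result

events = [FakeDevice('change', '/dev/sdd'), FakeDevice('add', '/dev/sdd')]
print(wait_for_connected_disk(FakeMonitor(events), 600))
# → ['/dev/sdd', 'connected']
```

So on success result is ['/dev/sdX', 'connected']; if the monitor never sees an add or remove event before the timeout, result stays an empty list.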

I tried increasing the timeout to 1000 s, but it still doesn't work; I'm not really sure how pyudev's monitor works. The rest works as intended: I can see the app create the disk and attach/detach it to the VM, and I can also see the disk via fdisk -l:

Disk /dev/sdd: 1073 MB, 1073741824 bytes, 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

But the script can't continue, because it hits the timeout in wait_for_connected_disk. Maybe this method needs to be reworked?

dzolnierz commented 5 years ago

Hi,

are you attaching volumes by invoking vcloud-flexvolume directly or via Kubernetes? If the latter, are your kubelets running with --enable-controller-attach-detach=false? If not, the controller tries to attach the disk to a random node (successfully), and the whole pyudev logic also runs on the controller. That can never succeed, because the kernel emits udev events on the target node where the disk was attached, not on the controller.

The same applies to invoking vcloud-flexvolume directly: it should be invoked on the very same node you want to attach the disk to.
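To make "invoking directly" concrete: Kubernetes FlexVolume drivers are called with the sub-command, a JSON options blob, and (for attach) the node name. The path, options, and node name below are illustrative, not taken from this cluster; only the call convention is from the FlexVolume spec:

```python
import json

# Hypothetical direct invocation of the driver binary on the target node,
# following the FlexVolume CLI convention: <driver> attach <json-options> <node-name>
driver = ("/usr/libexec/kubernetes/kubelet-plugins/volume/exec/"
          "answear.com~vcloud/vcloud")
options = {"volumeName": "testdisk", "size": "1Gi", "fsType": "ext4"}
cmd = [driver, "attach", json.dumps(options), "dev-test-worker-1"]

# You would pass cmd to subprocess.run() on the worker node itself;
# here we just show the command's shape:
print(cmd[1], json.loads(cmd[2])["volumeName"])
# → attach testdisk
```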

s4kro commented 5 years ago

Thanks for the reply! I tried invoking flexvolume both ways. Should this option (--enable-controller-attach-detach=false) be set only on the kubemaster (that's what I did), or on the kubeworkers too? If I need to attach a disk to kubeworker A, do I need to execute vcloud-flexvolume attach directly on that node? If so, do I need to install flexvolume on all kubeworkers? I can't invoke the script on the workers, because etcd runs only on the master and the script tries to connect there; maybe I should create a NodePort service for etcd? I also tried to mock is_disk_connected with some random value, but without result, because I don't know what the output of the code below looks like.

result = []
for device in iter(partial(monitor.poll, timeout), None):
    if device.action == 'add':
        result = [device.device_node, 'connected']
        break
    elif device.action == 'remove':
        result = [device.device_node, 'disconnected']
        break
return result
s4kro commented 5 years ago

I fixed attach.py here: cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name) — without the "-n" it works. Now back to k8s: after deploying the example nginx pod I got

Unable to mount volumes for pod "nginx-vcloud_default(fa012f2f-be98-11e9-8a60-005056011de7)": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-vcloud". list of unmounted volumes=[testdisk]. list of unattached volumes=[testdisk default-token-qbft4]

I also got this in the kubelet log:

desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "testdisk" err=no volume plugin matched

Manifest:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-vcloud
  namespace: default
spec:
  containers:
  - name: nginx-vcloud
    image: nginx
    volumeMounts:
    - name: testdisk
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: testdisk
    flexVolume:
      #driver: "sysoperator.pl/vcloud"
      driver: "answear.com/vcloud"
      fsType: "ext4"
      options:
        volumeName: "testdisk2"
        size: "1Gi"
        storage: "DC1-Kv-VSP-02-High"
        busType: "6"
        busSubType: "VirtualSCSI"
        mountoptions: "relatime,nobarrier"
dzolnierz commented 5 years ago

If I need to attach a disk to kubeworker A, do I need to execute vcloud-flexvolume attach directly on that node?

Yes.

If so, do I need to install flexvolume on all kubeworkers?

Yes.

I can't invoke the script on the workers, because etcd runs only on the master and the script tries to connect there; maybe I should create a NodePort service for etcd?

The driver needs access to etcd for locking to work properly. It does not have to be the cluster instance.

cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name)

What Linux distribution do you use? Does echo come from the coreutils package? Show me the output of echo --help and lsb_release -a.

Also i got at kubelet log: desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "testdisk" err=no volume plugin matched

Has the driver been installed correctly on every node? Has the kubelet process been restarted after installing the driver? Show me ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ on the node where the error comes from.
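As background on why this directory matters: the kubelet maps a FlexVolume driver name of the form vendor/driver to a plugin directory named vendor~driver containing an executable with the driver's name. A small sketch of that documented mapping (the helper function name is mine, not part of the driver):

```python
def flexvolume_plugin_path(driver,
                           base="/usr/libexec/kubernetes/kubelet-plugins/volume/exec"):
    # "vendor/driver" in the pod spec maps to the directory "vendor~driver",
    # which must contain an executable named after the driver part
    vendor, _, name = driver.partition("/")
    return "%s/%s~%s/%s" % (base, vendor, name, name)

print(flexvolume_plugin_path("answear.com/vcloud"))
# → /usr/libexec/kubernetes/kubelet-plugins/volume/exec/answear.com~vcloud/vcloud
```

So a "no volume plugin matched" error usually means the driver name in the manifest and the vendor~driver directory (or the executable inside it) don't line up on that node, or the kubelet hasn't re-scanned the directory.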

s4kro commented 5 years ago
[root@dev-test-worker-1 examples]# ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
total 0
drwxr-xr-x 3 root root 32 Aug 12 18:52 .
drwxr-xr-x 3 root root 18 Aug 12 18:52 ..
drwxr-xr-x 2 root root 20 Aug 12 18:59 answear.com~vcloud

It's CentOS 7.4. cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name) — I fixed this one like this, and it works: cmd_create_partition = ("echo ',,83;' | sfdisk %s") % (device_name). I still have k8s troubles and can't deploy a pod with a volume. The k8s version is 1.12.
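For reference on what dropping -n changes: with bash's builtin echo, -n only suppresses the trailing newline; on shells whose echo does not support -n, the flag would instead be printed literally into sfdisk's input. The byte-level difference can be checked like this (assuming bash is available; the sfdisk part is omitted so nothing is written to a disk):

```python
import subprocess

# What sfdisk actually receives on stdin in each variant of the pipeline
with_n = subprocess.run(["bash", "-c", "echo -n ',,83;'"],
                        capture_output=True).stdout
without_n = subprocess.run(["bash", "-c", "echo ',,83;'"],
                           capture_output=True).stdout
print(with_n, without_n)
# → b',,83;' b',,83;\n'
```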

The driver needs access to etcd for locking to work properly. It does not have to be the cluster instance.

Okay, got it. But without reachable etcd, flexvolume won't work correctly. I tried to connect via a NodePort, but with no result.

dzolnierz commented 5 years ago
[root@dev-test-worker-1 examples]# ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
total 0
drwxr-xr-x 3 root root 32 Aug 12 18:52 .
drwxr-xr-x 3 root root 18 Aug 12 18:52 ..
drwxr-xr-x 2 root root 20 Aug 12 18:59 answear.com~vcloud

Seems ok.

It's CentOS 7.4. cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name) — I fixed this one like this, and it works: cmd_create_partition = ("echo ',,83;' | sfdisk %s") % (device_name).

This was fixed in #18.

The driver needs access to etcd for locking to work properly. It does not have to be the cluster instance.

Okay, got it. But without reachable etcd, flexvolume won't work correctly. I tried to connect via a NodePort, but with no result.

Could you post the whole log from the node's kubelet trying to attach the volume, e.g. on gist.github.com?