vmware-archive / vsphere-storage-for-docker

vSphere Storage for Docker
https://vmware.github.io/vsphere-storage-for-docker
Apache License 2.0

clone volume fails VolumeDriver.Create: Server returned an error: TypeError("'NoneType' object does not support item assignment",) #2074

Closed carcampbell closed 6 years ago

carcampbell commented 6 years ago

Trying to clone a volume and it fails.
Already have the latest version of docker plugin:

    docker plugin install --grant-all-permissions --alias vsphere vmware/vsphere-storage-for-docker:latest
    Error response from daemon: plugin "vsphere:latest" already exists

Just installed latest VIB Bundle:

    [root@esxi-cc1:~] esxcli storage guestvol status
    === Service:
    Version: 0.21.c420818-0.0.1
    Status: Running
    Pid: 1852563
    Port: 1019
    LogConfigFile: /etc/vmware/vmdkops/log_config.json
    LogFile: /var/log/vmware/vmdk_ops.log
    LogLevel: INFO
    === Authorization Config DB:
    DB_Mode: NotConfigured (no local DB, no symlink to shared DB)
    DB_SharedLocation: N/A
    DB_LocalPath: N/A

Commands:

    docker volume create --driver=vsphere --name=MyNewVolume -o size=10gb -o attach-as=persistent
    MyNewVolume

    docker volume create --driver=vsphere --name=CloneNewVolume -o clone-from=MyNewVolume -o access=read-only
    Error response from daemon: create CloneNewVolume: VolumeDriver.Create: Server returned an error: TypeError("'NoneType' object does not support item assignment",)

ESXi 6.5 Error log:

    03/12/18 17:11:06 2034114 [cc-worker02-CC-Large-Datastore._DEFAULT.CloneNewVolume] [INFO ] executeRequest 'get' completed with ret={'Error': 'Volume
    03/12/18 17:11:06 2034114 [MainThread] [INFO ] Started new thread : 150138259200 with target <function execRequestThread at 0x22f2efdbf8> and args (9
    03/12/18 17:11:06 2034114 [Thread-4143] [INFO ] Auth DB /etc/vmware/vmdkops/auth-db is missing, allowing all access
    03/12/18 17:11:06 2034114 [cc-worker02-CC-Large-Datastore._DEFAULT.CloneNewVolume] [INFO ] createVMDK: /vmfs/volumes/CC-Large-Datastore/dockvols/
    03/12/18 17:11:06 2034114 [cc-worker02-CC-Large-Datastore._DEFAULT.CloneNewVolume] [INFO ] cloneVMDK: /vmfs/volumes/CC-Large-Datastore/dockvols/_
    03/12/18 17:11:06 2034114 [cc-worker02-CC-Large-Datastore._DEFAULT.CloneNewVolume] [ERROR ] Failed to access b'/vmfs/volumes/CC-Large-Datastore/dockvo
    Traceback (most recent call last):
      File "/usr/lib/vmware/vmdkops/Python/kvESX.py", line 297, in load
        with open(meta_file, "r") as fh:
    FileNotFoundError: [Errno 2] No such file or directory: b'/vmfs/volumes/CC-Large-Datastore/dockvols/_DEFAULT/CloneNewVolume-508e4af18efbf23e.vmfd'
    03/12/18 17:11:06 2034114 [cc-worker02-CC-Large-Datastore._DEFAULT.CloneNewVolume] [ERROR ] Unhandled Exception:
    Traceback (most recent call last):
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 1843, in execRequestThread
        opts=opts)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 1148, in executeRequest
        vm_datastore=vm_datastore)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 208, in createVMDK
        vm_datastore=vm_datastore)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 394, in cloneVMDK
        vol_meta[kv.CREATED_BY] = vm_name
    TypeError: 'NoneType' object does not support item assignment

carcampbell commented 6 years ago

I can see the volumes have been created but the CloneNewVolume can't be mounted as it gets VolumeDriver.Mount: EOF error:

    [root@cc-worker02 ~]# docker volume ls
    DRIVER              VOLUME NAME
    vsphere:latest      CloneNewVolume@CC-Large-Datastore
    vsphere:latest      MyNewVolume@CC-Large-Datastore

    docker run -it --rm -v CloneNewVolume:/tmp alpine sh -c "echo testingClone12345 > /tmp/mytest.out"
    docker: Error response from daemon: VolumeDriver.Mount: EOF.

carcampbell commented 6 years ago

vmdk_ops.log vsphere-storage-for-docker.log

govint commented 6 years ago

This is exactly the problem that was fixed in the 0.21.1 patch release. The patch addresses a regression introduced in VIB version 0.21.

@carcampbell, can you please update to this version - https://bintray.com/vmware/vDVS/download_file?file_path=VDVS_driver-0.21.1-offline_bundle-7812185.zip and let us know if this works for you? It includes the fix made in #2061.

carcampbell commented 6 years ago

I already did that. I am running VDVS_Driver-0.21.1 on all my ESXi hosts.

    [root@esxi-cc1:~] esxcli storage guestvol status
    === Service:
    Version: 0.21.c420818-0.0.1
    Status: Running
    Pid: 1852563
    Port: 1019
    LogConfigFile: /etc/vmware/vmdkops/log_config.json
    LogFile: /var/log/vmware/vmdk_ops.log

What I'm not clear about is whether there is another docker plugin file. If I try to install version 0.21.1, it says the manifest doesn't exist.

    docker plugin install --grant-all-permissions --alias vsphere vmware/vsphere-storage-for-docker:0.21.1
    Error response from daemon: manifest for vmware/vsphere-storage-for-docker:0.21.1 not found

govint commented 6 years ago

@carcampbell, OK, what version did you have before this? VIB version 0.21.1 fixed a regression in v0.21, so any volumes created with v0.21 will show this problem. Do any new volumes show this issue? Likely not. I'd suggest removing the volumes created with v0.21, because that's where the regression started.

carcampbell commented 6 years ago

I have deleted all the volumes and recreated new ones multiple times since I upgraded to 0.21.1

carcampbell commented 6 years ago

I tried it again this morning- a co-worker had said it failed for him but worked second try and I had the same experience. First request to clone fails, second request succeeds.

    [root@cc-worker01 ~]# docker volume ls
    DRIVER              VOLUME NAME
    vsphere:latest      prom_cc-db-data@CC-Large-Datastore
    local               ucp-auth-api-certs
    local               ucp-auth-store-certs
    local               ucp-auth-store-data
    local               ucp-auth-worker-certs
    local               ucp-auth-worker-data
    local               ucp-client-root-ca
    local               ucp-cluster-root-ca
    local               ucp-controller-client-certs
    local               ucp-controller-server-certs
    local               ucp-kv
    local               ucp-kv-certs
    local               ucp-metrics-data
    local               ucp-metrics-inventory
    local               ucp-node-certs
    [root@cc-worker01 ~]# docker volume create -d vsphere test1@CC-Large-Datastore -o size=2Gb
    test1@CC-Large-Datastore
    [root@cc-worker01 ~]# docker volume create -d vsphere clone@CC-Large-Datastore -o clone-from=test1@CC-Large-Datastore -o access=read-only
    Error response from daemon: create clone@CC-Large-Datastore: VolumeDriver.Create: Server returned an error: TypeError("'NoneType' object does not support item assignment",)
    [root@cc-worker01 ~]# docker volume create -d vsphere clone@CC-Large-Datastore -o clone-from=test1@CC-Large-Datastore -o access=read-only
    clone@CC-Large-Datastore
    [root@cc-worker01 ~]#

govint commented 6 years ago

@carcampbell, OK, let me check this out, because I'm able to clone with no problem on the same version.

govint commented 6 years ago

    docker volume create -d vsphere test1@sharedVmfs-0 -o size=512Mb
    test1@sharedVmfs-0

    docker volume inspect test1
    [
        {
            "Driver": "vsphere:latest",
            "Labels": null,
            "Mountpoint": "/mnt/vmdk/test1/",
            "Name": "test1",
            "Options": {},
            "Scope": "global",
            "Status": {
                "access": "read-write",
                "attach-as": "independent_persistent",
                "capacity": {
                    "allocated": "21MB",
                    "size": "512MB"
                },
                "clone-from": "None",
                "created": "Fri Mar 16 19:33:18 2018",
                "created by VM": "ubuntu-VM1.0",
                "datastore": "sharedVmfs-0",
                "diskformat": "thin",
                "fstype": "ext4",
                "status": "detached"
            }
        }
    ]

    docker volume create -d vsphere clone@sharedVmfs-0 -o clone-from=test1@sharedVmfs-0 -o access=read-only
    clone@sharedVmfs-0

    docker volume inspect clone
    [
        {
            "Driver": "vsphere:latest",
            "Labels": null,
            "Mountpoint": "/mnt/vmdk/clone/",
            "Name": "clone",
            "Options": {},
            "Scope": "global",
            "Status": {
                "access": "read-only",
                "attach-as": "independent_persistent",
                "capacity": {
                    "allocated": "21MB",
                    "size": "512MB"
                },
                "clone-from": "test1",
                "created": "Fri Mar 16 19:34:55 2018",
                "created by VM": "ubuntu-VM1.0",
                "datastore": "sharedVmfs-0",
                "diskformat": "thin",
                "fstype": "ext4",
                "status": "detached"
            }
        }
    ]

@carcampbell, in /usr/lib/vmware/vmdkops/Python/kvESX.py, can you confirm that the variable below has the value shown here?

    DVOL_KEY = "docker-volume-vsphere"

And if you are seeing the issue with the clone failing and then working can you please upload the ESX logs (vmdk_ops.log).

carcampbell commented 6 years ago

    # VSphere lib to access ESX proprietary APIs.
    DISK_LIB64 = "/lib64/libvmsnapshot.so"
    DISK_LIB = "/lib/libvmsnapshot.so"
    lib = None
    use_sidecar_create = False
    DVOL_KEY = "docker-volume-vsphere"

carcampbell commented 6 years ago

    [root@cc-worker01 ~]# docker volume create -d vsphere newtest@CC-Large-Datastore -o size=512Mb
    newtest@CC-Large-Datastore
    [root@cc-worker01 ~]# docker volume inspect newtest
    [
        {
            "Driver": "vsphere:latest",
            "Labels": null,
            "Mountpoint": "/mnt/vmdk/newtest/",
            "Name": "newtest",
            "Options": {},
            "Scope": "global",
            "Status": {
                "access": "read-write",
                "attach-as": "independent_persistent",
                "capacity": {
                    "allocated": "21MB",
                    "size": "512MB"
                },
                "clone-from": "None",
                "created": "Fri Mar 16 20:16:46 2018",
                "created by VM": "cc-worker01",
                "datastore": "CC-Large-Datastore",
                "diskformat": "thin",
                "fstype": "ext4",
                "status": "detached"
            }
        }
    ]
    [root@cc-worker01 ~]# docker volume create -d vsphere clone1@CC-Large-Datastore -o clone-from=newtest@CC-Large-Datastore -o access=read-only
    Error response from daemon: create clone1@CC-Large-Datastore: VolumeDriver.Create: Server returned an error: TypeError("'NoneType' object does not support item assignment",)
    [root@cc-worker01 ~]#
    [root@cc-worker01 ~]# docker volume inspect clone1
    [
        {
            "Driver": "vsphere:latest",
            "Labels": null,
            "Mountpoint": "/mnt/vmdk/clone1/",
            "Name": "clone1",
            "Options": {},
            "Scope": "global"
        }
    ]
    [root@cc-worker01 ~]#

carcampbell commented 6 years ago

If I run the command again, it doesn't give me an error, but the inspect output doesn't match yours:

    [root@cc-worker01 ~]# docker volume create -d vsphere clone1@CC-Large-Datastore -o clone-from=newtest@CC-Large-Datastore -o access=read-only
    clone1@CC-Large-Datastore
    [root@cc-worker01 ~]# docker volume inspect clone1
    [
        {
            "Driver": "vsphere:latest",
            "Labels": null,
            "Mountpoint": "/mnt/vmdk/clone1/",
            "Name": "clone1",
            "Options": {},
            "Scope": "global"
        }
    ]
    [root@cc-worker01 ~]#

carcampbell commented 6 years ago

vmdk_ops.log

carcampbell commented 6 years ago

It looks like the clone doesn't actually work. Even though the second run gives no error, trying to mount the cloned volume returns "Error response from daemon: VolumeDriver.Mount: EOF".
So, the clone is still not working for me.
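The inspect outputs posted in this thread suggest a quick way to spot such a clone before mounting it: the working volumes carry a "Status" object, while the failed clone1 has none. The following is a minimal sketch of that check, assuming the JSON text comes from `docker volume inspect <name>`. The function name `has_volume_metadata` is hypothetical, and a missing "Status" is only the symptom observed here, not a documented contract of the plugin.

```python
import json


def has_volume_metadata(inspect_json):
    """Return True if every inspected volume reports a non-empty "Status".

    inspect_json is the JSON text printed by `docker volume inspect`.
    Hypothetical helper: the "Status" heuristic is based only on the
    outputs shown in this thread.
    """
    entries = json.loads(inspect_json)
    return bool(entries) and all(entry.get("Status") for entry in entries)


# Trimmed versions of the outputs shown in this thread.
broken_clone = '[{"Driver": "vsphere:latest", "Name": "clone1", "Options": {}, "Scope": "global"}]'
healthy_volume = '[{"Driver": "vsphere:latest", "Name": "newtest", "Status": {"access": "read-write"}}]'

print(has_volume_metadata(broken_clone))
print(has_volume_metadata(healthy_volume))
```

A clone that fails this check is likely to hit the same VolumeDriver.Mount error and is a candidate for removal and re-creation once a fixed VIB is installed.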

govint commented 6 years ago

The one difference is that I'm on ESX 6.0 U2, which seems to clone the sidecar (used as the KV store for a volume) correctly. ESX 6.5 likely has an issue with the key used to clone the sidecar. Once I confirm that, I'll change the code to create a fresh sidecar when cloning a volume.
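The traceback shows the failure mode: the metadata load returns None when the sidecar file is missing, and the clone path then assigns into it. A minimal sketch of the "fresh sidecar" idea follows. It is not the actual vmdk_ops.py code and not the change made in #2077. The dict-backed `store`, the functions `load_meta` and `finish_clone`, and the key constants are all hypothetical stand-ins (the key strings are borrowed from the inspect output in this thread).

```python
# Hypothetical sketch only: a dict stands in for the on-disk sidecar files.
CREATED_BY = "created by VM"   # assumption: mirrors the inspect output field
CLONED_FROM = "clone-from"     # assumption: mirrors the inspect output field


def load_meta(store, vol_path):
    """Stand-in for the sidecar load; returns None if no metadata exists."""
    return store.get(vol_path)


def finish_clone(store, clone_path, src_name, vm_name):
    """Record clone metadata, creating it from scratch if the load failed."""
    vol_meta = load_meta(store, clone_path)
    if vol_meta is None:
        # The cloned sidecar was not found under the expected name, so start
        # with fresh metadata instead of assigning into None (the TypeError).
        vol_meta = {}
    vol_meta[CREATED_BY] = vm_name
    vol_meta[CLONED_FROM] = src_name
    store[clone_path] = vol_meta
    return vol_meta


sidecars = {}
print(finish_clone(sidecars, "dockvols/_DEFAULT/clone.vmdk", "test1", "cc-worker01"))
```

The point of the guard is that the clone no longer depends on the platform having copied the sidecar under the expected key.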

govint commented 6 years ago

It seems that since ESX 6.5 there is an issue with how the disk clone is handled, and consequently the sidecar file is cloned with a key different from the one the user specified. This is a platform issue, and there is little the volume driver can do about it. As seen below, the clone has a sidecar, but its name differs from what the volume driver expects:

    FileNotFoundError: [Errno 2] No such file or directory: b'/vmfs/volumes/datastore1/dockvols/_DEFAULT/clone-09827946b4e8c5ab.vmfd'    <------ expected name

    03/18/18 06:12:35 1001397487 [ubuntu-VM2.3 (1)-datastore1._DEFAULT.clone] [ERROR ] Unhandled Exception:
    Traceback (most recent call last):
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 1843, in execRequestThread
        opts=opts)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 1148, in executeRequest
        vm_datastore=vm_datastore)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 208, in createVMDK
        vm_datastore=vm_datastore)
      File "/usr/lib/vmware/vmdkops/bin/vmdk_ops.py", line 394, in cloneVMDK
        vol_meta[kv.CREATED_BY] = vm_name
    TypeError: 'NoneType' object does not support item assignment
    [1]+ Stopped (signal) tail -f /var/log/vmware/vmdk_ops.log
    [root@sc-rdops-vm18-dhcp-41-34:/vmfs/volumes/5aacde8e-d80d7a5f-f76c-020022862eb6/dockvols/11111111-1111-1111-1111-111111111111] ls -l
    total 26752
    -rw------- 1 root root      4096 Mar 18 06:12 clone-1459bc590fdb855b.vmfd    <----- actual name
    -rw------- 1 root root 104857600 Mar 18 06:12 clone-flat.vmdk
    -rw------- 1 root root       557 Mar 18 06:12 clone.vmdk
    -rw------- 1 root root      4096 Mar 18 06:05 test-1ff90327fd550211.vmfd
    -rw------- 1 root root 104857600 Mar 18 06:05 test-flat.vmdk
    -rw------- 1 root root       532 Mar 18 06:05 test.vmdk
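The mismatch in the listing above can be checked mechanically. The sketch below scans a volume directory for sidecar files belonging to one volume and compares them with the name taken from the FileNotFoundError message. It assumes sidecars are named `<volume>-<16 hex digits>.vmfd`, which is inferred only from the file names in this thread. The helpers `find_sidecars` and `check_sidecar` are hypothetical and not part of the driver.

```python
import os
import re


def find_sidecars(vol_dir, vol_name):
    """List files in vol_dir named <vol_name>-<16 hex digits>.vmfd.

    The naming pattern is an assumption based on the listings in this thread.
    """
    pattern = re.compile(re.escape(vol_name) + r"-[0-9a-f]{16}\.vmfd")
    return sorted(f for f in os.listdir(vol_dir) if pattern.fullmatch(f))


def check_sidecar(vol_dir, vol_name, expected_name):
    """Compare the sidecar name the driver asked for with what is on disk."""
    found = find_sidecars(vol_dir, vol_name)
    return {
        "expected": expected_name,
        "found": found,
        "match": expected_name in found,
    }
```

For example, `check_sidecar(vol_dir, "clone", "clone-09827946b4e8c5ab.vmfd")` run against the directory listed above would report the differently keyed clone-1459bc590fdb855b.vmfd under "found" with "match" set to False.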

This worked in ESX 6.0. I'll open a product issue to fix this, but with clone broken I'm going to fix this so it's independent of the platform.

This will also be a precursor to supporting hot-clone of container volumes, allowing the user to create crash-consistent snaps of their data volumes.

carcampbell commented 6 years ago

Is there an estimate on when a fix will be available for this?

govint commented 6 years ago

Fixed via #2077

govint commented 6 years ago

@carcampbell please pull the current code and build the VIB and deploy that to your ESX hosts. Please let us know if you do see any issues with this fix.

carcampbell commented 6 years ago

Is there a vib to pull down already built? I'm not familiar enough with git and building code to build my own vib. I assume you tested it? I'm expecting a new vib posted here for all to use after you have completed testing. https://github.com/vmware/vsphere-storage-for-docker/releases
Then I expect to be able to put that into vSphere update manager and apply the patch across my cluster.

carcampbell commented 6 years ago

@govint please provide the vib and I will test with it. I tried building it but I don't have the proper environment to do so.

govint commented 6 years ago

@ashahi1 @shuklanirdesh82 can we build the VIB and upload to the site?

Given that cloning a container volume is broken on ESX 6.5, this may need a rebuild of the VIB earlier than the release schedule.

@carcampbell, to build this yourself, all you need to do is pull the code (install git) and then run "make" to get the components. The build handles creating the environment it needs (the build itself runs within a container that has the required compilers).

carcampbell commented 6 years ago

@govint I was able to build it and test it and it does work. When do you expect this build to be publicly available? This fix is required for a solution we are building for customers and we don't want to ship a version we created especially since this is also 0.21.1 (same as currently published release).

shuklanirdesh82 commented 6 years ago

https://github.com/vmware/vsphere-storage-for-docker/issues/2074#issuecomment-376635403 @ashahi1 @shuklanirdesh82 can we build the VIB and upload to the site?

@ashahi1 Please take care of this and mark this VIB as 0.21.2 and create an entry at https://github.com/vmware/vsphere-storage-for-docker/releases

Thanks!

carcampbell commented 6 years ago

@govint @shuklanirdesh82 @ashahi1 I don't see any update to the VIB which fixes this problem yet. Any idea when it will post here? https://github.com/vmware/vsphere-storage-for-docker/releases

tusharnt commented 6 years ago

https://github.com/vmware/vsphere-storage-for-docker/releases/tag/0.21.2 contains the fix.