vmware-archive / vsphere-storage-for-docker

vSphere Storage for Docker
https://vmware.github.io/vsphere-storage-for-docker
Apache License 2.0

[vFile] Volume creation failed with vFile plugin #1983

Closed. ashahi1 closed this issue 6 years ago

ashahi1 commented 6 years ago

Volume creation failed while testing the vFile plugin with VMs on ESX 6.0. I saw this failure on CI.

I verified that Docker swarm is working correctly by following the instructions here: http://vmware.github.io/docker-volume-vsphere/documentation/vfile-plugin.html#i-got-operation-now-in-progress-error-when-mounting-a-vfile-volume-to-a-container


Test name: VFileDemotePromoteTestSuite.TestSwarmRoleChange
VM: 192.168.31.163 (Linux VM)
CI: https://ci.vmware.run/vmware/docker-volume-vsphere/1752

vFile log:

2017-11-12 15:33:10.579076914 -0800 PST [INFO] Node is demoted from manager to worker, prepare to leave ETCD cluster
2017-11-12 15:33:10.580333016 -0800 PST [INFO] Remove self from ETCD member due to demotion. nodeID=rqswdo912oaw0fhjfowqyhe7c peerAddr="http://192.168.31.163:2380"
2017-11-12 15:33:10.598485513 -0800 PST [INFO] Successfully removed self from ETCD peerAddr="http://192.168.31.163:2380" member.ID=17899894872837527706
2017-11-12 15:33:10.623562887 -0800 PST [INFO] Stopped ETCD service due to demotion
2017-11-12 15:33:20.200600493 -0800 PST [INFO] VolumeDriver Get: vfilevolume251513
2017-11-12 15:33:22.211180032 -0800 PST [WARNING] Transactional metadata read failed: context deadline exceeded.Swarm cluster maybe unhealthy
2017-11-12 15:33:22.21125184 -0800 PST [WARNING] Failed to read metadata for volume vfilevolume251513 from KV store. Transactional metadata read failed: context deadline exceeded.Swarm cluster maybe unhealthy
2017-11-12 15:33:22.211275091 -0800 PST [ERROR] Failed to get volume meta-data name=vfilevolume251513 error="Failed to read metadata for volume vfilevolume251513 from KV store. Transactional metadata read failed: context deadline exceeded.Swarm cluster maybe unhealthy"
2017-11-12 15:33:22.212036537 -0800 PST [INFO] VolumeDriver Create: vfilevolume251513
2017-11-12 15:33:22.212059497 -0800 PST [INFO] Attempting to write initial metadata entry for vfilevolume251513
2017-11-12 15:33:24.225412931 -0800 PST [WARNING] Failed to write metadata: etcdserver: request timed out.
2017-11-12 15:33:24.225481278 -0800 PST [WARNING] Failed to create volume vfilevolume251513. Reason: Failed to write metadata: etcdserver: request timed out.

============================

Test log:

2017/11/12 23:33:19 Creating vFile volume [vfilevolume251513] on VM [192.168.31.163]

----------------------------------------------------------------------
FAIL: vfile_demote_promote_test.go:90: VFileDemotePromoteTestSuite.TestSwarmRoleChange

vfile_demote_promote_test.go:115:
   c.Assert(err, IsNil, Commentf(out))
... value *exec.ExitError = &exec.ExitError{ProcessState:(*os.ProcessState)(0xc420198420), Stderr:[]uint8(nil)} ("exit status 1")
... Error response from daemon: create vfilevolume251513: VolumeDriver.Create: Failed to create volume vfilevolume251513. Reason: Failed to write metadata: etcdserver: request timed out.

tusharnt commented 6 years ago

@ashahi1 has already fixed the test to increase the timeout.

#1990 aims to document how customers can troubleshoot issues with the swarm cluster.

ashahi1 commented 6 years ago

The test was fixed in pull request #1949.