vmware-archive / vsphere-storage-for-docker

vSphere Storage for Docker
https://vmware.github.io/vsphere-storage-for-docker
Apache License 2.0
251 stars 95 forks source link

Reduce disk-attach wait time in the Windows plugin (fixes #1780) #1991

Closed shuklanirdesh82 closed 6 years ago

shuklanirdesh82 commented 6 years ago

Reduce disk-attach wait time in the Windows plugin (fixes #1780)

Testbed scenario: Windows vm created on VMFS backed by ESX 6.0

Testing done: Yes

locally:

  1. Without change
    --- PASS: Test (337.80s)
    OK: 5 passed
    PASS
    ok      github.com/vmware/docker-volume-vsphere/tests/e2e       337.810s
  2. With this change
    --- PASS: Test (297.44s)
    PASS
    OK: 5 passed
    ok      github.com/vmware/docker-volume-vsphere/tests/e2e       297.450s

On CI: https://ci.vmware.run/vmware/docker-volume-vsphere/1776

  1. Master run:
    OK: 5 passed
    --- PASS: Test (222.51s)
    PASS
    ok      github.com/vmware/docker-volume-vsphere/tests/e2e   222.514s
  2. with change: https://ci.vmware.run/vmware/docker-volume-vsphere/1777
    OK: 5 passed
    --- PASS: Test (208.74s)
    PASS

test run logs: winPluginE2ETests.txt

/CC @shaominchen

shuklanirdesh82 commented 6 years ago

@govint Thanks for your comment!

The initial implementation was having some dead code that makes things complicated and hard to understand. I have corrected the code and tested on my local.

We do have some tests which created volume parallel and they are working fine. Please give another look and share your feedback.

shaominchen commented 6 years ago

@shuklanirdesh82 Can you please provide a test result that shows the performance improvement? For example, the total e2e testing time before and after this improvement?

Moreover, I believe there must be some reason that we have introduced a single PowerShell session (which was removed in the current PR). I can't recall the details though - maybe @pshahzeb can comment on this?

shuklanirdesh82 commented 6 years ago

@shaominchen I've updated the bug description for timing that you asked.

There is not much significant difference (~337s vs ~297s) but the good thing is that threshold value maxDiskAttachWaitSec is set correctly 120 secs to 30 secs.

Please refer https://github.com/vmware/docker-volume-vsphere/pull/1991#discussion_r151551849 for some analysis as well.

shaominchen commented 6 years ago

Based on the reference on https://github.com/gorillalabs/go-powershell/blob/master/README.md, it looks like spawning multiple PowerShell processes is a valid option to support concurrency. So this change makes sense to me.

We do need a test to measure the performance improvement - simply create a large number of volumes (say 100, or 500) in parallel and we should see the difference. It's very easy to add this test: almost same as existing TestValidName, but just creating more volumes.

shuklanirdesh82 commented 6 years ago

@shaominchen newly added test is successfully creating 10 disks consistently.

go test -v -timeout 50m -tags "runoncewin1 winutil" github.com/vmware/docker-volume-vsphere/tests/e2e
=== RUN   Test
2017/11/18 01:06:36 VM name is: windows
2017/11/18 01:06:36 START: ParallelVolumeCreateTestSuite.TestVolumeCreationParallel
2017/11/18 01:06:36 Creating volume [pvol_9_volume_990730] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_6_volume_356473] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_4_volume_113544] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_2_volume_280360] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_3_volume_416963] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_1_volume_416963] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_5_volume_388319] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_0_volume_388319] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_7_volume_1079825] on VM [10.20.104.126]
2017/11/18 01:06:36 Creating volume [pvol_8_volume_506529] on VM [10.20.104.126]
2017/11/18 01:08:45 Checking volume [pvol_6_volume_356473 pvol_3_volume_416963 pvol_2_volume_280360 pvol_1_volume_416963 pvol_4_volume_113544 pvol_9_volume_990730 pvol_5_volume_388319 pvol_7_volume_1079825 pvol_8_volume_506529 pvol_0_volume_388319] availability from VM [10.20.104.126]
2017/11/18 01:08:46 END: ParallelVolumeCreateTestSuite.TestVolumeCreationParallel
OK: 1 passed
--- PASS: Test (130.23s)
PASS
ok      github.com/vmware/docker-volume-vsphere/tests/e2e       130.238s
shaominchen commented 6 years ago

@shuklanirdesh82 What about 20, 30, 40, 50? If it fails on 20 or 30, that's fine, but it will be useful to figure out how many concurrent requests we support.