vmware / vic

vSphere Integrated Containers Engine is a container runtime for vSphere.
http://vmware.github.io/vic

vic-machine won't upload ISOs to vSAN 6.6 #5042

Closed by corrieb 7 years ago

corrieb commented 7 years ago

I just created a proper non-nested vSphere 6.5 cluster with vSAN 6.6 and I get an error installing a VCH to it. The vSAN datastore is properly configured and I verified this by installing the VIC OVA to it successfully.

vic-machine seems to have a problem uploading the ISOs. Note that it creates the VM on the vSAN datastore fine and even creates the volume store on it correctly. It also seems to have correctly created a folder on the datastore that's the intended target for the ISOs.

The error I get (with debug logging) is:

May  6 2017 00:12:25.743Z INFO  Uploading images for container
May  6 2017 00:12:25.743Z INFO      "/vic/bootstrap.iso"
May  6 2017 00:12:25.743Z INFO      "/vic/appliance.iso"
May  6 2017 00:12:26.284Z ERROR         Upload failed for "/vic/appliance.iso": Put https://bcorrie-test6.eng.vmware.com/folder/e6140d59-1680-bf49-01a8-685b35a89779/V1.1.0-9852-E974A51-appliance.iso?dcPath=%2FDatacenter&dsName=vsanDatastore: EOF
May  6 2017 00:12:26.295Z ERROR         Upload failed for "/vic/bootstrap.iso": 500 Internal Server Error
May  6 2017 00:12:26.295Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).uploadImages:102] [551.879918ms] 
May  6 2017 00:12:26.295Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CreateVCH:33] [3.65647771s] dev-cert
May  6 2017 00:12:26.295Z DEBUG [BEGIN] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:224]
May  6 2017 00:12:26.295Z INFO  Collecting 86558c30-40eb-4c3c-bd26-751644d506b0 vpxd.log
May  6 2017 00:12:26.334Z DEBUG [ END ] [github.com/vmware/vic/lib/install/management.(*Dispatcher).CollectDiagnosticLogs:224] [39.554886ms] 
May  6 2017 00:12:26.334Z ERROR Uploading images failed with Put https://bcorrie-test6.eng.vmware.com/folder/e6140d59-1680-bf49-01a8-685b35a89779/V1.1.0-9852-E974A51-appliance.iso?dcPath=%2FDatacenter&dsName=vsanDatastore: EOF. Exiting...
May  6 2017 00:12:26.334Z ERROR --------------------
May  6 2017 00:12:26.334Z ERROR vic-machine-linux create failed: Uploading images failed with Put https://bcorrie-test6.eng.vmware.com/folder/e6140d59-1680-bf49-01a8-685b35a89779/V1.1.0-9852-E974A51-appliance.iso?dcPath=%2FDatacenter&dsName=vsanDatastore: EOF. Exiting...

corrieb commented 7 years ago

Archive.zip

corrieb commented 7 years ago

I've made this a P0 high-priority. Not being able to install the product to the latest vSphere 6.5 is a significant problem.

matthewavery commented 7 years ago

For reference, these are the PUT calls in the supplied vpxd.log that correspond to the ISO uploads:

2017-05-06T00:28:17.602Z info vpxd[7F01900A0700] [Originator@6876 sub=HTTP server /folder req=00007f01703df0c0 user=VSPHERE.LOCAL\Administrator] Got HTTP PUT request for /folder/33160d59-c6a5-bf6e-54be-0c4de9b4d67c/V1.1.0-9852-E974A51-bootstrap.iso?dcPath=/Datacenter&dsName=vsanDatastore
2017-05-06T00:28:17.603Z info vpxd[7F0190EBC700] [Originator@6876 sub=vpxLro opID=req=00007f01703df0c0-9] [VpxLRO] -- BEGIN lro-18811 -- SearchIndex -- vim.SearchIndex.findByInventoryPath -- 527c44f0-300e-0e6f-99a6-9af24841c04e(5247bc39-fda4-3418-ccaa-8944cbfd66b3)
2017-05-06T00:28:17.610Z info vpxd[7F0190EBC700] [Originator@6876 sub=vpxLro opID=req=00007f01703df0c0-9] [VpxLRO] -- FINISH lro-18811
2017-05-06T00:28:17.611Z info vpxd[7F016AFDF700] [Originator@6876 sub=vpxLro opID=req=00007f01703df0c0-45] [VpxLRO] -- BEGIN lro-18814 -- ServiceInstance -- vim.ServiceInstance.retrieveContent -- 527c44f0-300e-0e6f-99a6-9af24841c04e(5247bc39-fda4-3418-ccaa-8944cbfd66b3)
2017-05-06T00:28:17.611Z info vpxd[7F016AFDF700] [Originator@6876 sub=vpxLro opID=req=00007f01703df0c0-45] [VpxLRO] -- FINISH lro-18814
2017-05-06T00:28:17.612Z info vpxd[7F016AFDF700] [Originator@6876 sub=vpxLro opID=req=00007f01703df0c0-d0] [VpxLRO] -- BEGIN session[527c44f0-300e-0e6f-99a6-9af24841c04e]52c05bc8-bf0e-6345-1c9e-86b35b178288 -- datastoreBrowser-datastore-19 -- vim.host.DatastoreBrowser.search -- 527c44f0-300e-0e6f-99a6-9af24841c04e(5247bc39-fda4-3418-ccaa-8944cbfd66b3)
2017-05-06T00:28:17.615Z info vpxd[7F016BE7C700] [Originator@6876 sub=HTTP server /folder req=00007f014c31ab60 user=VSPHERE.LOCAL\Administrator] Got HTTP PUT request for /folder/33160d59-c6a5-bf6e-54be-0c4de9b4d67c/V1.1.0-9852-E974A51-appliance.iso?dcPath=/Datacenter&dsName=vsanDatastore

Several lines further down, NfcFssrvrOpen fails when attempting to open each of the two ISO files:

2017-05-06T00:28:17.659Z info vpxd[7F016B264700] [Originator@6876 sub=vpxLro opID=4bc503b] [VpxLRO] -- BEGIN lro-18839 -- nfcService -- vim.NfcService.fileManagement -- 527c44f0-300e-0e6f-99a6-9af24841c04e(5247bc39-fda4-3418-ccaa-8944cbfd66b3)
2017-05-06T00:28:17.664Z info vpxd[7F016B264700] [Originator@6876 sub=vpxLro opID=4bc503b] [VpxLRO] -- FINISH lro-18839
2017-05-06T00:28:17.998Z warning vpxd[7F0190A33700] [Originator@6876 sub=Default] [NFC ERROR] NfcFssrvrProcessErrorMsg: received NFC error 16 from server: NfcFssrvrOpen: Failed to open '[vsanDatastore]33160d59-c6a5-bf6e-54be-0c4de9b4d67c/V1.1.0-9852-E974A51-bootstrap.iso'
--> 
2017-05-06T00:28:17.998Z error vpxd[7F0190A33700] [Originator@6876 sub=HTTP server /folder] NfcFssrvr_FileOpen returned 16, fileIOErr: 0
2017-05-06T00:28:18.018Z warning vpxd[7F01923E6700] [Originator@6876 sub=Default] [NFC ERROR] NfcFssrvrProcessErrorMsg: received NFC error 16 from server: NfcFssrvrOpen: Failed to open '[vsanDatastore]33160d59-c6a5-bf6e-54be-0c4de9b4d67c/V1.1.0-9852-E974A51-appliance.iso'
--> 
2017-05-06T00:28:18.018Z error vpxd[7F01923E6700] [Originator@6876 sub=HTTP server /folder] NfcFssrvr_FileOpen returned 16, fileIOErr: 0
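To pull these failures out of a larger vpxd.log, a filter along the following lines may help. This is only a sketch: the function name `nfc_open_failures` is made up for illustration, and it assumes the "NfcFssrvrOpen: Failed to open '...'" message format seen in the excerpt above.

```shell
# Print the unique datastore paths that NFC failed to open, one per line.
# Usage: nfc_open_failures /path/to/vpxd.log
nfc_open_failures() {
    # Keep only lines containing the NfcFssrvrOpen failure message and
    # reduce each one to the quoted datastore path.
    sed -n "s/.*NfcFssrvrOpen: Failed to open '\(.*\)'.*/\1/p" "$1" | sort -u
}
```

Run against the log above, this should list the bootstrap.iso and appliance.iso paths under the [vsanDatastore] folder.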
dougm commented 7 years ago

Does upload work outside of vic?

export GOVC_DATASTORE=vsanDatastore
govc datastore.mkdir foo
date | govc datastore.upload - foo/date.txt
govc datastore.download foo/date.txt -
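The same round trip can be wrapped in a function that also compares what comes back with what was sent. This is a sketch, not a tested tool: the function name `datastore_roundtrip` and the file name probe.txt are made up for illustration, and it assumes GOVC_URL, the credentials, and GOVC_DATASTORE are already exported.

```shell
# Upload a small probe file with govc, download it again, and compare.
# Returns non-zero if the upload or download fails or the contents differ.
# Usage: datastore_roundtrip [remote-dir]
datastore_roundtrip() {
    dir=${1:-foo}
    want="probe $(date)"   # content to send through the datastore

    # The directory may already exist from an earlier run, so ignore errors here.
    govc datastore.mkdir "$dir" 2>/dev/null || true

    printf '%s\n' "$want" | govc datastore.upload - "$dir/probe.txt" || return 1
    got=$(govc datastore.download "$dir/probe.txt" -) || return 1

    [ "$got" = "$want" ]
}
```

For example, `datastore_roundtrip foo && echo "upload OK"` prints the message only when the file survives the round trip.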

Regarding logs, hostd.log should have the most detail. Turn up the volume:

govc find / -type h | xargs -n1 -I% govc host.option.set -host % Config.HostAgent.log.level verbose
govc find / -type h | xargs -n1 -I% govc logs -host % -log hostd

dougm commented 7 years ago

Do you have the same issue if #4920 is reverted, so the VCH name is used for the path instead of UUID?

matthewavery commented 7 years ago

Alright, when I run this, the upload step fails with a 500 Internal Server Error. I have not looked through the logs yet; I will increase the log level first. @dougm, I am also not sure about the #4920 question: since the upload is failing outright, it is not yet clear whether that change is a factor.

dougm commented 7 years ago

What build are you using, @matthewavery? datastore.upload works for me with build 5318154.

corrieb commented 7 years ago

Datastore upload and download both work for me with govc on the system where I found the bug.

matthewavery commented 7 years ago

@dougm I am using govc version 0.14. I repeatedly get a 500 error.

matthewavery commented 7 years ago

creds

GOVC_PASSWORD=<redacted see mavery>
GOVC_INSECURE=1
GOVC_URL=bcorrie-test6.eng.vmware.com
GOVC_USERNAME=Administrator@vsphere.local
GOVC_DATASTORE=vsanDatastore

fdawg4l commented 7 years ago

I just tried MASTER on the builds mentioned on a fresh nimbus and vic-create worked without an issue on the vsanDatastore. I'm going to try 1.1.1 now.

dougm commented 7 years ago

Upload using the uuid instead of display name works fine too:

date | govc datastore.upload - $(govc datastore.vsan.dom.ls -l | grep foo | awk '{print $1}')/date.txt

matthewavery commented 7 years ago

Upload fails for me:

Upload failed for "bin/bootstrap.iso": Put https://bcorrie-test6.eng.vmware.com/folder/223a1259-879e-2fc1-0213-0c4de9b4d67c/V1.1.0-RC3-0-F9C881A-bootstrap.iso?dcPath=%2FDatacenter&dsName=vsanDatastore: write tcp 10.17.109.98:48266->10.118.69.236:443: write: broken pipe

I did, however, specify the --force option, and oddly enough the VCH appears to be present and powered on.

matthewavery commented 7 years ago

Additionally:

matthewavery@ubuntu ~/go/src/github.com/vmware/vic
 % date | govc datastore.upload - $(govc datastore.vsan.dom.ls -l | grep foo | awk '{print $1}')/date.txt
govc: 500 Internal Server Error
matthewavery@ubuntu ~/go/src/github.com/vmware/vic
 % govc version
govc 0.14.0

fdawg4l commented 7 years ago

I just tried 1.1.1 on a fresh nimbus setup on vsan, and did not have an issue with the build numbers given.

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: v1.1.1-rc1-0-56a309f
Storage Driver: vSphere Integrated Containers v1.1.1-rc1-0-56a309f Backend Engine
VolumeStores: default
vSphere Integrated Containers v1.1.1-rc1-0-56a309f Backend Engine: RUNNING
 VCH CPU limit: 30564 MHz
 VCH memory limit: 18.28 GiB
 VCH CPU usage: 0 MHz
 VCH memory usage: 75 MiB
 VMware Product: VMware vCenter Server
 VMware OS: linux-x64
 VMware OS version: 6.5.0
Execution Driver: <not supported>
Plugins: 
 Volume: vsphere
 Network: bridge
Operating System: linux-x64
OSType: linux-x64
Architecture: x86_64
CPUs: 30564
Total Memory: 18.28 GiB
ID: vSphere Integrated Containers
Docker Root Dir: 
Debug mode (client): false
Debug mode (server): false
Registry: registry-1.docker.io

corrieb commented 7 years ago

Closing. For some reason, the vSAN cluster wasn't happy, even though it was reporting no errors. It seemed to intermittently report a capacity of 0 bytes.

I turned off vSAN on the cluster, turned it back on... and I can now install a VCH to it.
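A quick pre-install check for the zero-capacity symptom could look like the sketch below. The function name `datastore_has_capacity` is made up for illustration, and the parsing assumes that `govc datastore.info` prints a "Capacity:" line followed by a number, which may differ between govc versions. Because the problem was intermittent, a single passing check does not rule it out.

```shell
# Succeed only if the datastore currently reports a non-zero capacity.
# Usage: datastore_has_capacity vsanDatastore
datastore_has_capacity() {
    # Assumed text output includes a line such as "  Capacity:  1023.8 GB".
    cap=$(govc datastore.info "$1" | awk '$1 == "Capacity:" { print $2; exit }')

    # A missing line (for example, govc failed) is treated the same as zero.
    awk -v c="${cap:-0}" 'BEGIN { exit !(c + 0 > 0) }'
}
```

For example, `datastore_has_capacity vsanDatastore || echo "vsanDatastore reports no capacity"` flags the condition before vic-machine is run.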
