Open Prospecta opened 7 years ago
@Prospecta, we're afraid we haven't seen that error before. You might try running
BOSH_INIT_LOG_LEVEL=debug BOSH_INIT_LOG_PATH=/tmp/bosh-debug.log bosh-init deploy your-manifest.yml
and attaching the debug logs to this issue. Please scrub any passwords from these logs first, as they may contain sensitive info.
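For scrubbing passwords before attaching the log, a minimal Python sketch follows. The patterns are assumptions about how secrets might show up in a debug log (key/value pairs named `password`, and credentials embedded in URLs); the function name `scrub` is hypothetical and not part of bosh-init. Review the output by hand before posting, since no pattern list catches everything.

```python
import re

# Assumed shapes of secrets in the log; extend these for your own manifest.
_PATTERNS = [
    # Key/value pairs such as  "password": "secret"  or  password: secret
    (re.compile(r'("?password"?\s*[:=]\s*)("[^"]*"|\S+)', re.IGNORECASE),
     r'\1"<redacted>"'),
    # Credentials embedded in URLs such as  https://user:secret@host:port
    (re.compile(r'(://[^:/@\s]+:)[^@/\s]+(@)'),
     r'\1<redacted>\2'),
]


def scrub(text: str) -> str:
    """Return text with every match of the assumed secret patterns redacted."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

To use it, read /tmp/bosh-debug.log, pass the contents through `scrub`, and write the result to a new file that you attach instead of the original.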
Thanks Cunnie,
Please check the attached logs bosh.log.zip
Thanks @Prospecta,
Some things to try:
To speed up your debugging, you can deploy a VM with no jobs rather than an entire CF. Here's part of a manifest to accomplish this (we haven't tried this ourselves, but it should work):
---
name: empty
releases:
- name: bosh-vcloud-cpi
  url: https://bosh.io/d/github.com/cloudfoundry-incubator/bosh-vcloud-cpi-release?v=24
  sha1: 6b223f73f3818363b6af15a7326d3894ea0c56c6
resource_pools:
- name: vms
  network: private
  stemcell:
    url: https://bosh.io/d/stemcells/bosh-vcloud-esxi-ubuntu-trusty-go_agent?v=3262.12
    sha1: 333187bc7f7e35cd714c0aa8c4f699393cdcb0c2
  cloud_properties:
    cpu: 2
    ram: 4_096
    disk: 20_000
disk_pools:
- name: disks
  disk_size: 20_000
networks:
- name: private
  type: manual
  subnets:
  - range: 10.85.57.0/24
    gateway: 10.85.57.1
    dns: [8.8.8.8]
    cloud_properties: {name: VM Network} # <--- Replace with Network name
instance_groups:
- name: empty_vm
  instances: 1
  jobs: []
  resource_pool: vms
  persistent_disk_pool: disks
  networks:
  - {name: private, static_ips: [10.0.0.6]}
cloud_provider:
  template: {name: vcloud_cpi, release: bosh-vcloud-cpi}
  mbus: "https://mbus:mbus-password@10.0.0.6:6868"
  properties:
    vcd:
      url: VCLOUD-URL
      user: VCLOUD-USER
      password: VCLOUD-PASSWORD
      entities:
        organization: VDC-ORGANIZATION
        virtual_datacenter: VDC-NAME
        vapp_catalog: bosh-catalog
        media_catalog: bosh-catalog
        media_storage_profile: '*'
        vm_metadata_key: bosh-meta
      control: {wait_max: 900}
    agent: {mbus: "https://mbus:mbus-password@0.0.0.0:6868"}
    blobstore: {provider: local, path: /var/vcap/micro_bosh/data/cache}
    ntp: [0.pool.ntp.org, 1.pool.ntp.org]
...
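One thing to watch when adapting the manifest: the static IP (10.0.0.6, also used in the mbus URL) is not inside the subnet range 10.85.57.0/24, so the addresses look like placeholders to replace together. BOSH normally expects static IPs on a manual network to fall within the subnet range. A small Python sketch for checking this before deploying follows; the helper name `ips_outside_subnet` is hypothetical, and it uses only the standard `ipaddress` module.

```python
import ipaddress


def ips_outside_subnet(subnet_range, ips):
    """Return the addresses in `ips` that do not belong to `subnet_range`."""
    network = ipaddress.ip_network(subnet_range)
    return [ip for ip in ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    # Values copied from the sample manifest in this thread.
    bad = ips_outside_subnet("10.85.57.0/24", ["10.0.0.6"])
    if bad:
        print("static IPs outside the subnet range:", ", ".join(bad))
```

An empty result means every static IP is inside the range.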
Some notes:
Unable to perform this action. Contact your cloud administrator." minorErrorCode="INTERNAL_SERVER_ERROR"/>
Typically when we see INTERNAL_SERVER_ERROR, it indicates that the CPI is working properly but the vCloud is unable to fulfill the request (something is broken).
D, [2016-09-14T00:16:02.154698 #16903] DEBUG -- : STEP VCloudCloud::Steps::ReconfigureVM
The ReconfigureVM step sets the CPU and RAM values for the VM, creates a hard disk, and attaches the NICs. Given that your server-side stack trace mentions datastores, the hard disk call seems the most likely culprit.
Lyle Franklin & Brian Cunnie
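To see which CPI step ran last before a failure, the STEP lines can be pulled out of the debug log. The Python sketch below assumes that other step lines follow the same shape as the ReconfigureVM line quoted above (`STEP VCloudCloud::Steps::<Name>`); the function name `steps_in_log` is hypothetical.

```python
import re

# Matches log lines like:
#   D, [2016-09-14T00:16:02.154698 #16903] DEBUG -- : STEP VCloudCloud::Steps::ReconfigureVM
STEP_RE = re.compile(r"STEP\s+VCloudCloud::Steps::(\w+)")


def steps_in_log(lines):
    """Return the CPI step names in the order they appear in the log lines."""
    names = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            names.append(match.group(1))
    return names
```

Calling `steps_in_log(open("/tmp/bosh-debug.log"))` and looking at the last element gives the step that was in progress when the log ends or the error appears.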
Hi,
Thanks for the detailed response. To answer your questions: we are able to manually create VMs and attach disks successfully via the vCloud Director UI. However, when we run a fresh deploy via bosh-init with a new vApp name and no existing VMs, we get the same error (even when using the YAML that you posted).
I've had a discussion with our infrastructure team, and they mentioned that vCloud Director was recently updated to v8.0.1. Has the CPI been tested against this yet?
Either way we will also be raising this with VMware support to see what they come back with.
Thanks
@Prospecta
> updated to v8.0.1. Has the CPI been tested against this yet?
Honestly, the vCloud CPI has been mostly static for quite a while. Our testing environments are vCloud Air 5.5 and 5.6. We'd be happy to accept PRs to make the CPI compatible with newer environments, but I'm afraid the CPI team doesn't have the bandwidth to get new environments and fix this ourselves.
Hi,
So I've been engaging with VMware support this week and they suggested we switch off SDRS for the datastore cluster that Bosh is being deployed to. Apparently there is an SDRS placement issue which affects version 8.0.1 of vCloud.
I have attempted the deployment of Bosh again (with SDRS turned off) and it has run successfully.
I have just asked VMware what the impact of turning off this feature is when deploying additional VMs to the datastore cluster. I'll let you know what they come back with.
Cheers
The question of VCD versioning will need to be addressed soon. There are some more significant API changes coming up in VCD 8.2. We would be happy to help create some PRs for this nearer the time. @ljfranklin are you able to share some info on your test routines? We could probably allocate some compute resource for testing. Is there a Travis job or similar?
cc'ing @zaksoup & @cppforlife as I'm not sure what our current vCloud support plan looks like.
Hi,
I am encountering an error when attempting to run the bosh-init deploy script on vCloud (not vCloud Air).
The job fails when attempting to create the vm:
Command 'deploy' failed: Deploying: Creating instance 'bosh/0': Creating VM: Creating vm with stemcell cid 'urn:vcloud:catalogitem:decd3c65-5dde-47ae-836e-100199cdae0d': CPI 'create_vm' method responded with error: CmdError{"type":"Unknown","message":"Task urn:vcloud:task:14f18678-7d9a-4046-a1aa-d708c8058be0 Updated Virtual Machine 3b6b2f67-572c-44c4-74d4-a80b28887836(be1c153c-6e01-46b3-81c5-018ec2c26c10) completed unsuccessfully, Details: [ 01b88b27-f995-41dd-b39e-c2d498b808fb-3776 ] Unable to perform this action. Contact your cloud admi...","ok_to_retry":false}
and vCloud returns the error:
com.vmware.ssdc.library.exceptions.DatastoreNotAvailableException: null
  at com.vmware.vcloud.fabric.storage.placement.sdrs.impl.SdrsPlacementManagerImpl.processSdrsResults(SdrsPlacementManagerImpl.java:1040)
  at com.vmware.vcloud.fabric.storage.placement.sdrs.impl.SdrsPlacementManagerImpl.selectDatastoreInStoragePod(SdrsPlacementManagerImpl.java:213)
  at com.vmware.vcloud.fabric.storage.placement.impl.VirtualMachineDiskLevelStorageSelectorImpl.selectDatastore(VirtualMachineDiskLevelStorageSelectorImpl.java:369)
  at com.vmware.vcloud.fabric.storage.storedVm.impl.RelocateStoredVmByStorageClassActivity$CalculateTargetDatastorePhase.invoke(RelocateStoredVmByStorageClassActivity.java:138)
  at com.vmware.vcloud.activity.executors.ActivityRunner.runPhase(ActivityRunner.java:156)
  at com.vmware.vcloud.activity.executors.ActivityRunner.run(ActivityRunner.java:118)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
The vCloud error reports that the datastore is not available, when in fact there is a valid datastore available for the deployment, and creating a VM on it manually via vCloud Director works without issues. The stemcell is also uploaded to the datastore with no issues.
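When the server-side trace arrives as one long line, splitting it into the exception header and its frames makes it easier to see which component raised the error. The Python sketch below assumes the usual Java layout of a header followed by frames introduced by the word "at"; it would mis-split a header whose message itself contains " at ". The name `split_java_trace` is hypothetical.

```python
import re


def split_java_trace(trace):
    """Split a Java stack trace into (header, frames).

    Assumes frames are separated by whitespace followed by 'at' and more
    whitespace, as in the trace quoted in this thread.
    """
    parts = re.split(r"\s+at\s+", trace.strip())
    return parts[0], parts[1:]
```

Applied to the trace above, the header is the DatastoreNotAvailableException line and the first frames are in SdrsPlacementManagerImpl.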
We had previously deployed Cloud Foundry v182 including Bosh with no trouble on vCloud but now struggle with the latest releases (bosh v257.9 & vCloud CPI v24). We have attempted to use both the latest vCloud and vSphere stemcells (v3262.12) with no luck.
For reference, below is the vcd section of the bosh.yml:
vcd: &vcd # <--- Replace values below
  url: https://***
  user: **
  password: ****
  entities:
    organization: Cloud-Foundry
    virtual_datacenter: Cloud-Foundry-VDC
    vapp_catalog: cf-catalog-bosh
    media_catalog: cf-catalog-bosh
    media_storage_profile: '*'
    vm_metadata_key: bosh-meta
  control: {wait_max: 900}
Any ideas what could be causing the issue?
Thanks