When creating a VM that is placed in an NSX server pool using the NSX Policy API (e.g. a TAS Router VM), heavily-loaded vSphere environments may exceed the timeout for discovering the VM's IP address, returning a "Did not find primary IP" error and aborting the deploy.
This commit increases the timeout from 100 to 300 seconds. The longest wait we saw in the wild was 118 seconds, so we doubled that and added padding.
Note: We don't need to worry about an overarching BOSH Director timeout. According to Joseph Palermo, the BOSH Director has infinite patience during create_vm and relies on the CPI to manage timeouts.
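For illustration only, here is a minimal sketch of the kind of deadline-bounded polling this timeout governs. It is not the CPI's actual code: `wait_for_primary_ip`, `PrimaryIpNotFound`, and the `lookup`, `clock`, and `sleeper` parameters are all hypothetical names, and the 1-second poll interval is an assumption.

```ruby
# Hypothetical sketch; names are illustrative and not taken from the CPI.
class PrimaryIpNotFound < StandardError; end

# Calls `lookup` (a callable returning an IP string or nil) until it yields
# a value, or raises once `timeout` seconds have passed on the given clock.
# `clock` and `sleeper` are injectable so the loop can be exercised quickly.
def wait_for_primary_ip(lookup, timeout: 300, interval: 1,
                        clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                        sleeper: ->(seconds) { sleep(seconds) })
  deadline = clock.call + timeout
  loop do
    ip = lookup.call
    return ip unless ip.nil?

    if clock.call >= deadline
      raise PrimaryIpNotFound, "Did not find primary IP within #{timeout}s"
    end

    sleeper.call(interval)
  end
end
```

Under these assumptions, a VM whose IP appears after 118 seconds would fail with a 100-second timeout and succeed with a 300-second one. Because the Director does not impose its own limit during create_vm, the deadline in a loop like this is the only bound on the wait.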
Fixes the following error, seen during bosh deploy:
Task 11148 | 07:49:11 | Creating missing vms: router/xxx (9) (00:02:41)
L Error: Unknown CPI error 'Unknown' with message 'Did not find primary IP for VM (VSphereCloud::Resources::VM (cid="vm-xxx"))' in 'create_vm' CPI method (CPI request ID: 'cpi-897383')
Special thanks to Suman Chakraborty for reporting the bug and diagnosing the cause.