The CPI would mistakenly fail a deployment when deleting a VM if NSX Policy is configured but the VM is attached to a regular, non-NSX VDS (Virtual Distributed Switch) port group.
The failure occurs when attempting to remove the VM from any NSX Groups (load balancer endpoints). The CPI attempts to discover the VM's NSX ID (get_vm_external_id()); however, since the VM isn't attached to an NSX segment, it doesn't have an NSX ID, and the CPI raises an error.
This commit fixes the failure by not raising an error; instead, it prints a helpful log message: "... assuming VM is not attached to an NSX segment".
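The change in behavior can be sketched roughly as follows. This is an illustrative sketch, not the CPI's actual code: `VMNotFoundInNSX`, `remove_vm_from_groups`, the in-memory `nsx_ids` and `groups` hashes, and the exact wording of the log line before "assuming VM is not attached to an NSX segment" are all assumptions made for the example. Only the name `get_vm_external_id` comes from the CPI, and its signature here is invented.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the error raised when NSX has no record of the VM.
class VMNotFoundInNSX < StandardError; end

# Hypothetical stand-in for the NSX lookup: raises if the VM has no NSX ID.
def get_vm_external_id(vm_cid, nsx_ids)
  nsx_ids.fetch(vm_cid) do
    raise VMNotFoundInNSX, "Failed to find vm in realized state with cid: #{vm_cid}"
  end
end

# Removes the VM from every group that contains it and returns those group names.
# If the VM has no NSX ID, logs a message and returns an empty list instead of failing.
def remove_vm_from_groups(vm_cid, nsx_ids, groups, logger)
  external_id = get_vm_external_id(vm_cid, nsx_ids)
  groups.each_with_object([]) do |(name, members), removed_from|
    removed_from << name if members.delete(external_id)
  end
rescue VMNotFoundInNSX => e
  logger.info("#{e.message}; assuming VM is not attached to an NSX segment")
  []
end

# Usage: a VM on a plain VDS port group no longer aborts the deletion.
log = StringIO.new
groups = { 'lb-pool' => ['nsx-1', 'nsx-2'] }
remove_vm_from_groups('vm-on-vds', { 'vm-on-nsx' => 'nsx-1' }, groups, Logger.new(log))
```

The point of the sketch is that a missing NSX ID is treated as "nothing to clean up" rather than as a fatal condition, so delete_vm can proceed.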
Fixes, during bosh deploy:
Unknown CPI error 'Unknown' with message 'Failed to find vm in realized state with cid: vm-xxx' in 'delete_vm' CPI method (CPI request ID: 'cpi-341326')
Side notes:
This failure only happens during VM deletion, not during VM creation:
When adding a VM to an NSX Group, the CPI knows which Group to add it to. Also, it's reasonable to assume that if you're adding a VM to an NSX Group, that VM is connected to an NSX segment (otherwise the deployment manifest is wrong).
When deleting a VM, the CPI does not know which NSX Groups, if any, the VM belongs to, so it must do an NSX-wide search through all the groups, which forces it to look up the VM's NSX ID.
get_vm_external_id() retries for ~10 seconds to find the VM's NSX ID before giving up. This was to account for possible lag between VM creation and NSX recognizing the VM & creating an ID.
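The retry behavior described in the last note might look something like the sketch below. It is a generic poll-until-deadline loop under stated assumptions, not the CPI's implementation: the method name `find_external_id_with_retry`, the `timeout` and `interval` parameters, the block-based lookup, and the `VMNotFoundInNSX` error class are all hypothetical.

```ruby
# Hypothetical stand-in for the error raised when the lookup gives up.
class VMNotFoundInNSX < StandardError; end

# Calls the given block until it returns a non-nil ID, sleeping between attempts.
# Raises once another attempt would exceed the time budget (assumed ~10 seconds).
def find_external_id_with_retry(vm_cid, timeout: 10, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    external_id = yield(vm_cid)
    return external_id if external_id

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) + interval > deadline
      raise VMNotFoundInNSX, "Failed to find vm in realized state with cid: #{vm_cid}"
    end
    sleep(interval)
  end
end

# Usage: the lookup succeeds on the third attempt, as it might shortly after VM creation.
attempts = 0
find_external_id_with_retry('vm-xxx', timeout: 1, interval: 0.01) do
  attempts += 1
  attempts >= 3 ? 'nsx-external-id' : nil
end
```

A loop like this helps right after VM creation, when NSX may simply be lagging. For a VM on a non-NSX port group the ID never appears, so the lookup always ends in the error that this commit now handles.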
Release Notes Blurb:
No longer encounters a fatal error while executing delete_vm when the NSX Policy API is configured and the VM is attached to a non-NSX VDS network.