nimbusproject / nimbus

Nimbus - Open Source Cloud Computing Software - 100% Apache2 licensed
http://www.nimbusproject.org/

Workspace termination can fail due to a timing bug #119

Closed: priteau closed this issue 11 years ago

priteau commented 11 years ago

When terminating a VM, nimbus-control queries the VM status in a loop until the VM is gone; see https://github.com/nimbusproject/nimbus/blob/nimbus-release-2.10.1/control/src/python/workspacecontrol/main/wc_core.py#L327

However, inside lvrt_common.py:info, the VM can disappear between the call to _get_vm_by_handle (https://github.com/nimbusproject/nimbus/blob/nimbus-release-2.10.1/control/src/python/workspacecontrol/defaults/lvrt/lvrt_common.py#L239) and the call to XMLDesc (https://github.com/nimbusproject/nimbus/blob/nimbus-release-2.10.1/control/src/python/workspacecontrol/defaults/lvrt/lvrt_common.py#L250). When that happens, XMLDesc raises a libvirtError and the termination fails:

2013-06-06 17:12:15,432 - wc_core:329 - DEBUG - checking on VM 'wrksp-2184'
2013-06-06 17:12:15,440 - lvrt_common:243 - DEBUG - found VM with name 'wrksp-2184'
2013-06-06 17:12:15,940 - wc_core:329 - DEBUG - checking on VM 'wrksp-2184'
2013-06-06 17:12:15,948 - lvrt_common:243 - DEBUG - found VM with name 'wrksp-2184'
2013-06-06 17:12:18,968 - wc_core:213 - ERROR - Issue with shutdown/destroy: libvirtError: failed Xen syscall xenDaemonDomainFetch failed to find this domain
2013-06-06 17:12:18,969 - wc_core:81 - ERROR - failed Xen syscall xenDaemonDomainFetch failed to find this domain
Traceback (most recent call last):
  File "/gpfs/software/x86_64/el5/hotel/nimbus-control/2.10.1/src/python/workspacecontrol/main/wc_core.py", line 79, in core
    _core(vm_name, action, p, c)
  File "/gpfs/software/x86_64/el5/hotel/nimbus-control/2.10.1/src/python/workspacecontrol/main/wc_core.py", line 204, in _core
    graceful_shutdown(p, c, platform, vm_name, running_vm)
  File "/gpfs/software/x86_64/el5/hotel/nimbus-control/2.10.1/src/python/workspacecontrol/main/wc_core.py", line 330, in graceful_shutdown
    running_vm = platform.info(vm_name)
  File "/soft/nimbus-control/default/src/python/workspacecontrol/defaults/lvrt/lvrt_common.py", line 250, in info
    rvm.xmldesc = vm.XMLDesc(0)
  File "/usr/lib64/python2.4/site-packages/libvirt.py", line 247, in XMLDesc
    if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
libvirtError: failed Xen syscall xenDaemonDomainFetch failed to find this domain
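
One possible way to tolerate this race, sketched below, is to catch the error raised by XMLDesc and look the VM up again: if the second lookup finds nothing, the VM is treated as gone, which is what the shutdown loop is waiting for. This is only an illustration, not the fix applied in Nimbus. `info_or_none`, `LibvirtError`, `VanishingDomain` and `LiveDomain` are hypothetical names; `LibvirtError` stands in for `libvirt.libvirtError` so the sketch runs without libvirt installed, and `lookup` plays the role of _get_vm_by_handle. The real `info` builds a richer result object than the XML string returned here.

```python
class LibvirtError(Exception):
    """Stand-in for libvirt.libvirtError (assumption: real code would
    catch libvirt.libvirtError instead)."""


class VanishingDomain:
    """Fake domain whose XMLDesc fails, as if the VM vanished after lookup."""

    def XMLDesc(self, flags):
        raise LibvirtError("failed to find this domain")


class LiveDomain:
    """Fake domain that is still present."""

    def XMLDesc(self, flags):
        return "<domain/>"


def info_or_none(lookup, vm_name):
    """Return the VM's XML description, or None if the VM is gone.

    `lookup` is a hypothetical callable playing the role of
    _get_vm_by_handle: it returns a domain object, or None when no
    VM with that name exists.
    """
    vm = lookup(vm_name)
    if vm is None:
        return None
    try:
        return vm.XMLDesc(0)
    except LibvirtError:
        # The domain may have disappeared between lookup and XMLDesc.
        # Look it up again: if it is really gone, report "no VM";
        # otherwise the error has another cause, so re-raise it.
        if lookup(vm_name) is None:
            return None
        raise
```

The second lookup keeps unrelated libvirt failures visible: only an error on a VM that can no longer be found is swallowed. A caller such as the polling loop in graceful_shutdown could then treat a None result as "VM is gone" instead of aborting with a traceback.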