0-complexity / openvcloud

OpenvCloud

Apply locks during virtual firewall operations #1784

Closed alichaddad closed 6 years ago

alichaddad commented 6 years ago

Detailed description

Two different processes can try to start the same virtual firewall at the same time, which may cause both of them to fail. This was observed during this test case: https://github.com/0-complexity/openvcloud/blob/master/tests/ovc_master_hosted/OVC/c_advanced/maintenance_tests.py#L293. The test puts a node in maintenance mode and then tries to start a virtual firewall that was on that node. Both of these actions can end up trying to start the virtual firewall, which leads to the failure described here.

Locks should be introduced to prevent this behavior when starting or stopping a virtual firewall.
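A minimal sketch of what such a lock could look like, assuming a per-network lock file guarded with `fcntl.flock` so that it works across processes on the same node. The names `vfw_lock`, `start_vfw`, `LOCK_DIR`, `is_running` and `do_start` are hypothetical and not taken from the OpenvCloud code base; the real fix may use a different mechanism entirely.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock directory, chosen for illustration only.
LOCK_DIR = os.path.join(tempfile.gettempdir(), "vfw-locks")


@contextmanager
def vfw_lock(networkid, lock_dir=LOCK_DIR):
    """Hold an exclusive lock for one network id while the block runs."""
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, "vfw_%s.lock" % networkid)
    with open(path, "w") as handle:
        # Blocks until any other process holding this lock releases it.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def start_vfw(networkid, is_running, do_start):
    """Start the firewall unless it is already running.

    is_running and do_start are hypothetical callables standing in for
    the real state check and the real start operation.
    Returns True if this call started the firewall, False otherwise.
    """
    with vfw_lock(networkid):
        # Re-check the state under the lock: another process may have
        # started the firewall while this one was waiting.
        if is_running(networkid):
            return False
        do_start(networkid)
        return True
```

The state check happens inside the lock, so a second caller that was blocked sees the firewall as already running and does nothing instead of racing on the same image. This only serializes callers on a single node; operations issued from different nodes would need a shared lock instead.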

Relevant stacktraces

This is an example of what might happen:


Traceback (most recent call last):
  File "/opt/jumpscale7/lib/JumpScale/grid/jumpscripts/JumpscriptFactory.py", line 176, in executeInProcess
    return True, self.module.action(*args, **kwargs)
  File "/tmp/jumpscripts/jumpscale_vfs_create_routeros.py", line 138, in action
    % (networkid, networkidHex, e)
RuntimeError: Could not create VFW vm from template, network id:424:01a8
Could not execute job, error:
Traceback (most recent call last):
  File "/opt/jumpscale7/lib/JumpScale/grid/jumpscripts/JumpscriptFactory.py", line 176, in executeInProcess
    return True, self.module.action(*args, **kwargs)
  File "/tmp/jumpscripts/unknown_createVM.py", line 18, in action
    return createVM(xml)
  File "/tmp/jumpscripts/unknown_createVM.py", line 13, in createVM
    dom.create()
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1035, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: Cannot access storage file '/var/lib/libvirt/images/routeros/01a8/routeros.qcow2' (as uid:113, gid:116): No such file or directory

type/level: OPERATIONS/1
Could not execute jscript:1000003 unknown_createVM on agent:107_9
Error: Exec error procmgr jumpscr:unknown_createVM on node:107_9 <class 'libvirt.libvirtError'>: Cannot access storage file '/var/lib/libvirt/images/routeros/01a8/routeros.qcow2' (as uid:113, gid:116): No such file or directory

FastGeert commented 6 years ago

Will be solved by #1817