Open praiskup opened 9 months ago
This may happen in two situations:
ps aux
-> I'm not sure why/how this can happen.
A similar thing happens from time to time when deleting OpenStack instances, after the following (I'm not 100% sure this is what triggers the problem):
Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/resalloc_openstack/helpers.py", line 74, in best_effort_delete
self.delete()
File "/usr/lib/python3.12/site-packages/resalloc_openstack/helpers.py", line 184, in delete
self.nova_o.detach()
File "/usr/lib/python3.12/site-packages/cinderclient/v3/volumes_base.py", line 69, in detach
return self.manager.detach(self)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/cinderclient/v3/volumes_base.py", line 285, in detach
return self._action('os-detach', volume,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/cinderclient/v3/volumes_base.py", line 257, in _action
resp, body = self.api.client.post(url, body=body)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/cinderclient/client.py", line 223, in post
return self._cs_request(url, 'POST', **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/cinderclient/client.py", line 211, in _cs_request
return self.request(url, method, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/cinderclient/client.py", line 197, in request
raise exceptions.from_response(resp, body)
cinderclient.exceptions.ClientException: The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-1e419934-999b-4256-a07e-a6d5e369b9c5)
failed to delete in #1 attempt
No, that would be different. I'm not sure what happened; starting the instance that is now in DELETING state failed:
+ ansible-playbook init.yml -i 10.0.150.201,
ERROR! the playbook: init.yml could not be found
running cleanup
cleaning 05_copr_vm_production_psi_os_00544952_20240229_172705_1
cleaning 10_server
deleting server 9e95963e-642b-41f3-b771-82411eba2386
Traceback (most recent call last):
File "/usr/bin/resalloc-openstack-new", line 22, in <module>
main()
File "/usr/lib/python3.12/site-packages/resalloc_openstack/new/main.py", line 131, in main
check_call(args.command, env=env, shell=True, stdin=DEVNULL)
File "/usr/lib64/python3.12/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -x ; ansible-playbook init.yml -i "$RESALLOC_OS_IP," >&2 </dev/null' returned non-zero exit status 1.
... it probably stayed in STARTING because of this bug. Then I restarted resalloc, and it has stayed in DELETING state since:
=== /var/log/resallocserver/hooks/544952_terminate ===
initializing <class 'resalloc_openstack.helpers.Server'>
vm copr_vm_production_psi_os_00544952_20240229_172705 not found
initializing <class 'resalloc_openstack.helpers.Server'>
vm copr_vm_production_psi_os_00544952_20240229_172705 not found
initializing <class 'resalloc_openstack.helpers.Server'>
vm copr_vm_production_psi_os_00544952_20240229_172705 not found
These machines have been in STARTING for multiple days. The fact that the allocator failed should be detected.
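To illustrate the kind of detection meant here, a minimal sketch follows. It is not resalloc's actual code: `start_resource` and the `UP`/`FAILED` state names are hypothetical, and it only assumes that the allocation command exits non-zero on failure, as the `CalledProcessError` above suggests.

```python
import subprocess


def start_resource(command):
    """Run a (hypothetical) allocation command and map its exit status to a state.

    Returns a (state, output) tuple. "UP" and "FAILED" are illustrative names,
    not necessarily the states resalloc uses.
    """
    result = subprocess.run(
        command,
        shell=True,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # The allocator failed: report it instead of leaving the resource
        # in STARTING forever.
        return "FAILED", result.stderr
    return "UP", result.stdout
```

With something along these lines, a failed `ansible-playbook` run would move the resource out of STARTING right away, and the captured stderr could be stored next to the hook logs.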