osism / issues

This repository is used for bug reports that are cross-project or not bound to a specific repository (or to an unknown repository).
https://www.osism.tech

nova_compute does not start after re-deploying the compute node with a new nova datadir configuration #830

Open Nils98Ar opened 10 months ago

Nils98Ar commented 10 months ago

Probably the same as in https://bugs.launchpad.net/kolla-ansible/+bug/2051011?

Replacing the nova Docker volume with a bind mount of the local SSD via nova_instance_datadir_volume seems to cause the problem. After the change, the compute_id file that was present in the Docker volume is not created again at the new location, and nova_compute refuses to start.

Any idea or maybe a workaround?

2024-01-26 13:15:28.634 7 ERROR oslo_service.service [None req-64a700a1-9477-4241-86dc-16c90baf3a5a - - - - - -] Error starting thread.: nova.exception.InvalidConfiguration: No local node identity found, but this is not our first startup on this host. Refusing to start after potentially having lost that state!
2024-01-26 13:15:28.634 7 ERROR oslo_service.service Traceback (most recent call last):
2024-01-26 13:15:28.634 7 ERROR oslo_service.service   File "/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_service/service.py", line 806, in run_service
2024-01-26 13:15:28.634 7 ERROR oslo_service.service     service.start()
2024-01-26 13:15:28.634 7 ERROR oslo_service.service   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/service.py", line 162, in start
2024-01-26 13:15:28.634 7 ERROR oslo_service.service     self.manager.init_host(self.service_ref)
2024-01-26 13:15:28.634 7 ERROR oslo_service.service   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 1578, in init_host
2024-01-26 13:15:28.634 7 ERROR oslo_service.service     self._ensure_existing_node_identity(service_ref)
2024-01-26 13:15:28.634 7 ERROR oslo_service.service   File "/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", line 1521, in _ensure_existing_node_identity
2024-01-26 13:15:28.634 7 ERROR oslo_service.service     raise exception.InvalidConfiguration(
2024-01-26 13:15:28.634 7 ERROR oslo_service.service nova.exception.InvalidConfiguration: No local node identity found, but this is not our first startup on this host. Refusing to start after potentially having lost that state!

Nils98Ar commented 10 months ago

Maybe creating the file manually inside the nova_compute container could work?

berendt commented 10 months ago

Looks like this issue, yes.

Nils98Ar commented 10 months ago

Creating the file manually on the compute node, using the output of openstack hypervisor show <compute01 FQDN> -c id -f value as <ID>, seems to work, but I am not sure whether it is a good idea:

dragon@compute01:~$ docker exec -itu root nova_compute bash -c "echo -n '<ID>' > /var/lib/nova/compute_id; chown nova:nova /var/lib/nova/compute_id; chmod 644 /var/lib/nova/compute_id"
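
Before writing the file, it may be worth checking that the value really is a UUID, since the hypervisor ID can come back as an integer depending on the compute API microversion the client uses. A minimal sketch, assuming nova expects compute_id to hold the compute node UUID; the function name is_uuid is made up for illustration:

```shell
# Hypothetical helper, not part of nova or kolla-ansible.
# Succeeds if the argument matches the 8-4-4-4-12 hex layout of a UUID
# (intended for single-line values).
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Example: check the value before using it as <ID> in the command above.
# id="$(openstack hypervisor show <compute01 FQDN> -c id -f value)"
# is_uuid "$id" || echo "not a UUID: $id" >&2
```

If the check fails, the value is probably not what nova_compute wants to find in the file, so writing it would likely not help.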

Nils98Ar commented 10 months ago

This also works if the old Docker volume is still there:

dragon@compute01:~$ sudo mv /var/lib/docker/volumes/nova_compute/_data/compute_id <new datadir>/
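
A slightly more defensive variant of this move could copy the file instead, so the old volume stays intact as a fallback, and refuse to overwrite an identity that already exists in the new datadir. This is only a sketch; the function name restore_compute_id and its two arguments are hypothetical:

```shell
# Hypothetical helper: copy compute_id from the old datadir to the new one.
# Usage: restore_compute_id <old datadir> <new datadir>
restore_compute_id() {
    old_dir=$1
    new_dir=$2

    # Never overwrite an identity that already exists at the new location.
    if [ -e "$new_dir/compute_id" ]; then
        echo "compute_id already present in $new_dir" >&2
        return 0
    fi

    # Without the old file there is nothing to restore.
    if [ ! -f "$old_dir/compute_id" ]; then
        echo "no compute_id found in $old_dir" >&2
        return 1
    fi

    # -p preserves mode and timestamps (and ownership when run as root).
    cp -p "$old_dir/compute_id" "$new_dir/compute_id"
}
```

Run as root, this would be called as restore_compute_id /var/lib/docker/volumes/nova_compute/_data <new datadir> before starting nova_compute again.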