Boufcoulman closed 1 year ago
I could reproduce the same issue. The problem in fact comes from Nova on the compute nodes:
ERROR nova.virt.libvirt.driver Failed to start libvirt guest: libvirt.libvirtError: unable to open '/sys/fs/cgroup/machine/qemu-2-instance-00000008.libvirt-qemu/': No such file or directory
In my case I'm using debian11-min as my base G5K image, so the kernel/systemd is running cgroup v2, and I think that's part of the problem. Which image are you using?
So, using debian10-min on the G5K nodes still works fine.
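For anyone who wants to check which cgroup hierarchy a node is running before deploying, here is a minimal sketch. It relies on the fact that a unified (v2) hierarchy exposes a cgroup.controllers file at its root. The function name cgroup_version and the default mount point /sys/fs/cgroup are assumptions made for illustration, not something enos or Kolla-ansible provides.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 for a unified (v2) hierarchy, 1 for a legacy or hybrid one,
    and None if the directory does not exist.

    `root` is assumed to be the cgroup mount point; adjust it if the node
    mounts cgroups somewhere else.
    """
    root = Path(root)
    if not root.is_dir():
        return None
    # On cgroup v2 the root of the unified hierarchy contains cgroup.controllers.
    if (root / "cgroup.controllers").is_file():
        return 2
    return 1


if __name__ == "__main__":
    print("cgroup version:", cgroup_version())
```

Run on a compute node, this should help tell apart a debian10-min style setup from a debian11-min one where libvirt cannot find the path it expects under /sys/fs/cgroup.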
Debian bullseye as host is only officially supported by Kolla-ansible starting with version 12:
We are currently using Kolla-ansible version 10. I tried to manually configure the cgroupns option in Docker with this version, but I couldn't make it work.
So, the next step is updating to a newer kolla-ansible! It didn't work out of the box, so I will debug it more in the coming weeks.
Hello,
I think I was already using debian10-min, since I did not specify otherwise in my reservation file.
I retried enos deploy today and it worked. I figured out that the issue I was facing occurred when creating an instance on the default public network.
If I first create it on the default private network and manually allocate a public IP, it works!
Sorry for the inconvenience.
Ah, yes, creating ports directly on the public network is not supported! It can only be used for floating IPs.
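The working sequence (boot on the private network, then attach a floating IP from the public one) can be sketched as a list of openstack CLI calls. This is only an illustration: the helper name floating_ip_workflow, the network names "private" and "public", and the FLOATING_IP placeholder are assumptions, and the image and flavor in the usage line are hypothetical values to replace with whatever exists in your deployment.

```python
import shlex


def floating_ip_workflow(server, image, flavor,
                         private_net="private", public_net="public",
                         floating_ip="FLOATING_IP"):
    """Build (without running) the openstack CLI calls for the workflow.

    `floating_ip` is a placeholder: the real address is only known after
    the second command has allocated it.
    """
    return [
        # 1. Boot the instance on the private network, not on the public one.
        ["openstack", "server", "create", "--image", image,
         "--flavor", flavor, "--network", private_net, server],
        # 2. Allocate a floating IP from the public network.
        ["openstack", "floating", "ip", "create", public_net],
        # 3. Attach the allocated address to the instance.
        ["openstack", "server", "add", "floating", "ip", server, floating_ip],
    ]


if __name__ == "__main__":
    # Hypothetical image and flavor names, for illustration only.
    for cmd in floating_ip_workflow("test", "cirros.uec", "m1.tiny"):
        print(shlex.join(cmd))
```

The commands are built as argument lists rather than executed, so they can be reviewed first and then passed to subprocess.run once the placeholder has been replaced with the allocated address.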
Hi,
I'm using enos again after a few months without it. When I deploy (tested both with no kolla-ansible version pinned and with the train-eol tag in the reservation.yml file), the deployment goes well, but I can't properly create an instance. When creating an instance, I get this error in the UI:
Error: Failed to perform requested operation on instance "test", the instance has an error status: Please try again later [Error: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 3828f1c3-d7dd-4309-9119-5f3eecce826f.].
Out of curiosity, I checked every "nova" Docker container running on my nodes, and with docker logs nova_compute
I get: and more errors in any of the other "neutron" or "nova" Docker containers.
Inside a container, I opened /var/log/kolla/neutron/neutron-server.log, and the following error appears several times:
ERROR neutron.plugins.ml2.managers [req-b8ad0917-7047-426a-8e39-06896734684b 47697201dac64047a19f0f2eafd75259 b4036f17699945f7b8ab107fb88fe77b - default default] Failed to bind port 022879f1-3cc3-482f-b144-574f38d2b4f0 on host paravance-13-kavlan-4.rennes.grid5000.fr for vnic_type normal using segments [{'id': '88d45cfd-8971-4d1c-88f2-1054a819eb6f', 'network_type': 'flat', 'physical_network': 'physnet1', 'segmentation_id': None, 'network_id': 'c35dbc4d-5ba7-4c5b-bb00-317edf08cdd6'}]
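Since this line is long and repeats many times, a small parser can make it easier to see which port, host and network segment are involved. The sketch below is written against the message format shown above only; the regular expression and the function name parse_bind_failure are assumptions for illustration and are not part of Neutron.

```python
import ast
import re

# Matches the "Failed to bind port ..." message format seen in the log above.
BIND_FAILURE = re.compile(
    r"Failed to bind port (?P<port>[0-9a-f-]+) on host (?P<host>\S+) "
    r"for vnic_type (?P<vnic_type>\S+) using segments (?P<segments>\[.*\])"
)


def parse_bind_failure(line):
    """Return a dict with port, host, vnic_type and segments, or None
    if the line is not a port binding failure."""
    match = BIND_FAILURE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # The segments are logged as a Python literal (list of dicts).
    info["segments"] = ast.literal_eval(info["segments"])
    return info


if __name__ == "__main__":
    import sys

    # Usage: python parse_bind.py < neutron-server.log
    for raw in sys.stdin:
        parsed = parse_bind_failure(raw)
        if parsed:
            seg = parsed["segments"][0]
            print(parsed["host"], parsed["port"],
                  seg["network_type"], seg["physical_network"])
```

For the line above, this reports a flat segment on physnet1, which helps check which network the failing port was created on.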
For every deployment, I'm using a clean python3 virtualenv with pip and enos up to date.
Is anyone still able to make it work?