Open mkutouski opened 10 months ago
I would like to add a few things I noticed.
We started seeing this behaviour (multiple libvirtd processes) after upgrading OpenNebula from 6.6.x to 6.8.x. The behaviour described here seems to be the correct out-of-the-box experience when using libvirtd with systemd on Ubuntu 22.04 and 24.04 and accessing libvirt as oneadmin, or as any other user for that matter.
If you access libvirt as oneadmin, or as any other user, a libvirtd process is spawned for that user. Below is an example of what you see when userX runs a simple virsh list:
host4:~# ps uax|grep libvirtd
root 4007562 5.8 0.0 6575520 46688 ? Ssl jun11 3818:44 /usr/sbin/libvirtd
host4:~# sudo -u userX virsh list --all
Id Name State
--------------------
root@host4:~# ps uax|grep libvirtd
userX 2697238 49.0 0.0 1547440 27648 ? Sl 10:47 0:00 /usr/sbin/libvirtd --timeout=120
root 4007562 5.8 0.0 6575520 46688 ? Ssl jun11 3818:44 /usr/sbin/libvirtd
Though I have not dug deep into libvirt on Ubuntu to see whether this is the intended behaviour, it is what you get out of the box on 22.04 and 24.04; I have not checked older versions. A small part of the config file /etc/default/libvirtd, which is installed by the package libvirt-daemon-system, seems to indicate it is expected behaviour:
# The default upstream behavior is for libvirtd.service to
# start on boot, perform VM autostart and shutdown again if
# nothing was started; later on, systemd socket activation
# is used to start it again when some client app connects.
To figure out why we are seeing 'restarts' at a 10-minute interval, the better question is who or what is accessing libvirt every 10 minutes as user oneadmin. That quickly leads to the SYSTEM_HOST interval of the monitor probe. When I lower the SYSTEM_HOST interval in the monitord config from 600 to less than 120, the process keeps running, because the default timeout of 120 seconds is never reached.
I then guessed that before 6.8.x sudo was always used in the probes, and that since 6.8.x a command is run somewhere without sudo.
Looking into the relation between the timeout and the SYSTEM_HOST interval, I suspected the cause to be 'im/kvm-probes.d/host/system/cpu_features.sh', which contains:
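The timing relation can be sketched as a small shell check. This is only an illustration: the function name session_daemon_survives is hypothetical, 120 is the --timeout=120 value visible in the ps output above, and 600 is the SYSTEM_HOST interval mentioned here.

```shell
#!/bin/sh
# Hypothetical helper: does a per-user libvirtd started with --timeout=N
# stay alive between two probe runs that are INTERVAL seconds apart?
# The daemon exits after TIMEOUT idle seconds, so it only survives if
# the next client connects before that.
session_daemon_survives() {
    interval=$1
    timeout=${2:-120}   # 120 matches "--timeout=120" in the ps output
    if [ "$interval" -lt "$timeout" ]; then
        echo "stays running"
    else
        echo "exits and is respawned"
    fi
}

session_daemon_survives 600   # SYSTEM_HOST interval of 600 seconds
session_daemon_survives 100   # an interval below the 120 second timeout
```

With an interval of 600 the per-user daemon has long exited by the time the probe runs again, which would match the 'restart' seen every 10 minutes.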
FEATURES=$(virsh capabilities | grep '<feature name' | sed -e "s/^.*='//;s/'\/>$//" | xargs | tr ' ' ',')
This script is not present in 6.6. If anyone wants to get rid of the multiple libvirtd processes, you can add a sudo entry for the command on the hypervisor and update the command in the probe to make use of it. In the end it doesn't really seem to be an issue, just a change in behaviour between 6.6 and 6.8 that got noticed because the oneadmin user now runs a virsh command. The moment you run anything as userX or userY you will get extra libvirtd processes and see the same thing.
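A rough sketch of that workaround, not a tested change to the probe: the path /usr/bin/virsh, the sudoers file name and the function name features_from_xml are assumptions made for illustration. The text pipeline itself is the one from cpu_features.sh, wrapped in a function so it can be tried without libvirt.

```shell
#!/bin/sh
# Sudoers entry on the hypervisor (assumption: virsh lives at /usr/bin/virsh;
# the file name is hypothetical). Edit it with:
#   visudo -f /etc/sudoers.d/oneadmin-virsh
# and add:
#   oneadmin ALL=(root) NOPASSWD: /usr/bin/virsh capabilities

# Same pipeline as in cpu_features.sh, reading the capabilities XML
# from stdin instead of calling virsh directly.
features_from_xml() {
    grep '<feature name' | sed -e "s/^.*='//;s/'\/>$//" | xargs | tr ' ' ','
}

# In the probe, the only change would be the sudo prefix
# (-n makes sudo fail instead of prompting if the entry is missing):
#   FEATURES=$(sudo -n /usr/bin/virsh capabilities | features_from_xml)

# Try the pipeline on a hand-written sample, not real virsh output:
printf "%s\n" "<cpu>" "  <feature name='vmx'/>" "  <feature name='ssse3'/>" "</cpu>" \
    | features_from_xml
```

For the sample input this prints vmx,ssse3. Because the command then runs as root, no libvirtd should be spawned for oneadmin by this probe.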
Hi team
We had the same issue with libvirtd 8.0.0 (8.0.0-1ubuntu7.6). There is a new build of the package available in the Ubuntu 22.04 repos (8.0.0-1ubuntu7.10), and apparently its release notes mention a fix for this issue. We will watch the cluster where we had to upgrade libvirtd and share any information if the issue comes back.
Add a note to the known issues
Description libvirtd on the hypervisor restarts every 10 minutes under the user 'oneadmin', while there is already a process running under the root user.
When the process is started as 'oneadmin', the following message also appears in the syslog.
To Reproduce Install the latest 6.8.x OpenNebula version (e.g. via minione) and check the system logs on the hypervisor node for error messages like the one above.
Expected behavior No error messages like the ones listed in this issue should appear.
Details
Additional context https://forum.opennebula.io/t/libvirtd-starts-in-cycles-of-10-minutes/11733