OpenNebula / one

The open source Cloud & Edge Computing Platform bringing real freedom to your Enterprise Cloud 🚀
http://opennebula.io
Apache License 2.0

Handle /sbin/ipset relocation in Debian 11 #5909

Closed gbonfiglio closed 7 months ago

gbonfiglio commented 2 years ago

Description In Debian 11, the ipset command has been moved from /sbin/ipset to /usr/sbin/ipset [1][2]. This results in VM start on KVM failing with the following error:

Fri Jul 8 12:40:13 2022: DEPLOY: fw: sudo: a password is required
ERROR: post: Command Error: sudo -n ipset list -name
ERROR: post: ["/var/tmp/one/vnm/command.rb:67:in `block in run!'", "/var/tmp/one/vnm/command.rb:63:in `each'", "/var/tmp/one/vnm/command.rb:63:in `run!'", "/var/tmp/one/vnm/security_groups_iptables.rb:277:in `info'", "/var/tmp/one/vnm/security_groups_iptables.rb:572:in `nic_deactivate'", "/var/tmp/one/vnm/sg_driver.rb:152:in `block in deactivate'", "/var/tmp/one/vnm/vnm_driver.rb:75:in `block in process'", "/var/tmp/one/vnm/vm.rb:73:in `block in each_nic'", "/var/tmp/one/vnm/vm.rb:72:in `each'", "/var/tmp/one/vnm/vm.rb:72:in `each_nic'", "/var/tmp/one/vnm/vnm_driver.rb:73:in `process'", "/var/tmp/one/vnm/sg_driver.rb:149:in `deactivate'", "/var/tmp/one/vnm/sg_driver.rb:77:in `activate'", "/var/tmp/one/vnm/fw/post:32:in `<main>'"]
ExitCode: 1

The root cause seems to be in /etc/sudoers.d/opennebula which in the following line expects ipset to be in the old location:

Cmnd_Alias ONE_NET = /sbin/ebtables, /sbin/iptables, /sbin/ip6tables, /sbin/ipset, /sbin/ip link *, /sbin/ip tuntap *, /sbin/ip route *, /sbin/ip neighbour *

Updating this line solves the issue.
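
A quick way to spot this kind of mismatch is to compare the paths named in a sudoers alias against what actually exists on the host. The following is only a minimal sketch: the helper names `alias_commands` and `missing_commands` and the shortened `SAMPLE` line are hypothetical, and the parsing ignores sudoers features such as line continuations and escaped commas.

```python
import os

# Hypothetical, shortened alias line used only for illustration.
SAMPLE = "Cmnd_Alias ONE_NET = /sbin/iptables, /sbin/ipset, /sbin/ip link *, /sbin/ip route *"

def alias_commands(line):
    """Return the executable path of every command spec in a Cmnd_Alias line."""
    # Everything after '=' is a comma-separated list of "path [args]" specs.
    _, _, rhs = line.partition("=")
    # The first whitespace-separated token of each spec is the executable.
    return [spec.split()[0] for spec in rhs.split(",") if spec.strip()]

def missing_commands(line, exists=os.path.exists):
    """Return the listed paths that do not exist, in order, without duplicates."""
    missing = []
    for path in alias_commands(line):
        if path not in missing and not exists(path):
            missing.append(path)
    return missing

if __name__ == "__main__":
    for path in missing_commands(SAMPLE):
        print("listed in sudoers but not found:", path)
```

On a host where `/sbin` is a real directory that no longer contains `ipset`, this would be expected to flag `/sbin/ipset`; where `/sbin` is a symlink to `/usr/sbin`, the path resolves and nothing is reported.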

To Reproduce [This happened to me after an upgrade from Debian 10 to Debian 11, not sure if it affects new installs.]

Expected behavior VMs start on Debian 11 without errors

Details

Additional context N/A

[1] https://packages.debian.org/buster/amd64/ipset/filelist [2] https://packages.debian.org/bullseye/amd64/ipset/filelist

jaimeibar commented 1 year ago

Hi,

I have the same problem after upgrading from Ubuntu 18 to Ubuntu 20.

rsmontero commented 1 year ago

THANKS for the heads up!

rsmontero commented 1 year ago

This looks related to the upgrade process itself. We cannot reproduce the issue on clean Debian 10, Debian 11, Ubuntu 20, or Ubuntu 22 installations.

Note that in Debian 11/10 /sbin is a symlink to /usr/sbin, and sudo does not complain about this. This is a Debian 11 environment from CI (testing ipset functionality):

Name: one-77-1-ip6-spoofing
Type: hash:ip
Revision: 4
Header: family inet6 hashsize 1024 maxelem 65536
Size in memory: 208
References: 1
Number of entries: 0
Members:

Name: one-77-1-102-i-nr-inet
Type: hash:net,port
Revision: 7
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 1088
References: 1
Number of entries: 10
Members:
192.168.160.144/30,tcp:8000
192.168.160.148/31,tcp:8000
192.168.160.100/30,tcp:8000
192.168.160.192/29,tcp:8001
192.168.160.150/31,tcp:8001
192.168.160.112/28,tcp:8000
192.168.160.128/28,tcp:8000
192.168.160.160/27,tcp:8001
192.168.160.152/29,tcp:8001
192.168.160.104/29,tcp:8000

* Root file system:

root@debian11-kvm-qcow2-6-7-ktu32-1:/# ls -l
total 4194368
lrwxrwxrwx 1 root root 7 Jan 24 04:23 bin -> usr/bin
drwxr-xr-x 4 root root 4096 Jan 29 21:02 boot
drwxr-xr-x 18 root root 3160 Jan 30 11:38 dev
drwxr-xr-x 91 root root 4096 Jan 29 21:03 etc
drwxr-xr-x 2 root root 4096 Dec 9 19:15 home
lrwxrwxrwx 1 root root 7 Jan 24 04:23 lib -> usr/lib
lrwxrwxrwx 1 root root 9 Jan 24 04:23 lib32 -> usr/lib32
lrwxrwxrwx 1 root root 9 Jan 24 04:23 lib64 -> usr/lib64
lrwxrwxrwx 1 root root 10 Jan 24 04:23 libx32 -> usr/libx32
drwx------ 2 root root 16384 Jan 24 04:23 lost+found
drwxr-xr-x 2 root root 4096 Jan 24 04:23 media
drwxr-xr-x 2 root root 4096 Jan 24 04:23 mnt
drwxr-xr-x 2 root root 4096 Jan 24 04:23 opt
dr-xr-xr-x 149 root root 0 Jan 30 11:38 proc
drwx------ 7 root root 4096 Jan 30 11:57 root
drwxr-xr-x 29 root root 880 Jan 30 11:58 run
lrwxrwxrwx 1 root root 8 Jan 24 04:23 sbin -> usr/sbin

So could this be related to the upgrade process itself, or to something specific to your installation?

For now we will treat this as "works for me" and will not modify the sudoers file, given the possible security implications.

gbonfiglio commented 1 year ago

Very interesting. I didn't dig deeper because the change in location for this binary, confirmed in the package repository itself, made sense to me. But now that you mention it, on a system upgraded to Debian 11 the symlinks are indeed missing:

root@g-lab:~# ls -lha /
total 101K
drwxr-xr-x  22 root root 4.0K Jan 24 14:45 .
drwxr-xr-x  22 root root 4.0K Jan 24 14:45 ..
-rw-------   1 root root  888 Mar 21  2021 .bash_history
drwxr-xr-x   2 root root 4.0K Dec 27 10:06 bin
drwxr-xr-x   5 root root 1.0K Jan 24 14:46 boot
drwxr-xr-x  19 root root 3.7K Jan 28 11:48 dev
drwxr-xr-x 109 root root  12K Jan 28 11:36 etc
drwxr-xr-x   3 root root 4.0K Apr 28  2018 home
lrwxrwxrwx   1 root root   31 Jan 24 14:45 initrd.img -> boot/initrd.img-5.10.0-21-amd64
lrwxrwxrwx   1 root root   31 Jan 24 14:45 initrd.img.old -> boot/initrd.img-5.10.0-20-amd64
drwxr-xr-x  19 root root 4.0K Jan 16 13:39 lib
drwxr-xr-x   2 root root 4.0K Nov 12 22:07 lib64
drwx------   2 root root  16K Apr 28  2018 lost+found
drwxr-xr-x   3 root root 4.0K Apr 28  2018 media
drwxr-xr-x   4 root root 4.0K Mar 20  2021 mnt
drwxr-xr-x   3 root root 4.0K Apr 28  2018 opt
dr-xr-xr-x 569 root root    0 Jan 28 11:47 proc
drwx------   9 root root 4.0K Jan 18 14:41 root
drwxr-xr-x  28 root root  900 Jan 30 12:26 run
drwxr-xr-x   2 root root  12K Jan 16 13:39 sbin
drwxr-xr-x   2 root root 4.0K Apr 28  2018 srv
dr-xr-xr-x  13 root root    0 Jan 28 11:47 sys
drwxrwxrwt  10 root root 4.0K Jan 30 12:24 tmp
drwxr-xr-x  11 root root 4.0K Jul  8  2022 usr
drwxr-xr-x  13 root root 4.0K May 21  2018 var
lrwxrwxrwx   1 root root   28 Jan 24 14:45 vmlinuz -> boot/vmlinuz-5.10.0-21-amd64
lrwxrwxrwx   1 root root   28 Jan 24 14:45 vmlinuz.old -> boot/vmlinuz-5.10.0-20-amd64
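
For anyone wanting to check their own hosts, the difference between a merged and an unmerged layout can be detected with a few lines of Python. This is only a rough sketch: the function name `unmerged_dirs` and the choice of which top-level directories to inspect are my own assumptions, not anything OpenNebula ships.

```python
import os

# Top-level directories that are symlinks into usr/ on a merged-/usr system.
TOP_LEVEL = ("bin", "sbin", "lib")

def unmerged_dirs(root="/"):
    """Return the names in TOP_LEVEL that are real directories under root
    rather than symlinks, i.e. the parts of the layout that are not merged."""
    result = []
    for name in TOP_LEVEL:
        path = os.path.join(root, name)
        # isdir() follows symlinks, so islink() is needed to tell a real
        # directory apart from a link such as "sbin -> usr/sbin".
        if os.path.isdir(path) and not os.path.islink(path):
            result.append(name)
    return result

if __name__ == "__main__":
    print("not merged into /usr:", unmerged_dirs("/"))
```

An empty list would suggest a merged layout like the CI image, while a layout like the one listed here would be expected to report `bin`, `sbin`, and `lib`.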

The idea of reinstalling this system from scratch makes me cry in pain, will surely wait for Debian 12 at least.

rsmontero commented 1 year ago

We think your solution is valid, and we will add an entry to the platform notes. Changing sudoers is always sensitive, and we want to minimize any side effects, especially in a maintenance release, where such a change could be introduced unnoticed.

rsmontero commented 1 year ago

Digging a little deeper, we found that the story behind this is a big one: https://lwn.net/Articles/890219/

gbonfiglio commented 1 year ago

Actually, thinking this through: your setup currently breaks upgrade use cases, so this should be considered a bug in OpenNebula (on the basis that one of your configuration files makes a wrong assumption about the location of an executable). My system is pretty vanilla (only OpenNebula lives on it), and the only reason for the "failure" is that it was upgraded from Debian 9 to Debian 11 through Debian 10. It failed hard after the last upgrade.

There seem to be two options to resolve here:

Can you please reconsider and keep this open as a bug?

gbonfiglio commented 1 year ago

Apparently Debian is planning to enforce the move with Debian 12 (see here). This means you could either state in your docs that installing usrmerge is the right thing to do, or take it as a dependency directly.

I still don't know how I feel about OpenNebula's configuration files relying on compatibility symlinks, though.

rsmontero commented 1 year ago

OK, we'll consider this for the next release.