Hello, I am also testing a local VM with the 3.2-rc1 release of the SELinux libraries and tools, and this VM boots fine with systemd-selinux 247.3 and a policy built from selinux-refpolicy-git. To check whether the policy was at fault, I reconfigured it to use selinux-refpolicy-arch 20200818-1, and it still booted fine.
In dmesg, I also see the same warnings (because the policy needs to be upgraded):
[ 1.099783] SELinux: Permission perfmon in class capability2 not defined in policy.
[ 1.099785] SELinux: Permission bpf in class capability2 not defined in policy.
[ 1.099786] SELinux: Permission checkpoint_restore in class capability2 not defined in policy.
[ 1.099793] SELinux: Permission perfmon in class cap2_userns not defined in policy.
[ 1.099794] SELinux: Permission bpf in class cap2_userns not defined in policy.
[ 1.099795] SELinux: Permission checkpoint_restore in class cap2_userns not defined in policy.
[ 1.099828] SELinux: Class lockdown not defined in policy.
[ 1.099829] SELinux: the above unknown classes and permissions will be denied
[ 1.103115] SELinux: policy capability network_peer_controls=1
[ 1.103116] SELinux: policy capability open_perms=1
[ 1.103117] SELinux: policy capability extended_socket_class=1
[ 1.103117] SELinux: policy capability always_check_network=0
[ 1.103118] SELinux: policy capability cgroup_seclabel=1
[ 1.103119] SELinux: policy capability nnp_nosuid_transition=1
[ 1.103119] SELinux: policy capability genfs_seclabel_symlinks=0
[ 1.163554] audit: type=1403 audit(1612693619.279:2): auid=4294967295 ses=4294967295 lsm=selinux res=1
[ 1.168936] systemd[1]: Successfully loaded SELinux policy in 153.559ms.
But these errors do not seem to be fatal.
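To see at a glance which permissions and classes the loaded policy is missing, the warnings quoted above can be pulled out of the dmesg text. This is only a minimal Python sketch that assumes the exact message wording shown in this thread; the function name missing_policy_items is made up for illustration.

```python
import re

# Patterns for the two kinds of kernel warnings quoted in this thread.
PERM_RE = re.compile(
    r"SELinux:\s+Permission (\S+) in class (\S+) not defined in policy"
)
CLASS_RE = re.compile(r"SELinux:\s+Class (\S+) not defined in policy")


def missing_policy_items(dmesg_text):
    """Return (missing_permissions, missing_classes) found in dmesg output.

    missing_permissions is a list of (class, permission) tuples and
    missing_classes is a list of class names, both in order of appearance.
    """
    perms = [(m.group(2), m.group(1)) for m in PERM_RE.finditer(dmesg_text)]
    classes = [m.group(1) for m in CLASS_RE.finditer(dmesg_text)]
    return perms, classes


if __name__ == "__main__":
    sample = (
        "[ 1.099783] SELinux: Permission perfmon in class capability2 "
        "not defined in policy.\n"
        "[ 1.099828] SELinux: Class lockdown not defined in policy.\n"
    )
    print(missing_policy_items(sample))
```

The text passed in could come from running dmesg and capturing its output; the sketch deliberately works on a string so it can be tried on a saved log as well.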
When using the QCOW image downloaded from the GitHub Action artifacts, the main error messages are:
[ 8.583087] systemd[1]: systemd-coredump.socket: Failed to determine SELinux label: Invalid argument
[ 8.586030] systemd[1]: Failed to listen on Process Core Dump Socket.
[FAILED] Failed to listen on Process Core Dump Socket.
See 'systemctl status systemd-coredump.socket' for details.
[ 8.590747] systemd[1]: systemd-journald-audit.socket: Failed to determine SELinux label: Invalid argument
[ 8.592029] systemd[1]: Failed to listen on Journal Audit Socket.
[FAILED] Failed to listen on Journal Audit Socket.
See 'systemctl status systemd-journald-audit.socket' for details.
[ 8.594528] systemd[1]: systemd-journald-dev-log.socket: Failed to determine SELinux label: Invalid argument
[ 8.595909] systemd[1]: Failed to listen on Journal Socket (/dev/log).
[FAILED] Failed to listen on Journal Socket (/dev/log).
See 'systemctl status systemd-journald-dev-log.socket' for details.
[ 8.598028] systemd[1]: systemd-journald.socket: Failed to determine SELinux label: Invalid argument
[ 8.599316] systemd[1]: Failed to listen on Journal Socket.
[FAILED] Failed to listen on Journal Socket.
See 'systemctl status systemd-journald.socket' for details.
[DEPEND] Dependency failed for Journal Service.
[DEPEND] Dependency failed for Flus…Journal to Persistent Storage.
[ 8.604324] systemd[1]: systemd-networkd.socket: Failed to determine SELinux label: Invalid argument
[ 8.605637] systemd[1]: Failed to listen on Network Service Netlink Socket.
[FAILED] Failed to listen on Network Service Netlink Socket.
See 'systemctl status systemd-networkd.socket' for details.
...
[ 13.773757] systemd[1]: dbus.socket: Failed to determine SELinux label: Invalid argument
[ 13.775100] systemd[1]: Failed to listen on D-Bus System Message Bus Socket.
I do not know (yet) what causes this, but an Arch Linux system without D-Bus is a broken one :'(
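Since the journal is not available in this state, it can help to list the affected units directly from the console or dmesg text. A small Python sketch, assuming the "Failed to determine SELinux label" wording shown in the log above (the helper name is hypothetical):

```python
import re

# Matches lines such as:
#   systemd[1]: dbus.socket: Failed to determine SELinux label: Invalid argument
LABEL_FAIL_RE = re.compile(
    r"systemd\[1\]: (\S+): Failed to determine SELinux label"
)


def units_with_label_failures(log_text):
    """Return the unit names that failed SELinux labelling, without duplicates."""
    units = []
    for match in LABEL_FAIL_RE.finditer(log_text):
        unit = match.group(1)
        if unit not in units:
            units.append(unit)
    return units


if __name__ == "__main__":
    sample = (
        "[ 8.583087] systemd[1]: systemd-coredump.socket: "
        "Failed to determine SELinux label: Invalid argument\n"
        "[ 13.773757] systemd[1]: dbus.socket: "
        "Failed to determine SELinux label: Invalid argument\n"
    )
    print(units_with_label_failures(sample))
```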
By the way, the system is not completely broken: running qemu-system-x86_64 archselinux.qcow2 -net nic -net user,hostfwd=tcp::10022-:22 -m 2048 (without -nographic) "works", in the sense that I can log in as root. D-Bus is still broken and there is no journal (logs are in dmesg...), but this is better than nothing for debugging the issue.
Disabling SELinux from the kernel command line lets the system boot, so something that SELinux and systemd do together does not work. And you are right, the errors I picked up are not fatal.
It looks like none of the sockets are found, and there are some other failures in there too. Here is a complete startup log with all the logs I could enable: SELinux_systemd_debug.log
It seems to be a kernel issue: downgrading to linux 5.10.6-1 and rebooting fixes the issue in the VM.
In the "buggy VM", cat /proc/self/attr/current does not work and returns Invalid argument. This is likely a side effect of recent changes in Arch Linux's kernel (such as https://github.com/archlinux/svntogit-packages/commit/69cb8c2d2884181e799e67b09d67fcf7944d8408).
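The same check as cat /proc/self/attr/current can be scripted, which is handy when testing several kernels in a row. A minimal Python sketch; the function name and the (context, error) return shape are illustrative choices, not something from this thread:

```python
import errno


def read_selinux_context(path="/proc/self/attr/current"):
    """Try to read the current security context.

    Returns (context, None) on success, or (None, error_name) when opening
    or reading fails, e.g. (None, "EINVAL") for "Invalid argument".
    """
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    # The kernel may terminate the context with a NUL byte or a newline.
    return raw.rstrip(b"\x00\n").decode("utf-8", "replace"), None


if __name__ == "__main__":
    context, error = read_selinux_context()
    if error is None:
        print("current context:", context)
    else:
        print("could not read context:", error)
```

On the "buggy VM" described above, this would be expected to report EINVAL, matching the Invalid argument message from cat.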
I downgraded a bare-metal testing laptop's kernel to 5.10.6-1, and it indeed works. As SELinux 3.2 has no issues, this issue should go away as soon as we have that version available. I'll see if I can put together the rc2 packages.
Using packages from the Arch Linux Archive, I found that:
linux package versions 5.10.6.arch1-1, 5.10.9.arch1-1, 5.10.10.arch1-1, 5.10.11.arch1-1 and 5.10.12.arch1-1 work fine;
linux package versions 5.10.13.arch1-1 and 5.10.13.arch1-2 do not work (with cat /proc/self/attr/current displaying cat: /proc/self/attr/current: Invalid argument).
So it is definitely a regression in the linux package, and https://github.com/archlinux/svntogit-packages/commit/69cb8c2d2884181e799e67b09d67fcf7944d8408 seems very suspicious. Maybe the new CONFIG_LSM="lockdown,yama,bpf" conflicts with SELinux, and this could be overridden on the command line.
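To check what a given kernel was built with, the CONFIG_LSM value can be read from the kernel configuration. The Python sketch below only parses configuration text; where that text comes from (for example /proc/config.gz, if the kernel exposes it) is an assumption, and the function name is hypothetical.

```python
def config_lsm(config_text):
    """Return the list of LSM names in CONFIG_LSM, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_LSM="):
            value = line.split("=", 1)[1].strip().strip('"')
            return [name for name in value.split(",") if name]
    return None


if __name__ == "__main__":
    sample = 'CONFIG_SECURITY_SELINUX=y\nCONFIG_LSM="lockdown,yama,bpf"\n'
    lsms = config_lsm(sample)
    print(lsms)
    print("selinux enabled by default:", lsms is not None and "selinux" in lsms)
```

With the value quoted above, selinux is not in the default list, which is consistent with the idea that it has to be requested explicitly at boot.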
Anyway I will not have more time to investigate this issue today, so feel free to continue searching for a fix or to open bug reports on Arch Linux's bug tracker.
I found it! I looked at what CONFIG_LSM does, and it is indeed the key here.
The security kernel command-line parameter has been deprecated, and lsm=selinux should be used instead!
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
This works on my VMs and my testing laptop. I'll correct the parameter in the workflow file. I think ArchWiki needs an update regarding this too!
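For anyone updating a bootloader entry by script, the change amounts to dropping any security= token and adding an lsm= one. A rough Python sketch, assuming the old command line used a security=... parameter and that no parameter value contains spaces (the helper name is invented for this example):

```python
def use_lsm_parameter(cmdline, lsms):
    """Return cmdline with security=/lsm= tokens replaced by one lsm= token.

    Assumption: parameters are separated by whitespace and none of them
    is quoted with embedded spaces.
    """
    kept = [
        token
        for token in cmdline.split()
        if not token.startswith(("security=", "lsm="))
    ]
    kept.append("lsm=" + ",".join(lsms))
    return " ".join(kept)


if __name__ == "__main__":
    old = "root=/dev/vda1 rw security=selinux selinux=1"
    print(use_lsm_parameter(old, ["selinux"]))
```

The list of modules is a parameter on purpose, so the same helper works with whichever lsm= value ends up being recommended.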
Great! Thanks for finding this!
I do not have a test system available right now, but I am wondering: should the lsm kernel parameter be lsm=selinux or lsm=selinux,lockdown,yama,bpf or something else (maybe in a different order)? What does cat /sys/kernel/security/lsm show on the test system?
And the Vagrant configuration (https://github.com/archlinuxhardened/selinux/blob/master/_vagrant/step1_install_and_configure.sh#L72-L88) and the wiki page (https://wiki.archlinux.org/index.php/SELinux) will also need to be updated accordingly. I can do this, probably in 2-3 days.
cat /sys/kernel/security/lsm shows capability,selinux now. If I understood it right, this is the order in which the LSMs are processed. On a regular Arch it shows capability,lockdown,yama,bpf.
On a test system, I changed the kernel parameter to lsm=selinux,lockdown,yama,bpf, the system boots fine, and cat /sys/kernel/security/lsm shows capability,selinux,lockdown,yama,bpf.
I'll go ahead and put these settings on to the testing VM.
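The outputs reported so far suggest that /sys/kernel/security/lsm shows capability first, followed by the modules in the order given to lsm=. The Python sketch below encodes that observation as a check; it is an assumption drawn from this thread only, and all function names are hypothetical.

```python
def parse_lsm_list(text):
    """Split a comma-separated LSM list, ignoring surrounding whitespace."""
    return [name for name in text.strip().split(",") if name]


def expected_active_lsms(lsm_param):
    """Assumption from this thread: capability first, then the lsm= order."""
    return ["capability"] + parse_lsm_list(lsm_param)


def matches_request(sysfs_text, lsm_param):
    """Check whether the sysfs contents match what lsm= asked for."""
    return parse_lsm_list(sysfs_text) == expected_active_lsms(lsm_param)


if __name__ == "__main__":
    # On a real system, sysfs_text would be read from /sys/kernel/security/lsm.
    sysfs_text = "capability,selinux,lockdown,yama,bpf\n"
    print(matches_request(sysfs_text, "selinux,lockdown,yama,bpf"))
```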
There is something strange in your parameter: using lsm=selinux,lockdown,yama,bpf contradicts the documentation (https://www.kernel.org/doc/html/v5.11-rc7/admin-guide/LSM/index.html):
A list of the active security modules can be found by reading /sys/kernel/security/lsm. This is a comma separated list, and will always include the capability module. The list reflects the order in which checks are made. The capability module will always be first, followed by any “minor” modules (e.g. Yama) and then the one “major” module (e.g. SELinux) if there is one configured.
I asked the selinux (https://lore.kernel.org/selinux/CAJfZ7=nKqT7mmE73r1K3YjBak=OmPACmDi5ccX=SzKhT9=vJ-g@mail.gmail.com/) and the LSM (https://lore.kernel.org/linux-security-module/CAJfZ7=nKqT7mmE73r1K3YjBak=OmPACmDi5ccX=SzKhT9=vJ-g@mail.gmail.com/) mailing lists about this, and in the meantime I will test whether lsm=lockdown,yama,bpf,selinux would work.
Test result:
lsm=lockdown,yama,bpf,selinux does not work (cat /proc/self/attr/current reports Invalid argument).
lsm=lockdown,yama,selinux,bpf works fine.
I prefer using lsm=lockdown,yama,selinux,bpf instead of lsm=selinux,lockdown,yama,bpf in order to stick more closely to the documentation.
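All the values tested in this thread are consistent with one empirical rule: the boot works when selinux appears before bpf in lsm= (or when bpf is absent). The Python sketch below checks only that rule. It is inferred from these few test results and is not a documented kernel guarantee; the function name is made up.

```python
def selinux_precedes_bpf(lsm_param):
    """Return True if selinux is listed and comes before bpf (if bpf is listed).

    Empirical rule inferred from the test results in this thread only.
    """
    names = [name for name in lsm_param.split(",") if name]
    if "selinux" not in names:
        return False
    if "bpf" not in names:
        return True
    return names.index("selinux") < names.index("bpf")


if __name__ == "__main__":
    for value in (
        "selinux",
        "selinux,lockdown,yama,bpf",
        "lockdown,yama,bpf,selinux",
        "lockdown,yama,selinux,bpf",
    ):
        print(value, selinux_precedes_bpf(value))
```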
I tested local builds because the GH Actions pipeline failed; here are some logs I managed to pull: