netblue30 / firejail

Linux namespaces and seccomp-bpf sandbox
https://firejail.wordpress.com
GNU General Public License v2.0

Crashes with an AMD GPU with Mesa >= 19.3.4 and seccomp #3219

Closed creideiki closed 3 years ago

creideiki commented 4 years ago

For a short while now (I unfortunately don't have exact dates or versions for when it started, but I think it was with Firefox 72.0.2), Firefox has been hanging at startup under Firejail. This happens on two machines with AMD GPUs, but not on three others with Intel GPUs. All five systems are running up-to-date Gentoo Linux unstable.

Trying on a completely empty profile directory, Firefox gets a little bit through its startup:

Reading profile /etc/firejail/firefox.profile
Reading profile /etc/firejail/whitelist-usr-share-common.inc
Reading profile /etc/firejail/firefox-common.profile
Reading profile /etc/firejail/disable-common.inc
Reading profile /etc/firejail/disable-devel.inc
Reading profile /etc/firejail/disable-exec.inc
Reading profile /etc/firejail/disable-interpreters.inc
Reading profile /etc/firejail/disable-programs.inc
Reading profile /etc/firejail/whitelist-common.inc
Reading profile /etc/firejail/whitelist-var-common.inc
Warning: noroot option is not available
Parent pid 12102, child pid 12103
Warning: An abstract unix socket for session D-BUS might still be available. Use --net or remove unix from --protocol set.
Post-exec seccomp protector enabled
Seccomp list in: !chroot, check list: @default-keep, prelist: unknown,
Child process initialized in 117.95 ms
1581450106377   addons.webextension.doh-rollout@mozilla.org     WARN    Loading extension 'doh-rollout@mozilla.org': Reading manifest: Invalid extension permission: networkStatus

And then hangs. The process tree in the sandbox looks like this:

 ~ $ ps -A --forest -o pid,comm
  PID COMMAND
    1 firejail
    9 firefox
   57  \_ GPU Process

And all threads in the GPU process are hung:

 ~ # strace -f -p 12160
strace: Process 12160 attached with 3 threads
[pid 12163] futex(0x7f54e41feb78, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12162] epoll_wait(6,  <unfinished ...>
[pid 12160] restart_syscall(<... resuming interrupted read ...>^Cstrace: Process 12160 detached

As are the ones in the main Firefox process:

 ~ # strace -f -p 12112
strace: Process 12112 attached with 41 threads
[pid 12178] futex(0x7f6e45c41df8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12177] futex(0x7f6e45c41df8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12176] futex(0x7f6e45c41df8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12175] futex(0x7f6e45c41df8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12174] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12173] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12172] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12171] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12170] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12169] futex(0x7f6e45c416f0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12168] futex(0x7f6e3ed7e228, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12167] futex(0x7f6e3ed7e228, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12166] futex(0x7f6e3ed7e228, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12165] futex(0x7f6e3ed7e228, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12164] futex(0x7f6e3df29cf8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12159] futex(0x7f6e3f0f522c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12158] futex(0x7f6e3f0f5188, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12155] futex(0x7f6e3f0f41e8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12154] futex(0x7f6e3f0f4148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12153] futex(0x7f6e3f9f9e08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12152] futex(0x7f6e3f9f922c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12151] futex(0x7f6e3f9f8fa8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12150] futex(0x7f6e45b4f04c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12149] futex(0x7f6e3f9f8b48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12148] futex(0x7f6e45c8d90c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12147] futex(0x7f6e45d6e6d8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12129] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12128] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12127] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12126] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12125] futex(0x7f6e41a04648, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12124] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12123] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12122] futex(0x7f6e41a0464c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12121] futex(0x7f6e41856a70, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12119] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 12118] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 12117] futex(0x7f6e518560e0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 12116] epoll_wait(9,  <unfinished ...>
[pid 12115] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 12112] futex(0x7f6e3ea42140, FUTEX_WAIT_PRIVATE, 0, NULL^Cstrace: Process 12112 detached
 <detached ...>

I'm not going to be able to do any deeper debugging for the next couple of days, but if nobody else can reproduce it I'll start looking at older versions of Firefox and removing Firejail profile options this weekend.

firejail version 0.9.62

Compile time support:
        - AppArmor support is disabled
        - AppImage support is enabled
        - chroot support is enabled
        - file and directory whitelisting support is enabled
        - file transfer support is enabled
        - firetunnel support is disabled
        - networking support is enabled
        - overlayfs support is enabled
        - private-home support is enabled
        - seccomp-bpf support is enabled
        - user namespace support is enabled
        - X11 sandboxing support is disabled

sys-apps/firejail-0.9.62::gentoo was built with the following:
USE="-apparmor chroot -contrib -debug file-transfer globalcfg network overlayfs private-home seccomp suid -test userns -vim-syntax whitelist -x11" ABI_X86="(64)"
creideiki commented 4 years ago

firejail --noprofile firefox works.

creideiki commented 4 years ago

Firefox 73.0 hangs in the same way.

Vincent43 commented 4 years ago

Please try with various --ignore flags like firejail --ignore=seccomp firefox, firejail --ignore=nogroups firefox, firejail --ignore=nonewprivs firefox and so on. You may also try multiple --ignore at once.

leogx9r commented 4 years ago

> Please try with various --ignore flags like firejail --ignore=seccomp firefox, firejail --ignore=nogroups firefox, firejail --ignore=nonewprivs firefox and so on. You may also try multiple --ignore at once.

I've confirmed that disabling seccomp via firejail --ignore=seccomp firefox successfully restores the old (intended) behavior.

Edit: Tested this with v73.0

Vincent43 commented 4 years ago

@leogx9r did you have the same issue as OP, related to an AMD GPU on Gentoo?

leogx9r commented 4 years ago

> @leogx9r did you have the same issue as OP, related to an AMD GPU on Gentoo?

Same issue, different OS. I've experienced this on Arch Linux.

It works fine with NVIDIA GPUs, so I'd imagine it's a kernel bug or an updated package causing it, as it only started happening within the past week or two.

To add to this: with seccomp enabled, startup takes around 10-15 seconds on an SSD instead of just under 2 seconds, and GPU compositing fails, falling back to basic CPU rendering.

You can check this via about:support -> Graphics -> Features -> Compositing: "Basic" instead of using WebRender.

Ropid commented 4 years ago

This behavior showed up for me in Arch right now with Mesa 19.3.4. It works fine if I downgrade Mesa packages to 19.3.3, so I'm thinking the problem is related to a change in Mesa 19.3.4.

creideiki commented 4 years ago

On 2020-02-14 12:57, Ropid wrote:

> This behavior showed up for me in Arch right now with Mesa 19.3.4. It works fine if I downgrade Mesa packages to 19.3.3, so I'm thinking the problem is related to a change in Mesa 19.3.4.

Good catch! I don't have one of my failing systems with me at the moment, but glancing at the Mesa Git repo between 19.3.3 and 19.3.4 shows me https://gitlab.freedesktop.org/mesa/mesa/commit/ed271a9c2f40f8ec881bf3e4568d35dbfcd9cf70 which introduced a call to kcmp, which looks to be blocked by the default seccomp rules:

$ firejail --seccomp.print=4012
[...]
  002d: 15 1b 00 00000138   jeq kcmp 0049 (false 002e)
[...]
  0049: 06 00 01 00000000   ret KILL

Does it work on Mesa 19.3.4 if you start Firefox with firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' firefox?

rusty-snake commented 4 years ago

Yes, it is blocked: https://github.com/netblue30/firejail/blob/master/etc/templates/syscalls.txt#L36

> if you start Firefox with firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' firefox?

I don't think that this works, firejail '--seccomp=!kcmp' firefox should be enough to add the exception.

creideiki commented 4 years ago

On 2020-02-14 13:47, rusty-snake wrote:

> Yes, it is blocked: https://github.com/netblue30/firejail/blob/master/etc/templates/syscalls.txt#L36
>
> > if you start Firefox with firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' firefox?
>
> I don't think that this works, firejail '--seccomp=!kcmp' firefox should be enough to add the exception.

I tried that, but that still uses the seccomp rules from the profile:

$ firejail '--seccomp=!kcmp' --profile=firefox bash
Reading profile /etc/firejail/firefox.profile
[...]
Seccomp list in: !chroot, check list: @default-keep, prelist: unknown,

And firejail --seccomp.print=$PID claims it's still blocked.

Compare with this:

$ firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' --profile=firefox bash
[...]
Seccomp list in: !kcmp,!chroot, check list: @default-keep, prelist: unknown,unknown,
$ firejail --seccomp.print=10931
[...]
  0007: 15 00 01 00000138   jeq kcmp 0008 (false 0009)
  0008: 06 00 00 7fff0000   ret ALLOW
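
For anyone who wants to check such dumps without reading them by eye, here is a minimal sketch (not part of firejail). It assumes the `--seccomp.print` output keeps the `jeq <name> <target>` / `ret <action>` layout shown in the two excerpts above; `action_for` is a hypothetical helper name, and the two sample strings are copied from this thread.

```python
import re

# Matches e.g. "002d: 15 1b 00 00000138   jeq kcmp 0049 (false 002e)"
JEQ = re.compile(
    r"^\s*([0-9a-f]{4}):.*\bjeq (\S+) ([0-9a-f]{4}) \(false [0-9a-f]{4}\)"
)
# Matches e.g. "0049: 06 00 01 00000000   ret KILL"
RET = re.compile(r"^\s*([0-9a-f]{4}):.*\bret (\S+)")


def action_for(dump, syscall):
    """Return the 'ret' action reached when `syscall` matches, or None."""
    returns = {}   # instruction address -> action name
    target = None  # address the filter jumps to when the syscall matches
    for line in dump.splitlines():
        m = JEQ.match(line)
        if m:
            if m.group(2) == syscall and target is None:
                target = m.group(3)
            continue
        m = RET.match(line)
        if m:
            returns[m.group(1)] = m.group(2)
    return returns.get(target)


# Excerpts quoted earlier in this thread.
BLOCKED = """\
  002d: 15 1b 00 00000138   jeq kcmp 0049 (false 002e)
[...]
  0049: 06 00 01 00000000   ret KILL
"""
ALLOWED = """\
  0007: 15 00 01 00000138   jeq kcmp 0008 (false 0009)
  0008: 06 00 00 7fff0000   ret ALLOW
"""

print(action_for(BLOCKED, "kcmp"))  # KILL
print(action_for(ALLOWED, "kcmp"))  # ALLOW
```

It only follows the first `jeq` for the given name and returns `None` if the jump target is not a `ret` line in the text it was given, so feed it the full dump rather than an excerpt when checking a real sandbox.
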
leogx9r commented 4 years ago

> firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' firefox?

This indeed fixes the issue with the latest mesa version.

creideiki commented 4 years ago

I can confirm that firejail --ignore=seccomp '--seccomp=!kcmp,!chroot' firefox fixes my original problem as well, on Mesa 20.0.0_rc2.
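
If you launch the sandbox from a script, the workaround above can be sketched as follows. This is only an illustration built from the two flags used in this thread (`--ignore=seccomp` and `--seccomp=`); `workaround_argv` is a hypothetical helper, and the default exception list assumes a profile that, like firefox.profile here, already carried `!chroot`, which has to be re-added once the profile's own seccomp line is ignored.

```python
import shlex


def workaround_argv(program, allowed=("kcmp", "chroot")):
    """Build the firejail command line that re-adds seccomp with exceptions."""
    exceptions = ",".join("!" + name for name in allowed)
    return ["firejail", "--ignore=seccomp", f"--seccomp={exceptions}", program]


argv = workaround_argv("firefox")
# shlex.join quotes the '!' argument the same way as the command typed above.
print(shlex.join(argv))
# To actually start it: subprocess.run(argv)
```

Passing the list to `subprocess.run` avoids the shell entirely, so the `!` does not need quoting there.
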

creideiki commented 4 years ago

Looking at the Mesa source code, only the AMDGPU code calls kcmp() as of version 20.0.0. I'm not sure under what circumstances, though - I've tried some OpenGL games and applications, and the only one (besides Firefox) I've seen call kcmp() is VLC.

Would the best way forward be inserting seccomp !kcmp in any profiles where it is an actual problem, or removing kcmp from the default list of blocked syscalls?

rusty-snake commented 4 years ago

Possibly all profiles without no3d are affected?

find no no3d

```python
# Copyright © 2020 rusty-snake
#
# Permission to use, copy, modify, and distribute this software for any
# purpose with or without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

from os import listdir, readlink
from os.path import islink

for prg in listdir("/usr/local/bin"):
    link = "/usr/local/bin/" + prg
    # Only look at symlinks that point at firejail; readlink() raises on
    # regular files, so check islink() first.
    if islink(link) and readlink(link) == "/usr/bin/firejail":
        with open(f"/etc/firejail/{prg}.profile") as prf:
            profile = list(prf)
        if "no3d\n" in profile:
            continue
        elif "# Redirect\n" in profile:
            # Redirect profiles are expected to end with "include <file>".
            if profile[-1][:7] != "include":
                print(f"WARN: could not find included profile for {prg}.profile")
                continue
            with open("/etc/firejail/" + profile[-1][8:-1]) as fd:
                no3d = False
                for line in fd:
                    if line == "no3d\n":
                        no3d = True
                if not no3d:
                    print(f"no no3d in {prg}")
        else:
            print(f"no no3d in {prg}")
```
rusty-snake commented 4 years ago

FYI: #3267

SkewedZeppelin commented 4 years ago

I can reproduce this with many profiles under Fedora 32, which ships Mesa 20.0.1.

SkewedZeppelin commented 4 years ago

Here is a hacky patch to use in the meantime: https://gist.github.com/SkewedZeppelin/300447ea70be8aef106b8d8602881134. A proper solution will need to be put in place. @smitsohu @topimiettinen

topimiettinen commented 4 years ago

Instead of allowing kcmp(), would it work to make it return ENOSYS (or EPERM) instead? Manual page mentions that kcmp() is not always available (needs CONFIG_CHECKPOINT_RESTORE), so the drivers should handle that case.

Though if kcmp() is considered safe (comparison of resources of two processes owned by the same user does not seem very dangerous), I wouldn't mind if it was removed.
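
To make the "drivers should handle that case" point concrete, here is a rough Python analogue of the Mesa helper, written as a sketch rather than as what Mesa does. Assumptions: x86_64 Linux only (syscall number 312 = 0x138, the number visible in the seccomp dump earlier in this thread), `KCMP_FILE = 0` as in `<linux/kcmp.h>`, and `same_file_description` is a hypothetical name. It returns `None` when the kernel refuses or lacks the call, so the caller can fall back.

```python
import ctypes
import errno
import os
import platform
import sys

SYS_KCMP_X86_64 = 312  # 0x138; assumption: only valid on x86_64 Linux
KCMP_FILE = 0          # first member of enum kcmp_type in <linux/kcmp.h>


def same_file_description(fd1, fd2):
    """True/False if the kernel answered, None if kcmp() is unavailable or denied."""
    if not sys.platform.startswith("linux") or platform.machine() != "x86_64":
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    pid = os.getpid()
    args = [SYS_KCMP_X86_64, pid, pid, KCMP_FILE, fd1, fd2]
    ret = libc.syscall(*(ctypes.c_long(a) for a in args))
    if ret == 0:
        return True    # both descriptors share one open file description
    if ret > 0:
        return False   # kcmp() reports an ordering or "not equal"
    err = ctypes.get_errno()
    if err in (errno.ENOSYS, errno.EPERM):
        return None    # not compiled in, or refused by a seccomp filter
    raise OSError(err, os.strerror(err))


r, w = os.pipe()
d = os.dup(r)
print(same_file_description(r, d))  # True, or None where kcmp() is unavailable
print(same_file_description(r, w))  # False, or None where kcmp() is unavailable
```

Note that this fallback only helps when the filter returns an error code. With a KILL action, as in the dump above, the process is terminated before any errno can be inspected.
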

Vincent43 commented 4 years ago

> Instead of allowing kcmp(), would it work to make it return ENOSYS (or EPERM) instead?

It would be great to change seccomp filter to use EPERM/ENOSYS globally. I think KILL was proven unsustainable at this point and security difference is quite negligible. Moreover if we're going to allow syscalls that cause issues then KILL is less secure in the end.

creideiki commented 4 years ago

> Instead of allowing kcmp(), would it work to make it return ENOSYS (or EPERM) instead?

The problem with that is that the call was introduced to fix a memory leak: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3202

> Manual page mentions that kcmp() is not always available (needs CONFIG_CHECKPOINT_RESTORE), so the drivers should handle that case.

Yes, this makes Mesa's use of it very weird, but I haven't had the time to raise that issue with them.

topimiettinen commented 4 years ago

> It would be great to change seccomp filter to use EPERM/ENOSYS globally. I think KILL was proven unsustainable at this point and security difference is quite negligible. Moreover if we're going to allow syscalls that cause issues then KILL is less secure in the end.

Agreed, also systemd has made the change. I'll make a PR.

topimiettinen commented 4 years ago

See #3301.

rusty-snake commented 4 years ago

We can remove the kcmp exception from profiles again, since we now return EPERM instead of killing programs.

glitsj16 commented 4 years ago

@rusty-snake I count only 3 profiles that currently use !kcmp: dnscrypt-proxy, steam and unbound. I'll test dnscrypt-proxy and unbound but I have never used steam. Can you test that please?

rusty-snake commented 4 years ago

I'm not using it either.

SkewedZeppelin commented 4 years ago

Even with EPERM this is not fixed. Vanilla firejail at 821dd6c9 on Fedora 32 using AMDGPU graphics breaks many programs. Firefox, Evolution, etc.

I am using https://gist.github.com/SkewedZeppelin/300447ea70be8aef106b8d8602881134 on my personal builds

kmk3 commented 3 years ago

> Even with EPERM this is not fixed. Vanilla firejail at 821dd6c on Fedora 32 using AMDGPU graphics breaks many programs. Firefox, Evolution, etc.
>
> I am using https://gist.github.com/SkewedZeppelin/300447ea70be8aef106b8d8602881134 on my personal builds

I can confirm. Firejail 0.9.64 on Artix using AMDGPU breaks Steam (see #3267) unless I override the default syscall whitelist with the syscall blacklist suggested by @rusty-snake:

--seccomp.drop=@clock,@cpu-emulation,@debug,@module,@obsolete,@raw-io,@reboot,@swap,open_by_handle_at,name_to_handle_at,ioprio_set,ni_syscall,syslog,fanotify_init,add_key,request_key,mbind,migrate_pages,move_pages,keyctl,io_setup,io_destroy,io_getevents,io_submit,io_cancel,remap_file_pages,set_mempolicy,vmsplice,umount,userfaultfd,acct,bpf,chroot,mount,nfsservctl,pivot_root,setdomainname,sethostname,umount2,vhangup

> Good catch! I don't have one of my failing systems with me at the moment, but glancing at the Mesa Git repo between 19.3.3 and 19.3.4 shows me https://gitlab.freedesktop.org/mesa/mesa/commit/ed271a9c2f40f8ec881bf3e4568d35dbfcd9cf70 which introduced a call to `kcmp`

For reference, this is what it looks like on 19.3.4 (it hasn't changed too much as of 20.2.1):

$ git checkout mesa-19.3.4
HEAD is now at 7a3190eb918 VERSION: bump version for 19.3.4
$ grep -Fnr kcmp src/
src/util/os_file.c:37:#include <linux/kcmp.h>
src/util/os_file.c:140:   return syscall(SYS_kcmp, pid, pid, KCMP_FILE, fd1, fd2) == 0;
$ cat src/util/os_file.c
// ...
#if defined(__linux__)
// ...
bool
os_same_file_description(int fd1, int fd2)
{
   pid_t pid = getpid();

   return syscall(SYS_kcmp, pid, pid, KCMP_FILE, fd1, fd2) == 0;
}

#else
// ...

> Looking at the Mesa source code, only the AMDGPU code calls kcmp() as of version 20.0.0.

$ git checkout mesa-20.0.0
HEAD is now at 9abde3412d3 VERSION: bump for 20.0.0 release
$ grep -Fnr os_same_file_description src/
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:383:         if (os_same_file_description(sws_iter->fd, ws->fd)) {
src/util/os_file.h:39:os_same_file_description(int fd1, int fd2);
src/util/os_file.c:136:os_same_file_description(int fd1, int fd2)
src/util/os_file.c:155:os_same_file_description(int fd1, int fd2)

Indeed, but since 20.1.1 it seems that other drivers might also be affected:

$ git checkout mesa-20.1.1
HEAD is now at 127c2be9c53 VERSION: bump to 20.1.1 release
$ grep -Fnr os_same_file_description src/
src/gallium/drivers/iris/iris_bufmgr.c:1539:   int ret = os_same_file_description(drm_fd, bufmgr->fd);
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:375:         r = os_same_file_description(sws_iter->fd, ws->fd);
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:388:               os_log_message("amdgpu: os_same_file_description couldn't "
src/util/os_file.h:45:os_same_file_description(int fd1, int fd2);
src/util/os_file.c:140:os_same_file_description(int fd1, int fd2)
src/util/os_file.c:163:os_same_file_description(int fd1, int fd2)
src/mesa/drivers/dri/i965/brw_bufmgr.c:1642:   int ret = os_same_file_description(drm_fd, bufmgr->fd);

And the list appears to be increasing...

$ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
$ git log --oneline --no-decorate -n 1
483657de323 aco: use mubuf helper in select_gs_copy_shader
$ grep -Fnr os_same_file_description src/
src/gallium/drivers/iris/iris_bufmgr.c:1552:   int ret = os_same_file_description(drm_fd, bufmgr->fd);
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:379:         r = os_same_file_description(sws_iter->fd, ws->fd);
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:392:               os_log_message("amdgpu: os_same_file_description couldn't "
src/gallium/winsys/etnaviv/drm/etnaviv_drm_winsys.c:66:   ret = os_same_file_description(fd1, fd2);
src/gallium/winsys/etnaviv/drm/etnaviv_drm_winsys.c:73:         fprintf(stderr, "os_same_file_description couldn't determine if "
src/util/os_file.h:51:os_same_file_description(int fd1, int fd2);
src/util/os_file.c:189:os_same_file_description(int fd1, int fd2)
src/util/os_file.c:212:os_same_file_description(int fd1, int fd2)
src/mesa/drivers/dri/i965/brw_bufmgr.c:1638:   int ret = os_same_file_description(drm_fd, bufmgr->fd);

Has anyone tested seccomp on i965/iris with Mesa >= 20.1.1?
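
To track this across Mesa versions without reading the grep output by eye, something like the following sketch could summarize it. It assumes the `grep -Fnr` output format shown above and the three directory prefixes that occur in it; `drivers_calling` and `PREFIXES` are hypothetical names, and the sample is the 20.1.1 output from this comment.

```python
# Directory prefixes under which the next path component names a driver.
PREFIXES = (
    "src/gallium/drivers/",
    "src/gallium/winsys/",
    "src/mesa/drivers/dri/",
)


def drivers_calling(grep_output):
    """Return the sorted driver names that appear in `grep -Fnr` output."""
    drivers = set()
    for line in grep_output.splitlines():
        path = line.split(":", 1)[0]
        for prefix in PREFIXES:
            if path.startswith(prefix):
                drivers.add(path[len(prefix):].split("/")[0])
    return sorted(drivers)


MESA_20_1_1 = """\
src/gallium/drivers/iris/iris_bufmgr.c:1539:   int ret = os_same_file_description(drm_fd, bufmgr->fd);
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:375:         r = os_same_file_description(sws_iter->fd, ws->fd);
src/util/os_file.h:45:os_same_file_description(int fd1, int fd2);
src/util/os_file.c:140:os_same_file_description(int fd1, int fd2)
src/mesa/drivers/dri/i965/brw_bufmgr.c:1642:   int ret = os_same_file_description(drm_fd, bufmgr->fd);
"""

print(drivers_calling(MESA_20_1_1))  # ['amdgpu', 'i965', 'iris']
```

The definitions under `src/util/` are skipped because they match none of the prefixes.
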

rusty-snake commented 3 years ago

What about adding !kcmp to seccomp if arg_no3d is not set and an AMD GPU is detected? no3d comes before seccomp in profiles.

topimiettinen commented 3 years ago

Wouldn't it be simpler to skip detecting AMD GPU and allow kcmp if there's no no3d, or just always allow kcmp? It can be added manually to profiles for extra hardening.

rusty-snake commented 3 years ago

> just always allow kcmp?

Since I know that kcmp is used in Chromium's Ozone backend (Wayland), what would be the drawback of this?

smitsohu commented 3 years ago

Maybe there could be an option in firejail.config to automatically append syscalls to the default seccomp filter?

This way people could easily return to the current behaviour.

rusty-snake commented 3 years ago

Over a year and we don't even have a hotfix ...