Segfault when running from bazel

fredr commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

A gist for setting up and reproducing the error https://gist.github.com/fredr/dd0e5c3639fa109df82471292d6bc8c3

Download BUILD and WORKSPACE to a folder

In that folder, run:

$ bazel build //... --sandbox_writable_path=${XDG_RUNTIME_DIR} --sandbox_writable_path=${HOME}/.local/share/containers/storage --sandbox_debug

Describe the results you received:

time="2022-01-14T09:25:40+01:00" level=warning msg="\"/\" is not a shared mount, this could cause issues or missing mounts with rootless containers"
cannot setresgid: Invalid argument
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x557e3105594d]

goroutine 1 [running]:
github.com/containers/common/libimage.(*Runtime).Load(0x0, {0x557e31ddcaf8, 0xc000042028}, {0x7fffe5543de7, 0x50}, 0xc00058bb38)
    github.com/containers/common@v0.44.4/libimage/load.go:27 +0xed
github.com/containers/podman/v3/pkg/domain/infra/abi.(*ImageEngine).Load(0x7fffe5543de7, {0x557e31ddcaf8, 0xc000042028}, {{0x7fffe5543de7, 0x41}, 0x0, {0x0, 0x0}})
    github.com/containers/podman/v3/pkg/domain/infra/abi/images.go:362 +0xff
github.com/containers/podman/v3/cmd/podman/images.load(0x557e329443a0, {0xc0002d9000, 0x0, 0x2})
    github.com/containers/podman/v3/cmd/podman/images/load.go:92 +0x358
github.com/spf13/cobra.(*Command).execute(0x557e329443a0, {0xc00003c0a0, 0x2, 0x2})
    github.com/spf13/cobra@v1.2.1/command.go:856 +0x60e
github.com/spf13/cobra.(*Command).ExecuteC(0x557e32955e20)
    github.com/spf13/cobra@v1.2.1/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
    github.com/spf13/cobra@v1.2.1/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
    github.com/spf13/cobra@v1.2.1/command.go:895
main.Execute()
    github.com/containers/podman/v3/cmd/podman/root.go:91 +0xbe
main.main()
    github.com/containers/podman/v3/cmd/podman/main.go:39 +0x74

Describe the results you expected: I'm guessing something in the setup is wrong, and this should trigger an error message telling me what.

Additional information you deem important (e.g. issue happens only occasionally): Bazel executes within a sandbox, and it is when executing podman from inside that sandbox that this seems to happen. If I run the generated script that fails from my terminal, it works just fine.

Output of podman version:

Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.17.4
Git Commit:   f6526ada1025c2e3f88745ba83b8b461ca659933
Built:        Thu Dec  9 19:30:40 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.0.31-1
    path: /usr/bin/conmon
    version: 'conmon version 2.0.31, commit: 7e7eb74e52abf65a6d46807eeaea75425cc8a36c'
  cpus: 16
  distribution:
    distribution: manjaro
    version: unknown
  eventLogger: journald
  hostname: runner
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.12-1-MANJARO
  linkmode: dynamic
  logDriver: journald
  memFree: 7517573120
  memTotal: 33400438784
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.4-1
    path: /usr/bin/crun
    version: |-
      crun version 1.4
      commit: 3daded072ef008ef0840e8eccb0b52a7efbd165d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.1.12-1
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 39h 46m 19.52s (Approximately 1.62 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: docker.io
  search:
  - docker.io
store:
  configFile: /home/fredr/.config/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 0
    stopped: 7
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/fredr/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 13
  runRoot: /run/user/1000/containers
  volumePath: /home/fredr/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1639074640
  BuiltTime: Thu Dec  9 19:30:40 2021
  GitCommit: f6526ada1025c2e3f88745ba83b8b461ca659933
  GoVersion: go1.17.4
  OsArch: linux/amd64
  Version: 3.4.4

Package info (e.g. output of rpm -q podman or apt list podman):

$ pacman -Qi podman
Name            : podman
Version         : 3.4.4-1
Description     : Tool and library for running OCI-based containers in pods
Architecture    : x86_64
URL             : https://github.com/containers/podman
Licenses        : Apache
Groups          : None
Provides        : None
Depends On      : cni-plugins  conmon  containers-common  crun  fuse-overlayfs  iptables  libdevmapper.so=1.02-64  libgpgme.so=11-64  libseccomp.so=2-64  slirp4netns
Optional Deps   : apparmor: for AppArmor support [installed]
                  btrfs-progs: support btrfs backend devices [installed]
                  catatonit: --init flag support
                  podman-docker: for Docker-compatible CLI [installed]
Required By     : podman-docker
Optional For    : None
Conflicts With  : None
Replaces        : None
Installed Size  : 72,79 MiB
Packager        : David Runge <dvzrv@archlinux.org>
Build Date      : tor  9 dec 2021 19:30:40
Install Date    : tis 11 jan 2022 15:30:32
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : Signature

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

/usr/bin/docker is symlinked to /usr/bin/podman

mheon commented 2 years ago

@vrothberg PTAL, I think this one's in libimage

rhatdan commented 2 years ago

Podman prefers the / be mounted -rshared. This could be triggering the issue.
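For anyone wanting to check the propagation of / from inside the sandbox without root, a rough sketch is below. It assumes the documented /proc/self/mountinfo layout, where optional fields such as shared:N sit between the mount options and the "-" separator. The helper name rootIsShared is made up for illustration and is not part of Podman.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// rootIsShared scans mountinfo-formatted text and reports whether the mount
// at "/" carries a "shared:N" optional field. found is false when no entry
// for "/" exists. Assumption: if "/" is listed more than once, the last
// entry is the one currently visible.
func rootIsShared(mountinfo string) (shared, found bool) {
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Field 5 (index 4) is the mount point; optional fields start at index 6.
		if len(fields) < 7 || fields[4] != "/" {
			continue
		}
		found = true
		shared = false
		for _, f := range fields[6:] {
			if f == "-" { // separator: end of optional fields
				break
			}
			if strings.HasPrefix(f, "shared:") {
				shared = true
			}
		}
	}
	return shared, found
}

func main() {
	data, err := os.ReadFile("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read mountinfo:", err)
		return
	}
	shared, found := rootIsShared(string(data))
	fmt.Printf("found=%v shared=%v\n", found, shared)
}
```

Running the binary once on the host and once under linux-sandbox should show whether the sandbox changes the propagation of /.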

fredr commented 2 years ago

Podman prefers the / be mounted -rshared. This could be triggering the issue.

Is this something I could test? I don't have sudo rights inside the sandbox, so I can't run sudo mount --make-rshared / from within it.

I did figure out a bit more about how to debug the bazel sandbox. Adding --sandbox_debug keeps all the sandbox files after execution, and adding --verbose_failures prints the failing command, including how the sandbox is set up. So to reproduce this panic, I can run

./linux-sandbox -w /run/user/1000 -w /home/fredr/.local/share/containers/storage -w /dev/shm -D -- podman run hello-world

-w makes the folder or file writable from within the sandbox, everything else should still be readable.

If I change /run/user/1000 to be mounted as an empty tmpfs dir, I no longer get the panic, but instead I get this error (not sure if that is of any use):

WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers
cannot setresgid: Invalid argument
Error: potentially insufficient UIDs or GIDs available in user namespace (requested 0:0 for /home/fredr/.local/share/containers/storage/overlay/l): Check /etc/subuid and /etc/subgid: chown /home/fredr/.local/share/containers/storage/overlay/l: invalid argument
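The chown failure for 0:0 suggests that the requested ID may not be mapped in the user namespace the process ends up in. As a sketch only, the following parses text in the /proc/self/uid_map format (inside ID, outside ID, length per line) and reports whether a given ID is covered. The helper idMapped is hypothetical, and what the bazel sandbox actually maps is not known from this thread.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// idMapped reports whether id falls inside any "inside outside length"
// range of a uid_map/gid_map formatted string. Malformed lines are skipped.
func idMapped(idMap string, id uint64) bool {
	for _, line := range strings.Split(idMap, "\n") {
		f := strings.Fields(line)
		if len(f) != 3 {
			continue
		}
		inside, err1 := strconv.ParseUint(f[0], 10, 64)
		length, err2 := strconv.ParseUint(f[2], 10, 64)
		if err1 != nil || err2 != nil {
			continue
		}
		if id >= inside && id-inside < length {
			return true
		}
	}
	return false
}

func main() {
	for _, path := range []string{"/proc/self/uid_map", "/proc/self/gid_map"} {
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, "cannot read", path+":", err)
			continue
		}
		fmt.Printf("%s maps ID 0: %v\n", path, idMapped(string(data), 0))
	}
}
```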

If I run podman ps and such commands, I do not get the panic, but I still get

cannot setresgid: Invalid argument

vrothberg commented 2 years ago

@vrothberg PTAL, I think this one's in libimage

It's a Podman-side issue. It seems we're calling the libimage runtime without having initialized it; the nil deref is on the runtime object. I guess @rhatdan is on point.
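To illustrate the failure mode in general Go terms (this is not the libimage code): calling a method on a nil pointer is legal, and the panic only happens once the method dereferences the receiver. That matches the 0x0 receiver shown for (*Runtime).Load in the trace. The type imageRuntime below is a made-up stand-in.

```go
package main

import "fmt"

// imageRuntime is a hypothetical stand-in for a runtime object.
type imageRuntime struct{ graphRoot string }

// isNil only inspects the pointer, so it is safe on a nil receiver.
func (r *imageRuntime) isNil() bool { return r == nil }

// root reads a field, which dereferences the receiver and panics when
// the receiver is nil.
func (r *imageRuntime) root() string { return r.graphRoot }

// panics reports whether calling f triggers a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	var rt *imageRuntime // never initialized
	fmt.Println("isNil:", rt.isNil())
	fmt.Println("root panics:", panics(func() { _ = rt.root() }))
}
```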

vrothberg commented 2 years ago

@fredr can you paste the contents of /etc/subuid and /etc/subgid?

fredr commented 2 years ago

@fredr can you paste the contents of /etc/subuid and /etc/subgid?

$ cat /etc/subgid /etc/subuid
fredr:100000:65536
fredr:100000:65536

I should also mention that podman works just fine outside of the sandbox

rhatdan commented 2 years ago

Can linux-sandbox be changed to expose / as mount-rshared?

BTW What is linux-sandbox?

vrothberg commented 2 years ago

In the meantime, I am going to have a look at how we can prevent the segfault. Podman should error out or take other countermeasures.

vrothberg commented 2 years ago

@fredr do you have a simple reproducer?

vrothberg commented 2 years ago

Could you also rerun podman with --log-level=debug?

fredr commented 2 years ago

Can linux-sandbox be changed to expose / as mount-rshared?

Not in any way that I have been able to figure out unfortunately

BTW What is linux-sandbox?

It is part of the build tool bazel: https://docs.bazel.build/versions/main/sandboxing.html. I'm quite new to it myself, but this is it: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc

@fredr do you have a simple reproducer?

Since it relies on bazel, it won't be super simple, but there is a reproducer in this gist: https://gist.github.com/fredr/dd0e5c3639fa109df82471292d6bc8c3

If you put BUILD and WORKSPACE in a directory and run:

bazel build //... --sandbox_writable_path=${XDG_RUNTIME_DIR} --sandbox_writable_path=${HOME}/.local/share/containers/storage --sandbox_debug --verbose_failures

Could you also rerun podman with --log-level=debug?

Here I executed podman --log-level=debug run hello-world inside the sandbox

INFO[0000] podman filtering at log level debug
DEBU[0000] Called run.PersistentPreRunE(podman --log-level=debug run hello-world)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Merged system config "/etc/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /home/fredr/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/fredr/.local/share/containers/storage
DEBU[0000] Using run root /run/user/1000/containers
DEBU[0000] Using static dir /home/fredr/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp
DEBU[0000] Using volume path /home/fredr/.local/share/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] Not configuring container store
DEBU[0000] Initializing event backend file
DEBU[0000] configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Found CNI network podman (type=bridge) at /home/fredr/.config/cni/net.d/87-podman.conflist
INFO[0000] Found CNI network kind (type=bridge) at /home/fredr/.config/cni/net.d/kind.conflist
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 49
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers
cannot setresgid: Invalid argument
DEBU[0000] Pulling image hello-world (policy: missing)
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55910c5f5bd4]

goroutine 1 [running]:
github.com/containers/common/libimage.(*Runtime).Pull(0x0, {0x55910d378b68, 0xc000714a80}, {0x7ffd0d280783, 0xb}, 0x1, 0xc00056e280)
    github.com/containers/common@v0.44.4/libimage/pull.go:117 +0x714
github.com/containers/podman/v3/pkg/domain/infra/abi.(*ImageEngine).Pull(0x55910cd0216a, {0x55910d378b68, 0xc000714a80}, {0x7ffd0d280783, 0xc00056e580}, {0x0, {0x0, 0x0}, {0x0, 0x0}, ...})
    github.com/containers/podman/v3/pkg/domain/infra/abi/images.go:231 +0x1d6
github.com/containers/podman/v3/cmd/podman/containers.PullImage({_, _}, {{0x55910dfc3b88, 0x0, 0x0}, {0x55910dfc3b88, 0x0, 0x0}, {0x0, 0x0}, ...})
    github.com/containers/podman/v3/cmd/podman/containers/create.go:300 +0x2fd
github.com/containers/podman/v3/cmd/podman/containers.run(0x55910dedb120, {0xc00030bc80, 0x1, 0x1})
    github.com/containers/podman/v3/cmd/podman/containers/run.go:143 +0x43b
github.com/spf13/cobra.(*Command).execute(0x55910dedb120, {0xc00003c0b0, 0x1, 0x1})
    github.com/spf13/cobra@v1.2.1/command.go:856 +0x60e
github.com/spf13/cobra.(*Command).ExecuteC(0x55910def1e20)
    github.com/spf13/cobra@v1.2.1/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
    github.com/spf13/cobra@v1.2.1/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
    github.com/spf13/cobra@v1.2.1/command.go:895
main.Execute()
    github.com/containers/podman/v3/cmd/podman/root.go:91 +0xbe
main.main()
    github.com/containers/podman/v3/cmd/podman/main.go:39 +0x74

vrothberg commented 2 years ago

Thanks!

DEBU[0000] Not configuring container store

That seems to be it: Podman is not configuring the store if it's lacking the cap_sys_admin capability. In that case, Podman just continues but without the image runtime which explains how we run into the segfault.

@mheon @giuseppe I am totally undecided on what to do in this case though. Does a Podman without a configured store make sense?

mheon commented 2 years ago

The store-deactivation code I'm aware of was intended for performance reasons, to not require commands that would never require a store to initialize one; I think we're talking about different code here, though, because run will always need a store. Is this the code that @rhatdan added so that Podman as root could revert to pseudo-rootless functionality if CAP_SYS_ADMIN was not available? I'm not terribly familiar with it, but the concept seemed to make sense.

vrothberg commented 2 years ago

I'm referring to the following code: https://github.com/containers/podman/blob/main/libpod/runtime.go#L376-L391

I managed to build linux-sandbox locally and reproduce the issue with many commands: we don't create a store but continue.

vrothberg commented 2 years ago

One very strange thing is that capsh --print claims to have cap_sys_admin ...

vrothberg commented 2 years ago

Quick update: what works for me is to use linux-sandbox -w / -- podman .... This will just mount everything writable in the "sandbox".

fredr commented 2 years ago

Quick update: what works for me is to use linux-sandbox -w / -- podman .... This will just mount everything writable in the "sandbox".

Interesting! When I try that I get:

ERRO[0000] set sticky bit on: chmod /run/user/1000/libpod: read-only file system

mheon commented 2 years ago

Yeah, this is the rootless-when-no-sysadmin code I was talking about - https://github.com/containers/podman/commit/722ea2f1f82ff16271b50b508d709e5da275e32a

Apparently it was by @giuseppe instead of @rhatdan?

giuseppe commented 2 years ago

We switched from "detect rootless" to "detect if we have CAP_SYS_ADMIN" because running with EUID=0 is not enough to perform all the operations needed to mount the storage, pull images and run containers. This is useful, for example, when running Podman in a container as root but without capabilities, which is roughly equivalent to running rootless, so we need to create a user namespace to gain the needed capabilities there.

rhatdan commented 2 years ago

Is there a way to avoid the segfault? Should we check CAP_SYS_ADMIN || CAP_SETUID && CAP_SETGID, because without one of those situations, Podman is not going to work.
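A sketch of what such a check could look like, assuming the effective capability set is read from the CapEff line of /proc/self/status and using the bit numbers from linux/capability.h. The function names are hypothetical and this is not how Podman implements it. Since && binds tighter than ||, the condition is read as CAP_SYS_ADMIN || (CAP_SETUID && CAP_SETGID).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers from linux/capability.h.
const (
	capSetgid   = 6
	capSetuid   = 7
	capSysAdmin = 21
)

// effectiveCaps extracts the CapEff bitmask from /proc/<pid>/status text.
func effectiveCaps(status string) (uint64, error) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "CapEff:") {
			hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(hex, 16, 64)
		}
	}
	return 0, fmt.Errorf("no CapEff line found")
}

// hasCap tests a single capability bit in the mask.
func hasCap(mask uint64, bit uint) bool { return mask&(uint64(1)<<bit) != 0 }

// enoughPrivileges encodes the condition proposed in the comment above.
func enoughPrivileges(mask uint64) bool {
	return hasCap(mask, capSysAdmin) ||
		(hasCap(mask, capSetuid) && hasCap(mask, capSetgid))
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read status:", err)
		return
	}
	mask, err := effectiveCaps(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("CapEff=%#x enough=%v\n", mask, enoughPrivileges(mask))
}
```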

vrothberg commented 2 years ago

What would happen if the condition is not met? Currently, we just don't configure the store and continue but I think we should error if there is no store.

giuseppe commented 2 years ago

When we do not have enough privileges, we re-exec and gain these privileges.

We should not get that far in the parent Podman process; it should re-exec from SetupRootless (pkg/domain/infra/abi/system.go).

Does the re-exec fail, and does Podman somehow keep going without enough privileges?

fredr commented 2 years ago

Could it be that the bazel sandbox reports having those capabilities but doesn't actually have them, and that this causes the panic?

giuseppe commented 2 years ago

Could it be that the bazel sandbox reports having those capabilities but doesn't actually have them, and that this causes the panic?

I don't think so, these are coming from the kernel and we read them from /proc.

rhatdan commented 2 years ago

I believe this is a discussion and not an issue with Podman. Transferring.

containers / podman

Segfault when running from bazel #12855