remote-android / redroid-doc

redroid (Remote-Android) is a multi-arch, GPU enabled, Android in Cloud solution. Track issues / docs here

Running redroid without privileged #591

Open soundofspace opened 9 months ago

soundofspace commented 9 months ago

The end goal is to run redroid in Kubernetes without privileged: true. As a first step, I tried running the redroid Docker container without --privileged, but without any success. Running it with --privileged works fine.

Things I tried

docker run  \
    --cap-add=ALL \
    -it -v /sys:/sys -v /proc:/proc \
    --security-opt=seccomp:unconfined \
    --security-opt=apparmor:unconfined \
    --device=/dev/fuse:/dev/fuse \
    --device=/dev/ashmem:/dev/ashmem \
    -p 5555:5555 redroid/redroid:14.0.0-latest

Results in:

  --------- beginning of system
01-24 15:28:29.304    49    49 I installd: installd firing up
01-24 15:28:29.304    49    49 E cutils  : Failed to read /data/misc/installd/layout_version: No such file or directory
01-24 15:28:29.304    49    49 D installd: Assuming that device has multi-user storage layout; upgrade no longer supported
01-24 15:28:29.304    49    49 E cutils  : Failed to mkdir(/data/misc/user/0): No such file or directory
01-24 15:28:29.304    49    49 E installd: Failed to setup misc for user 0
01-24 15:28:29.305    49    49 E installd: Could not create directories; exiting.
01-24 15:28:29.371    55    55 E cutils-trace: Error opening trace file: Permission denied (13)
01-24 15:28:29.369    59    59 W hw-ProcessState: Opening '/dev/hwbinder' failed: Operation not permitted
01-24 15:28:29.388    59    59 F hw-ProcessState: Binder driver could not be opened. Terminating.
01-24 15:28:29.388    59    59 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 59 (android.hidl.al), pid 59 (android.hidl.al)
01-24 15:28:29.400    61    61 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
01-24 15:28:29.401    61    61 E libc    : failed to connect to tombstoned: No such file or directory
01-24 15:28:29.401    61    61 I crash_dump64: performing dump of process 48 (target tid = 48)
01-24 15:28:29.416    51    51 V MediaUtils: physMem: 3674472448
01-24 15:28:29.416    51    51 V MediaUtils: requested limit: 134217728
01-24 15:28:29.416    51    51 I libc    : malloc_limit: Allocation limit enabled, max size 134217728 bytes
01-24 15:28:29.425    62    62 W hw-ProcessState: Opening '/dev/hwbinder' failed: Operation not permitted
01-24 15:28:29.425    62    62 F hw-ProcessState: Binder driver could not be opened. Terminating.
01-24 15:28:29.425    62    62 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 62 (android.hardwar), pid 62 (android.hardwar)
01-24 15:28:29.426    56    56 I wificond: wificond is starting up...
01-24 15:28:29.435    56    56 E ProcessState: Binder driver /dev/binder is unavailable. Using /dev/binder instead.
01-24 15:28:29.435    56    56 F ProcessState: Binder driver '/dev/binder' could not be opened. Terminating: Opening '/dev/binder' failed: Operation not permitted
01-24 15:28:29.436    56    56 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 56 (wificond), pid 56 (wificond)
01-24 15:28:29.446    45    45 F appproc : Error creating cache dir /data/dalvik-cache/x86_64 : No such file or directory
01-24 15:28:29.446    45    45 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 45 (app_process64), pid 45 (app_process64)
01-24 15:28:29.456    67    67 E ProcessState: Binder driver /dev/vndbinder is unavailable. Using /dev/binder instead.
01-24 15:28:29.457    67    67 F ProcessState: Binder driver '/dev/binder' could not be opened. Terminating: Opening '/dev/binder' failed: Operation not permitted
01-24 15:28:29.457    67    67 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 67 (android.hardwar), pid 67 (android.hardwar)
01-24 15:28:29.459    69    69 E cutils-trace: Error opening trace file: Permission denied (13)
01-24 15:28:29.464    68    68 I android.hardware.health-service.example: Starting health HAL.
01-24 15:28:29.464    68    68 I android.hardware.health-service.example: default instance initializing with healthd_config...
01-24 15:28:29.464    68    68 E ProcessState: Binder driver /dev/binder is unavailable. Using /dev/binder instead.
01-24 15:28:29.464    68    68 F ProcessState: Binder driver '/dev/binder' could not be opened. Terminating: Opening '/dev/binder' failed: Operation not permitted
01-24 15:28:29.464    68    68 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 68 (android.hardwar), pid 68 (android.hardwar)
01-24 15:28:29.477    71    71 I android.hardware.thermal@2.0-service-mock: Thermal HAL Service Mock 2.0 starting...
01-24 15:28:29.477    71    71 W hw-ProcessState: Opening '/dev/hwbinder' failed: Operation not permitted
01-24 15:28:29.477    71    71 F hw-ProcessState: Binder driver could not be opened. Terminating.
01-24 15:28:29.478    71    71 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 71 (android.hardwar), pid 71 (android.hardwar)
01-24 15:28:29.478    85    85 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
01-24 15:28:29.478    85    85 E libc    : failed to connect to tombstoned: No such file or directory
01-24 15:28:29.478    85    85 I crash_dump64: performing dump of process 45 (target tid = 45)
01-24 15:28:29.484    74    74 E cutils-trace: Error opening trace file: Permission denied (13)
01-24 15:28:29.487    57    57 I android.hardware.media.omx@1.0-service: mediacodecservice starting
01-24 15:28:29.490    60    60 E ProcessState: Binder driver /dev/binder is unavailable. Using /dev/binder instead.
01-24 15:28:29.490    60    60 F ProcessState: Binder driver '/dev/binder' could not be opened. Terminating: Opening '/dev/binder' failed: Operation not permitted
01-24 15:28:29.490    60    60 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 60 (android.hardwar), pid 60 (android.hardwar)

With /sys mounted, the container runs for about a second and then crashes. Without the /sys mount, the container fails to start and produces no output at all.

I also tried running mount -t binder binder /dev/binderfs/ on the host and then passing the devices to docker, without any success:

--device=/dev/binderfs/binder:/dev/binder \
--device=/dev/binderfs/hwbinder:/dev/hwbinder \
--device=/dev/binderfs/vndbinder:/dev/vndbinder \

It must be possible, based on this issue. The only differences I see are how binder is mounted and that an additional video card is passed in, which should not be needed?

I also tested that binderfs should technically work with the docker setup:

docker run -it --cap-add=ALL --security-opt=apparmor:unconfined --security-opt seccomp:unconfined  ubuntu
root@000ffb8df313:/# mkdir test && mount -t binder binder test && ls test
binder  binder-control  features  hwbinder  vndbinder
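
Before testing a mount like the one above, it can help to confirm that the kernel has binderfs registered at all. The following is a minimal sketch, assuming the kernel lists the filesystem type as "binder" in /proc/filesystems; the function name has_binderfs and the modprobe hint are illustrative, not something from this thread.

```shell
# Return success if the given filesystems list (default: /proc/filesystems)
# contains a "binder" entry, i.e. the running kernel has binderfs registered.
# has_binderfs is a hypothetical helper name used only for this sketch.
has_binderfs() {
    fs_list="${1:-/proc/filesystems}"
    # Each line looks like "<flags><TAB><fstype>"; compare the last field exactly.
    awk '$NF == "binder" { found = 1 } END { exit !found }' "$fs_list"
}

if has_binderfs; then
    echo "binderfs is available"
else
    # Assumption: on some distributions binder is built as the binder_linux module.
    echo "binderfs not listed (the binder_linux module may need loading)"
fi
```

This only shows that the filesystem type exists; it does not show whether the container is allowed to open the resulting device nodes.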

I sadly lack the knowledge of how binder works to get this running, or I'm missing some other crucial device or folder that should be mounted. Any help would be extremely appreciated.

System:

Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy

zhouziyang commented 9 months ago

docker run -d --rm \
--cap-add=ALL \
--security-opt=seccomp:unconfined \
--security-opt=apparmor:unconfined \
--device-cgroup-rule "c *:* rwm"  \
-p 5555:5555 \
redroid/redroid:14.0.0-latest \
androidboot.redroid_gpu_mode=guest

## adjust this to your needs

soundofspace commented 9 months ago

Thanks for the response. Adding -v /sys:/sys to that seems to do the trick. Any idea why? Or is there a way to make it report why it failed to launch?

docker run \
--cap-add=ALL \
--security-opt=seccomp:unconfined \
--security-opt=apparmor:unconfined \
--device-cgroup-rule "c *:* rwm"  \
-v /sys:/sys \
-p 5555:5555 \
redroid/redroid:14.0.0-latest \
androidboot.redroid_gpu_mode=guest

zhouziyang commented 9 months ago

-v /sys:/sys should not be required. My test environment:

soundofspace commented 9 months ago

Hmm maybe this is something Ubuntu 22.04 specific. This is also the image used by gcloud, so might be something more restrictive there.

I was able to narrow the volume mount to /sys/fs/cgroup

docker --version
Docker version 20.10.12, build 20.10.12-0ubuntu3

soundofspace commented 9 months ago

Tested the same on Ubuntu 20.04.6 LTS without the volume mount and it does work there.

soundofspace commented 9 months ago

On a fresh Ubuntu 22.04 VM, even the full /sys mount is needed; mounting only /sys/fs/cgroup doesn't work there.

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy

soundofspace commented 9 months ago

Some more info:

zhouziyang commented 9 months ago

Try switching back to cgroup v1.

libprocessgroup: Failed to mount cgroup v2: Device or resource busy
...
libprocessgroup: Failed to make and chown /sys/fs/cgroup/uid_0: Read-only file system

soundofspace commented 9 months ago

It's indeed caused by cgroup v2, thanks for the help!
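
For anyone wanting to check which cgroup hierarchy a host uses, here is a minimal sketch. It assumes GNU stat and the usual convention that /sys/fs/cgroup is a cgroup2fs mount on a pure v2 host and a tmpfs on a v1 (or hybrid) host; the function name cgroup_version_for is made up for this example.

```shell
# Map a filesystem type (as printed by `stat -fc %T`) to a cgroup version.
# cgroup_version_for is a hypothetical helper name used only for this sketch.
cgroup_version_for() {
    case "$1" in
        cgroup2fs) echo "v2" ;;       # unified hierarchy
        tmpfs)     echo "v1" ;;       # legacy or hybrid layout (assumption)
        *)         echo "unknown" ;;
    esac
}

# Query the host; errors are silenced so a missing path just yields "unknown".
fs_type="$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
echo "cgroup hierarchy: $(cgroup_version_for "$fs_type")"
```

Run it on the host rather than inside the container, since the container's view of /sys/fs/cgroup may differ.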

How are you getting those logs? I'm experimenting with other ways to launch while still using v2 (which is sadly a requirement). There is a lot of information online about launching systemd in docker with v2, which seems very promising, but I haven't been able to make it work for this use case. Logs showing where it fails would be a huge help.

zhouziyang commented 9 months ago

Try podman: podman ... --security-opt unmask=/sys/fs/cgroup .... I'm not sure whether there is a similar option for docker.

soundofspace commented 9 months ago

Will do some testing with podman, but how did you collect the logs posted in this comment? I tried using strace, but its output is not really usable.

zhouziyang commented 9 months ago

Will do some testing with podman, but how did you collect the logs posted in this comment. I tried using strace, but output of that is not really usable.

dmesg

Take a look at https://github.com/remote-android/redroid-doc/blob/master/debug.sh , which provides many debug instructions.
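
As a rough illustration of narrowing dmesg output to lines like the ones quoted earlier in this thread, here is a small sketch. The filter pattern and the function name filter_redroid_log are assumptions chosen for this example; they are not taken from debug.sh.

```shell
# Keep only log lines that mention the subsystems discussed in this thread.
# Reads from stdin, so it works on live `dmesg` output or on a saved log file.
# filter_redroid_log is a hypothetical helper name used only for this sketch.
filter_redroid_log() {
    grep -E 'libprocessgroup|binder|cgroup'
}

# Usage (reading the kernel log usually requires root):
#   dmesg | filter_redroid_log
```

Widen or narrow the pattern as needed; an overly tight filter can hide the line that actually explains the failure.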

soundofspace commented 9 months ago

Thanks for all the help, it's really appreciated!

I was able to make it run with this command (docker doesn't seem to support this currently)

podman run -it --cap-add=ALL --security-opt=seccomp=unconfined --security-opt=apparmor=unconfined -v /tmp/test:/sys/fs/cgroup --security-opt unmask=/sys/fs/cgroup --device-cgroup-rule 'c 238:* rwm' docker.io/redroid/redroid:14.0.0-latest sh

I added a volume mount for cgroup just to check that it wasn't unmasking the host's cgroup folder. I don't think the mount is needed; I only did it to confirm that it works.

In my case binderfs had a major ID of 238, so I was able to narrow the device-cgroup-rule down as well. I found the ID with:

mkdir binder
mount -t binder binder binder
ls -al binder/
crw------- 1 root root 238,  9 Jan 31 13:08 binder
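
The major number read off the ls output above can also be turned into a rule programmatically. This is a minimal sketch, assuming GNU stat (whose %t format prints the major number in hexadecimal); the function name binder_cgroup_rule is made up for this example.

```shell
# Convert the hexadecimal major number printed by `stat -c %t` into a
# device-cgroup rule covering every minor of that character device.
# binder_cgroup_rule is a hypothetical helper name used only for this sketch.
binder_cgroup_rule() {
    hex_major="$1"
    # printf interprets the 0x-prefixed argument as hex and prints it in decimal.
    printf 'c %d:* rwm\n' "0x$hex_major"
}

# Usage, assuming binderfs is mounted at ./binder as above:
#   binder_cgroup_rule "$(stat -c %t binder/binder)"
```

The resulting string is what gets passed to --device-cgroup-rule. The major number is assigned dynamically, so it can differ between hosts and should be looked up rather than hard-coded.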

Haven't experimented with narrowing capabilities yet.

How to translate all of this to Kubernetes is still a WIP, and at first sight it doesn't seem feasible without many hacks.

For other people reading this, without the proper device-cgroup-rule to pass in binderfs, dmesg will output no logs and docker/podman will crash silently.

zhouziyang commented 9 months ago

redroid will actually mount its own cgroupfs.