bazelbuild / bazel-buildfarm

Bazel remote caching and execution service
https://bazel.build
Apache License 2.0

Do deployments via Helm work as RBE? #1749

Open monaka opened 1 month ago

monaka commented 1 month ago

Hello, let me ask a question here.

Recent versions of Buildfarm have `alwaysUseCgroups` enabled. ( https://github.com/bazelbuild/bazel-buildfarm/blob/67a06b238e10a33a8e53bece0931235a0fcd3dd7/_site/docs/configuration/configuration.md?plain=1#L304 )

And AFAIK, Pods on Kubernetes aren't allowed to manipulate cgroups by default.

    % kubectl exec -it -n bazel-buildfarm bazel-buildfarm-shard-worker-0 -- cgexec -g cpu:/private ls
    cgroup change of group failed
    command terminated with exit code 82

So I think no deployment via Helm can run as RBE. Is my reasoning correct?
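
For what it's worth, one way to narrow this down might be to check how the cgroup filesystem is mounted inside the Pod. Below is a rough sketch, assuming a POSIX shell and `awk` are available in the worker image. `cgroup_mount_mode` is a helper name made up for illustration; it is not part of Buildfarm or Kubernetes. It only looks at the mount whose mount point is exactly `/sys/fs/cgroup`.

```shell
# Hypothetical helper: print "ro", "rw", or "none" for the mount at
# /sys/fs/cgroup, reading a mounts table (defaults to /proc/mounts).
cgroup_mount_mode() {
  mounts_file="${1:-/proc/mounts}"
  awk '$2 == "/sys/fs/cgroup" {
         # Field 4 holds the comma-separated mount options.
         n = split($4, opts, ",")
         mode = "rw"
         for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
         found = 1
       }
       END { print (found ? mode : "none") }' "$mounts_file"
}
```

Inside the worker container it can be called with no argument, or against a copy of `/proc/mounts` fetched with `kubectl exec`. If it prints `ro`, I would expect `cgexec` to fail no matter which user the worker runs as, but that is my assumption rather than something confirmed here.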

jasonschroeder-sfdc commented 1 month ago

Hi! Can you share which userId your bazel-buildfarm-shard-worker-0 is running as?

We might need to use Security Context to add additional permissions, like


    securityContext:
      capabilities:
        add: ["SETUID", "SETGID"]

monaka commented 1 month ago

> Hi! Can you share which userId your bazel-buildfarm-shard-worker-0 is running as?

    % kubectl exec -it -n bazel-buildfarm bazel-buildfarm-shard-worker-0 -- id
    uid=0(root) gid=0(root) groups=0(root)

And the worker already has the setgid and setuid capabilities...

    ## In `bazel-buildfarm-shard-worker-0` and after running `apt install libcap2-bin`.

    root@bazel-buildfarm-shard-worker-0:/# ps -ef
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 01:32 ?        00:00:02 /tini -- java -jar /app/build_buildfarm/buildfarm-shard-worker_deploy.jar --public_name=172.20.136.233:8982
    root           7       1 35 01:32 ?        07:20:06 java -jar /app/build_buildfarm/buildfarm-shard-worker_deploy.jar --public_name=172.20.136.233:8982
    root         517       0  0 22:07 pts/0    00:00:00 bash
    root         799     517  0 22:15 pts/0    00:00:00 ps -ef
    root@bazel-buildfarm-shard-worker-0:/# getpcaps 7
    7: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
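
To avoid reading that long list by eye, a small check like the following could be used. It is only a sketch: `has_cap` is a made-up helper name, and it assumes the `PID: cap_a,cap_b=ep` output format shown by `getpcaps` above (other libcap versions may print a different format).

```shell
# Hypothetical helper: succeed if capability $1 appears in the
# getpcaps output line $2 (format assumed: "7: cap_a,cap_b=ep").
has_cap() {
  caps="${2#*: }"      # drop the leading "PID: " part
  caps="${caps%%=*}"   # drop the trailing "=ep" flags
  case ",$caps," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

With the line printed above, `has_cap cap_setuid "$line"` succeeds while `has_cap cap_sys_admin "$line"` fails, since `cap_sys_admin` is not in the list. Whether that missing capability matters for `cgexec` is a guess on my side.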

monaka commented 1 month ago

Judging from the earlier commit https://github.com/bazelbuild/bazel-buildfarm/commit/b7f56613d1aa5a963beb254028c1ea8e86acb17c , I think cgroups can be disabled in workers. Does this combination of settings still work? Is there anything I should keep in mind?

    worker:
      limitGlobalExecution: true
      sandboxSettings:
        alwaysUseSandbox: false
        alwaysUseCgroups: false

jasonschroeder-sfdc commented 1 month ago

To be honest, I am not running Buildfarm as userid=0; I am running it as a non-privileged user. I'll have to experiment a bit more.

monaka commented 3 weeks ago

According to the Kubernetes issue https://github.com/kubernetes/kubernetes/issues/121190, Pods can't create nested cgroups even if the nodes support cgroup v2 ...
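
For anyone who wants to inspect this on their own nodes: on cgroup v2, a cgroup lists the controllers it passes down to its children in the `cgroup.subtree_control` file. The sketch below checks that file for one controller. `controller_enabled_for_children` is a hypothetical helper name, and the default path `/sys/fs/cgroup` is an assumption about where the container sees its cgroup root. It says nothing about whether the Pod is allowed to write there.

```shell
# Hypothetical helper: is controller $1 enabled for child cgroups of
# directory $2 (default /sys/fs/cgroup)?
# Returns 0 = yes, 1 = no, 2 = cgroup.subtree_control not readable.
controller_enabled_for_children() {
  dir="${2:-/sys/fs/cgroup}"
  [ -r "$dir/cgroup.subtree_control" ] || return 2
  # The file holds a space-separated list of controller names.
  for c in $(cat "$dir/cgroup.subtree_control"); do
    if [ "$c" = "$1" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, `controller_enabled_for_children cpu` run inside the worker container would show whether the `cpu` controller is handed down at that level. A return code of 2 would suggest the node is on cgroup v1 or the file is hidden from the container.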