bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Subprocesses should be started from a long-lived process #16649

Open larsrc-google opened 2 years ago

larsrc-google commented 2 years ago

Description of the bug:

Currently, a subprocess is started from whichever thread happens to need it. The linux-sandbox sets PR_SET_PDEATHSIG so that the sandboxed process dies when its parent dies; on Linux, the "parent" in this case is the thread that created the process, not the parent process as a whole. That has been fine until now, but with Project Loom we can no longer assume anything about the lifetime of an OS-level thread. Additionally, using the linux-sandbox for a worker can cause that worker to die if the thread that spawned it happens to go away.

Instead, we should have a long-lived thread (non-Loomified) that handles spawning subprocesses, and have the sandboxing code make sure to kill the subprocess if the action gets interrupted. That thread can then use PR_SET_CHILD_SUBREAPER to make the child processes go away when it dies.
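
A rough sketch of what such a spawner thread could look like on the Java side. This is only an illustration: `SubprocessSpawner`, `spawn`, `spawnThreadName` and `waitOrKill` are made-up names, not existing Bazel classes or methods, and it assumes the JDK starts the child from the thread that calls `ProcessBuilder.start()`.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper (not a Bazel class): every subprocess is started from one long-lived thread. */
final class SubprocessSpawner {
  // One dedicated platform (non-virtual) thread that is reused for every launch.
  private final ExecutorService spawnThread =
      Executors.newSingleThreadExecutor(
          runnable -> {
            Thread t = new Thread(runnable, "subprocess-spawner");
            t.setDaemon(true); // do not keep the JVM alive just for spawning
            return t;
          });

  /** Runs the task on the spawner thread and blocks the caller until it finishes. */
  private <T> T runOnSpawnThread(Callable<T> task) {
    try {
      return spawnThread.submit(task).get();
    } catch (ExecutionException e) {
      throw new IllegalStateException("task failed on spawner thread", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
      throw new IllegalStateException("interrupted while waiting for spawner thread", e);
    }
  }

  /** Starts the command from the spawner thread, whichever thread the caller is on. */
  Process spawn(List<String> command) {
    return runOnSpawnThread(() -> new ProcessBuilder(command).start());
  }

  /** Reports the name of the thread that performs the launches (useful for checks). */
  String spawnThreadName() {
    return runOnSpawnThread(() -> Thread.currentThread().getName());
  }

  /** Waits for the process; if the waiting (action) thread is interrupted, kills the subprocess. */
  static int waitOrKill(Process process) throws InterruptedException {
    try {
      return process.waitFor();
    } catch (InterruptedException e) {
      process.destroyForcibly();
      throw e;
    }
  }
}
```

An action running on a short-lived or virtual thread would call `spawner.spawn(command)` and then `SubprocessSpawner.waitOrKill(process)`. The subprocess's parent thread is then always `subprocess-spawner`, so its lifetime no longer depends on the calling thread, and interruption of the action is handled explicitly rather than through the parent-death signal.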

This came up while working on linux-sandbox support for workers and enabling the integration tests for that case.

Which operating system are you running Bazel on?

Linux

larsrc-google commented 1 year ago

I've worked around this issue for the sandboxed worker, but this is still going to be a problem with Loom.

github-actions[bot] commented 1 month ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.