Open xiaoluzi0050 opened 11 months ago
I am having the same issue on sccache 0.7.6. It works fine limiting the build and tests to 20 cores, but our machine has 72 available.
Has anyone opened an issue on the tokio package? This looks like a problem with the tokio runtime's parallelism.
If so, please link it to this issue.
Just adding that this is very easy to hit inside podman containers. The default PID limit inside a podman container is 2048. We are launching compilation on a machine in a build farm with 192 cores. This fails basically every time.
The workaround, of course, is to raise the limit, but it would be nice if an EAGAIN from fork() were treated as a transient error and simply retried.
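To make the retry idea concrete, here is a minimal Rust sketch. It is not sccache's or tokio's actual code; `retry_on_would_block` and `spawn_with_retry` are hypothetical helpers, and the attempt count and delays are arbitrary assumptions. It relies on `std::thread::Builder::spawn` returning an `io::Result`, where EAGAIN surfaces as `ErrorKind::WouldBlock` on Linux (as in the panic message reported in this thread).

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `op` until it succeeds, retrying only when the
/// error kind is `WouldBlock` (EAGAIN) and attempts remain. The delay doubles
/// after every failed attempt. Any other error is returned immediately.
pub fn retry_on_would_block<T, F>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: F,
) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::WouldBlock && attempt < max_attempts => {
                // Give other tasks time to exit and free PIDs before retrying.
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            // Success, a non-transient error, or the final failed attempt.
            other => return other,
        }
    }
}

/// Hypothetical wrapper: spawns a worker thread, retrying on transient EAGAIN.
/// `make_job` is a factory because a closure handed to a failed spawn is
/// consumed and cannot be reused. The limits (5 attempts, 10 ms) are assumed.
pub fn spawn_with_retry<F>(make_job: F) -> io::Result<thread::JoinHandle<()>>
where
    F: Fn() -> Box<dyn FnOnce() + Send + 'static>,
{
    retry_on_would_block(5, Duration::from_millis(10), || {
        thread::Builder::new().spawn(make_job())
    })
}
```

A bounded retry like this only helps when the limit is hit transiently. If the configured parallelism permanently exceeds the PID limit, the last attempt still fails and the error is returned to the caller.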
The default PID limit inside a podman container is 4194304, but the issue still occurs sporadically.
@xiaoluzi0050 Are you sure you are looking at the effective limits, and not at what is exposed in /proc?
~$ podman run --detach docker.io/debian:bookworm-slim
e0c19c25ff564a27290149068b76f85e5117042cac3218b3b0dd48436d8dbd58
~$ podman inspect --format '{{ .HostConfig.PidsLimit }}' e0c19c25ff564a27290149068b76f85e5117042cac3218b3b0dd48436d8dbd58
2048
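The effective limit can also be checked from inside the container. The Rust sketch below assumes a cgroup v2 host where the container's own cgroup is mounted at `/sys/fs/cgroup`, so that `pids.max` holds either `max` or a number. The path and the names `PidsLimit`, `parse_pids_max` and `read_pids_limit` are assumptions for illustration, not part of sccache or podman.

```rust
use std::fs;

/// Effective task limit of the current cgroup, as read from `pids.max`.
#[derive(Debug, PartialEq, Eq)]
pub enum PidsLimit {
    Unlimited,
    Limited(u64),
}

/// Parses the contents of a cgroup v2 `pids.max` file: the literal "max"
/// or a decimal number. Returns `None` for anything else.
pub fn parse_pids_max(raw: &str) -> Option<PidsLimit> {
    match raw.trim() {
        "max" => Some(PidsLimit::Unlimited),
        n => n.parse().ok().map(PidsLimit::Limited),
    }
}

/// Reads the limit from the assumed cgroup v2 location. Returns `None` if
/// the file is missing (for example on cgroup v1) or cannot be parsed.
pub fn read_pids_limit() -> Option<PidsLimit> {
    let raw = fs::read_to_string("/sys/fs/cgroup/pids.max").ok()?;
    parse_pids_max(&raw)
}
```

In the container from the podman session above, `read_pids_limit()` would be expected to report the same 2048 that `podman inspect` shows, provided the assumptions about the cgroup layout hold.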
thread 'main' panicked at 'failed to spawn thread: Os {code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/jobserver.rs:53:18
thread 'main' panicked at 'OS can't spawn worker thread: Resource temporarily unavailable (os error 11)', /home/.cargo/registry/src/github.com/tokio-1.28.2/src/runtime/scheduler/multi_thread/worker.rs:365:13
make[2]: fork: Resource temporarily unavailable
sccache version: 0.7.1. The server has 256 cores, and we compile inside Docker with the parallelism set to 100. The issue above occurs sporadically, and it becomes more likely as the parallelism value is raised.