freebsd / poudriere

Port/Package build and test system
https://github.com/freebsd/poudriere/wiki
BSD 2-Clause "Simplified" License

Dynamic Number of Jobs Limit #1074

Open cschuber opened 10 months ago

cschuber commented 10 months ago

What is your proposal?

bulk -J N performs a bulk build using N jails. On a machine with 4 cores, running -J 4 or higher can overwhelm the machine, especially when DISABLE_MAKE_JOBS is not set. It would be nice to be able to define a maximum load average: once it is exceeded, then as running jobs complete, no new jails would be started to build the next package in the queue until the load average per CPU drops below the predefined level.
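
A minimal sketch of the gating rule described above, assuming the check is made each time a builder becomes free. The function names and the `max_load_per_cpu` threshold are hypothetical and are not part of poudriere; only `os.getloadavg()` and `os.cpu_count()` are real Python standard library calls.

```python
import os


def load_per_cpu(load_avg: float, ncpu: int) -> float:
    """Normalize a system load average by the number of CPUs."""
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    return load_avg / ncpu


def may_start_builder(load_avg: float, ncpu: int, max_load_per_cpu: float) -> bool:
    """Return True only while the per-CPU load is below the threshold.

    Hypothetical policy: running builds are never interrupted; the
    check only decides whether a free jail picks up the next package.
    """
    return load_per_cpu(load_avg, ncpu) < max_load_per_cpu


def may_start_builder_now(max_load_per_cpu: float) -> bool:
    """Same decision, using the live 1-minute load average of this host."""
    one_minute, _five, _fifteen = os.getloadavg()
    return may_start_builder(one_minute, os.cpu_count() or 1, max_load_per_cpu)
```

With 4 cores and a threshold of 1.0 per CPU, a load average of 3.0 would let the next package start, while 6.0 would hold the free jail idle until the load drops.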

What is the existing behavior, if any?

A machine can become overwhelmed when MAKE_JOBS is active and -J >= cores/threads.

What is the motivation / use case for the change?

High load averages.

Did you consider any alternatives?

I did consider creating my own patch and submitting a pull request but I have other FreeBSD projects on the go that require my full attention (one in particular) at the moment.

Is this really a ports feature request?

This is purely a poudriere thing.

Example

poudriere bulk -j amd64-head -J 8::16 ...

Here, the third field of -J is the maximum load average (one might calculate it as the desired per-CPU load average * the number of cores or threads). Alternatively, we could use a -l argument or even put a variable in poudriere.conf.
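
To make the proposed syntax concrete, here is a rough sketch of how a colon-separated value such as 8::16 could be split into fields. The third field is the proposal from this issue and does not exist in poudriere today; `JobSpec`, `parse_jobs_spec`, and the field names are hypothetical, and the second field is simply passed through without interpreting it.

```python
from typing import NamedTuple, Optional


class JobSpec(NamedTuple):
    max_jobs: int                # first field: number of builders
    second_field: Optional[int]  # second field, passed through untouched
    max_load: Optional[float]    # proposed third field: maximum load average


def parse_jobs_spec(spec: str) -> JobSpec:
    """Parse 'N', 'N:M', or the proposed 'N:M:L' / 'N::L' forms."""
    parts = spec.split(":")
    if len(parts) > 3 or not parts[0]:
        raise ValueError(f"invalid jobs spec: {spec!r}")
    # Pad so that missing trailing fields read as empty strings.
    parts += [""] * (3 - len(parts))
    max_jobs = int(parts[0])
    if max_jobs < 1:
        raise ValueError("the number of jobs must be at least 1")
    second = int(parts[1]) if parts[1] else None
    max_load = float(parts[2]) if parts[2] else None
    return JobSpec(max_jobs, second, max_load)
```

Under these assumptions, `parse_jobs_spec("8::16")` yields 8 builders, an empty second field, and a load limit of 16.0, while a plain `"8"` leaves the limit unset so existing invocations keep their current meaning.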

vasi commented 2 months ago

This could be really hard to do! I see a few options:

  1. Don't start a build unless resource usage is low. This would be hard to get right--for example, if two large packages are in the lib-depends phase, load would look low, so poudriere would think it was safe to start a new large package.
  2. Control resource usage post hoc. There are a few potential ways to do this, but they all have limitations:
     a. Pause some builds if resource usage gets too high. It's unclear whether there's a safe way to do this; maybe something like SIGSTOP?
     b. Apply resource usage limits to builds if usage gets too high. Maybe use rctl to limit all but one job to a single CPU?
     c. Apply a global jobs limit, rather than a per-builder one. It's probably possible to pass the private -J fd:fd option to bsdmake for each job? But it's unclear whether all sub-operations will really respect that.
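
As an illustration of the pause/resume idea in option 2, the sketch below separates the decision from the signalling. It assumes two thresholds (a high one that triggers pausing and a lower one that triggers resuming) so that builds do not flap around a single limit. All names are hypothetical, and whether stopping a build mid-flight is actually safe remains the open question raised above.

```python
import os
import signal


def next_action(load: float, high: float, low: float,
                running: int, paused: int) -> str:
    """Decide what to do with one builder: 'pause', 'resume', or 'hold'.

    Hypothetical policy: always keep at least one builder running so the
    queue makes progress, and only resume once load falls below `low`.
    """
    if load > high and running > 1:
        return "pause"
    if load < low and paused > 0:
        return "resume"
    return "hold"


def apply_action(action: str, pid: int) -> None:
    """Send the stop or continue signal to a single process.

    A real builder is a whole process tree, so signalling one PID is an
    oversimplification made for this sketch.
    """
    if action == "pause":
        os.kill(pid, signal.SIGSTOP)
    elif action == "resume":
        os.kill(pid, signal.SIGCONT)
```

The gap between `high` and `low` is the design choice here: with a single threshold, a paused build would lower the load, get resumed, and immediately push the load back over the limit.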

bdrewery commented 2 months ago

See also #613 and #516