Open BuildStream-Migration-Bot opened 3 years ago
In GitLab by [Gitlab user @tristanvb] on Jun 1, 2020, 08:43
marked this issue as related to #185
In GitLab by [Gitlab user @tristanvb] on Jun 1, 2020, 08:43
marked this issue as related to #633
In GitLab by [Gitlab user @tristanvb] on Jun 1, 2020, 09:06
The current approach is the `max-jobs` user configuration option and the accompanying `--max-jobs` command line option, which is exported as a hint to BuildElement implementations which can in turn communicate this to their build systems (the `make` element uses this to set `-j %{max-jobs}`).
This sort of thing has been discussed a lot; surprisingly, I don't think we have a specific issue already open for this in particular :)
This is first of all related to #185 and #633.
Off the top of my head, I can think of a couple of approaches which have mostly come up in the past.
A job server is a simple token system which distributes tokens to active jobs: schedulers request a token from the job server and wait for one in order to launch a job. GNU Make implements one.
One might imagine a system, however, where BuildStream implements some standardized job server which can integrate with the various BuildElement implementations that support some API, exposing a socket or such for this within the execution Sandbox.
BuildStream does not have any preference for a specific build system, and it will be impossible to really support every build system, as not every build system will even have support for a job server (in the way that a tool like `make` supports the make job server protocol).
This approach to the problem is also complicated by remote execution, and might require additions to the standard REAPI to pursue at all.
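For illustration, here is a minimal sketch in Python of the token idea behind a make-style job server (this is not BuildStream's scheduler code, and the class and method names are hypothetical): a fixed number of tokens live in a pipe, and a job must read one before starting and write it back when it finishes.

```python
# Minimal sketch of a make-style job server token pool (illustrative only).
# N parallel slots are represented by N one-byte tokens in a pipe; reading
# blocks when no slot is free.
import os

class JobServer:
    def __init__(self, slots: int):
        # The pipe holds one byte per available build slot.
        self.read_fd, self.write_fd = os.pipe()
        os.write(self.write_fd, b"+" * slots)

    def acquire(self) -> bytes:
        # Blocks until a token (one byte) is available.
        return os.read(self.read_fd, 1)

    def release(self, token: bytes) -> None:
        # Return the token so another job can start.
        os.write(self.write_fd, token)


if __name__ == "__main__":
    js = JobServer(slots=4)
    token = js.acquire()   # would block if all 4 slots were taken
    try:
        pass               # run the build job here
    finally:
        js.release(token)
```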
Specifically in relation to #185, one has to consider not only available processing on a system but also available memory; we often run into situations where handing out too many parallel jobs causes builds (like WebKit, for instance) to fail at various link stages due to OOM scenarios.
We've found that a safe assumption for a build is that you need roughly 2G of RAM on the system for every process a build is allowed to run in parallel (mileage may vary, of course, but this is generally a safe bet).
Along the same line of thinking, it's possible that we allow users to make very simple attributions as to the weight of a given build (or a job in BuildStream terminology, which might consist of many parallel jobs).
In this scenario we might be able to say that a build by default requires 2 units, which might mean 4G of RAM and 2 available processors or threads, but allow users to increase or decrease the number of units required.
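As a rough sketch of how such unit accounting could look, assuming the 2G-per-process heuristic above (the function names and the "1 unit = 1 thread + 2G of RAM" mapping are illustrative assumptions, not an existing BuildStream option):

```python
# Sketch of resource-unit accounting (illustrative heuristic only).
import os

GIB_PER_UNIT = 2  # assumed mapping: 1 unit = 1 thread + 2G of RAM

def total_units(mem_gib, cpus=None):
    """Total units the machine can hand out, limited by both RAM and CPU count."""
    cpus = cpus or os.cpu_count() or 1
    return min(cpus, int(mem_gib // GIB_PER_UNIT))

def can_schedule(job_units, units_in_use, capacity):
    """A job declaring e.g. 2 units (4G RAM, 2 threads) only starts when it fits."""
    return units_in_use + job_units <= capacity

# e.g. on a 16 GiB, 8-thread machine: total_units(16) == 8, so a default
# 2-unit build leaves room for three more builds of the same weight.
```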
This approach is also interesting because of remote execution: we need to ensure that we don't bust resources on workers in a remote execution cloud. I'm not sure how this would play into REAPI and related tooling like BuildGrid, BuildBox, and Bazel.
In GitLab by [Gitlab user @lle-bout] on Jun 1, 2020, 07:15
GNU make has `-l` for limiting the number of jobs according to load average. See: https://www.gnu.org/software/make/manual/html_node/Parallel.html
Gentoo's emerge has `--load-average` with a similar effect. See: https://wiki.gentoo.org/wiki/EMERGE_DEFAULT_OPTS
I am thinking BuildStream could take advantage of such an option to spawn more build jobs in parallel until the load average reaches a configurable value.
It has the advantage of being quite trivial to implement without any changes or manual tagging of build recipes. In short, an easy performance win.
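As a minimal sketch of the idea, assuming a hypothetical load-average limit setting and using Python's `os.getloadavg()` (Unix-only), the scheduler would simply wait before spawning each new job while the 1-minute load average is at or above the limit:

```python
# Sketch of load-average throttling, in the spirit of `make -l` and emerge's
# --load-average; the limit value and polling interval are assumptions.
import os
import time

def wait_for_load(limit, poll_seconds=1.0):
    """Block before spawning a new build job while the 1-minute load
    average is at or above the configured limit."""
    while os.getloadavg()[0] >= limit:
        time.sleep(poll_seconds)

# e.g. before the scheduler launches each job:
#     wait_for_load(limit=os.cpu_count() * 1.5)
```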