Open maralorn opened 2 years ago
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/nix-2-4-and-what-s-next/16257/1
For reference, here's the discussion that initiated the changed behavior of --max-jobs 0
in 2.4: https://github.com/NixOS/nix/issues/4442#issuecomment-758319324
I agree with @maralorn that it should be able to instruct Nix both to never build anything locally (which is what --max-jobs 0
means in 2.4), and to only build derivations with preferLocalBuild
locally (this issue). If we add a new option for the latter case, then the max-jobs
option would still make sense for controlling how many preferLocalBuild
jobs that can run in parallel.
Hello!
Is there an update on this front? We are upgrading some infrastructure from nix 2.3 to 2.4, and we are running into the issue (presumably) that preferLocalBuild
isn't honoured in combination with --max--jobs 0
. Our expected behaviour is: Build nothing locally except derivations marked by preferLocalBuild = true
.
According to this issue (or at least how I read it here), it was an intentional change with not migration path, is that still the case?
Adding an option that signals that every derivation except those marked by perferLocalBuild = true
sounds like a sensible solution to me.
@fendor Can you detail the issues you have with preferLocalBuild
derivations being sent to remote builders? It would be helpful to have clear use cases for this new option.
The details are very similar to the original issue: We have multiple build servers and not-that-beefy vm's that should not be used for executing CPU intensive tasks. To achieve that, we use max-jobs 0
. If we had something like max-jobs 1
we might run into the risk of executing CPU intensive tasks once in a while, slowing down the overall build-process.
The issue now comes when we have cheap jobs, such as bundling a lot of symbolic links together. Afaict, all derivations need to be uploaded to a single builder (which is very unlikely to have all derivation locally available), a cheap operation is performed, and then we downloaded the result again. In this case, it would have been much quicker to perform the operation locally.
Does this clarify our use-case?
How about adding a flag --prefer-remote-builders
that, when enabled, has the following semantics:
max-jobs
for the number of jobs to execute locallypreferLocalBuild
is given, build this derivation locally.I am looking into reviving the original proposal from NixOS/nix#3810
TLDR;
Concretely, I want to:
- extend the nix.conf to accept a mandatory-features option
- treat derivations where preferLocalBuild == true as if they had the "local-preferred" feature
- add local-preferred to the default value of system-features
However, I feel like this doesn't quite catch all the use-cases. In the table here:
max-jobs | lp:local | lp:remote-builder | Outcome |
---|---|---|---|
0 | x | 0 | Execute all builds remotely, fail if drv has preferLocalBuild
|
0 | x | 1 | Execute all builds remotely, build preferLocalBuild drv remotely |
auto | 0 | 0 | Execute builds remotely and locally, fail if drv has preferLocalBuild
|
auto | 0 | 1 | Execute builds remotely and locally, build preferLocalBuild remotely |
auto | 1 | 0 | Execute builds remotely and locally, build preferLocalBuild locally |
auto | 1 | 1 | Execute builds remotely and locally, build preferLocalBuild remotely and locally |
With the following caption:
max-jobs
are the numbers of job available locally. Values 0
or auto
since these are the only relevant oneslp:local
: local-preferred
feature enabled locallylp:remote-builder
: local-preferred
feature enabled on at least one remote builder.x
: "value doesn't matter"Outcome
: expected build outcome.Implicit assumption: at least one remote builder is available, otherwise everything with remote only
fails.
However, given these outcomes, I am actually missing the following:
Execute builds remotely, build `preferLocalBuild` locally
.Did I miss anything, or is that not expressible?
@fendor Perhaps if we had a "remote-preferred" (imaginary) feature we can fill out the table?
I like this idea, and shows that the idea generalises nicely.
However, the name remote-preferred
is probably not ideal as it feels a bit analogous to local-preferred
and what would that mean if a remote-server has this feature enabled?
I am not quite sure what you are asking? local-prefered
and remote-prefered
applied to machines are arbitrarily does indeed allow some silly combinations, but I am not sure how to prevent that given there are evidently so many opinions on what this feature ought to do!
Ok, I have filled out the behaviour matrix again :D
Assumptions, at least one builder with none of the two features enabled.
max-jobs | local-preferred |
remote-preferred |
Outcome |
---|---|---|---|
0 | x | x | Build remotely, fail if drv has preferLocalBuild |
auto | 0 | 0 | Build remotely and locally, fail if drv has preferLocalBuild |
auto | 0 | 1 | Build remotely, fail if drv has preferLocalBuild |
auto | 1 | 0 | Build remotely and locally, but build preferLocalBuild locally |
auto | 1 | 1 | Build remotely, but build preferLocalBuild locally |
Covers my use-case, and in combination from above, seems to be satisfactory.
What I meant is basically bike-shedding about the name. remote-preferred
feels like the opposite of local-preferred
, but it is not, the latter is a special feature for the attribute preferLocalBuild
while the former describes where derivations are built by default. remote-preferred
meaning basically: "not here if possible", while local-preferred
means "only here".
However, modulo the name, I think the feature is pretty nice, as it is now possible to specify that a remote server is only responsible for building drv that have a preferLocalBuild
attribute. (I am not sure, though, whether this is easy to implement in the code)
Looking good, are we covering all cases now?
Hmm so the way the features normally work is that if a derivation has a feature, it must be built with the builder with that feature.
I was thinking both local-preferred
and remote-preferred
would be of that nature, so we could also (literally or figuratively, not sure) set the feature on the relevant builders, but vary whether what the derivations one require.
Ah, I see, I had a slight misunderstanding of how features work. Then I still don't like the names, but the idea is just as valid, even better, more predictable.
Sure. it might just be better to call them local
and remote
. Derivations can require locality or remoteness, and likewise builders and offer locality or remoteness.
Still feels weird, since if a derivation requires the feature remote
it will still be built locally if the local machine has the remote
feature, right?
Just feels a bit off.
@fendor likewise is a derivation requires local and the remote machine has the local feature? Icky, but symmetric. No?
We also have a similar situation. We already have a pseudo-system-feature for derivations we can try building on the low-powered coordinator. Perhaps another solution could be creating a mandatory-system-features
setting to mirror the mandatoryFeatures
build machine configuration?
In case it's useful to anybody, I've been carrying the attached patch for a while which does this for Nix>2.3:
diff --git a/src/libstore/build/derivation-goal.cc b/src/libstore/build/derivation-goal.cc
index df7d21e54..2d773a4df 100644
--- a/src/libstore/build/derivation-goal.cc
+++ b/src/libstore/build/derivation-goal.cc
@@ -1083,10 +1083,14 @@ HookReply DerivationGoal::tryBuildHook()
try {
+ bool tryBuildLocally = worker.getNrLocalBuilds() < settings.maxBuildJobs;
+ if (settings.buildLocalOnlyIfPreferred && !parsedDrv->preferBuildLocally())
+ tryBuildLocally = false;
+
/* Send the request to the hook. */
worker.hook->sink
<< "try"
- << (worker.getNrLocalBuilds() < settings.maxBuildJobs ? 1 : 0)
+ << (tryBuildLocally ? 1 : 0)
<< drv->platform
<< worker.store.printStorePath(drvPath)
<< parsedDrv->getRequiredSystemFeatures();
diff --git a/src/libstore/build/worker.cc b/src/libstore/build/worker.cc
index ee334d54a..83282d300 100644
--- a/src/libstore/build/worker.cc
+++ b/src/libstore/build/worker.cc
@@ -236,8 +236,11 @@ void Worker::waitForBuildSlot(GoalPtr goal)
{
debug("wait for build slot");
bool isSubstitutionGoal = goal->jobCategory() == JobCategory::Substitution;
- if ((!isSubstitutionGoal && getNrLocalBuilds() < settings.maxBuildJobs) ||
- (isSubstitutionGoal && getNrSubstitutions() < settings.maxSubstitutionJobs))
+ int maxJobs = isSubstitutionGoal
+ ? settings.maxSubstitutionJobs
+ : settings.maxBuildJobs; // FIXME
+ int curJobs = isSubstitutionGoal ? getNrSubstitutions() : getNrLocalBuilds();
+ if (curJobs < maxJobs)
wakeUp(goal); /* we can do it right away */
else
addToWeakGoals(wantingToBuild, goal);
diff --git a/src/libstore/globals.hh b/src/libstore/globals.hh
index 07f524858..590f7e5ea 100644
--- a/src/libstore/globals.hh
+++ b/src/libstore/globals.hh
@@ -159,6 +159,16 @@ public:
)",
{"build-max-jobs"}};
+ Setting<bool> buildLocalOnlyIfPreferred {
+ this, false,
+ "build-local-only-if-preferred",
+ R"(
+ If set to `true`, Nix will prevent local builds
+ (i.e. behave as if `max-jobs==0`) except for derivations
+ which have `preferLocalBuild` set to `true`.
+ )",
+ {"build-fallback"}};
+
Setting<unsigned int> maxSubstitutionJobs{
this, 16, "max-substitution-jobs",
R"(
diff --git a/src/libstore/parsed-derivations.cc b/src/libstore/parsed-derivations.cc
index cc4a94fab..1bbef4ffc 100644
--- a/src/libstore/parsed-derivations.cc
+++ b/src/libstore/parsed-derivations.cc
@@ -105,9 +105,12 @@ bool ParsedDerivation::canBuildLocally(Store & localStore) const
&& !drv.isBuiltin())
return false;
- if (settings.maxBuildJobs.get() == 0
- && !drv.isBuiltin())
- return false;
+ if (!drv.isBuiltin()) {
+ if (settings.maxBuildJobs.get() == 0)
+ return false;
+ if (settings.buildLocalOnlyIfPreferred && preferBuildLocally())
+ return false;
+ }
for (auto & feature : getRequiredSystemFeatures())
if (!localStore.systemFeatures.get().count(feature)) return false;
@@ -117,7 +120,12 @@ bool ParsedDerivation::canBuildLocally(Store & localStore) const
bool ParsedDerivation::willBuildLocally(Store & localStore) const
{
- return getBoolAttr("preferLocalBuild") && canBuildLocally(localStore);
+ return preferBuildLocally() && canBuildLocally(localStore);
+}
+
+bool ParsedDerivation::preferBuildLocally() const
+{
+ return getBoolAttr("preferLocalBuild") || drv.isBuiltin();
}
bool ParsedDerivation::substitutesAllowed() const
diff --git a/src/libstore/parsed-derivations.hh b/src/libstore/parsed-derivations.hh
index 71085a604..b3fde6981 100644
--- a/src/libstore/parsed-derivations.hh
+++ b/src/libstore/parsed-derivations.hh
@@ -37,6 +37,8 @@ public:
bool willBuildLocally(Store & localStore) const;
+ bool preferBuildLocally() const;
+
bool substitutesAllowed() const;
bool useUidRange() const;
In case it's useful to anybody, I've been carrying the attached patch for a while which does this for Nix>2.3:
This looks reasonable as a way to control the feature for this use-case. Can you make a PR so we can test/review/discuss?
edit: it might need to be marked as an unstable option
Is your feature request related to a problem? Please describe.
I have a very weak build server, which coordinates my nix builds. I want that all expensive builds definitely don’t happen on local host. That’s why I use
--max-jobs 0
. Derivations withpreferLocalBuild
should happen locally in my context because uploading their closure will often be very expensive, while building them is not.This was exactly the behaviour with nix 2.3 (and other people considered it a bug, see #1798), but I was happy.
With nix 2.4
--max-jobs 0
now overridespreferLocalBuild
. That broke my use case and increased traffic on my build servers a lot. [1]Describe the solution you'd like Not sure. I guess we need a flag to control this considering, that apparently both behaviours are wanted by some users.
Describe alternatives you've considered I have worked around the issue by not using
--max-jobs 0
and configuring so many remote builders (over 100 parallel jobs) that local builds never happen in practice. But that’s certainly not a solution that everyone has the resources for.Footnotes
[1]: This issue got worsened a lot by the fact, that a nix 2.4 client with a nix 2.3 build-server apparently doesn’t respect the
builders-use-substituters
option. It meant that the client had to upload all packages from my system closure to the server just for the server to put the symlinks together.