NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1
12.18k stars 1.47k forks source link

Make it possible to only build derivations with `preferLocalBuild` locally. #5646

Open maralorn opened 2 years ago

maralorn commented 2 years ago

Is your feature request related to a problem? Please describe.

I have a very weak build server, which coordinates my nix builds. I want that all expensive builds definitely don’t happen on local host. That’s why I use --max-jobs 0. Derivations with preferLocalBuild should happen locally in my context because uploading their closure will often be very expensive, while building them is not.

This was exactly the behaviour with nix 2.3 (and other people considered it a bug, see #1798), but I was happy.

With nix 2.4 --max-jobs 0 now overrides preferLocalBuild. That broke my use case and increased traffic on my build servers a lot. [1]

Describe the solution you'd like Not sure. I guess we need a flag to control this considering, that apparently both behaviours are wanted by some users.

Describe alternatives you've considered I have worked around the issue by not using --max-jobs 0 and configuring so many remote builders (over 100 parallel jobs) that local builds never happen in practice. But that’s certainly not a solution that everyone has the resources for.

Footnotes

[1]: This issue got worsened a lot by the fact, that a nix 2.4 client with a nix 2.3 build-server apparently doesn’t respect the builders-use-substitutersoption. It meant that the client had to upload all packages from my system closure to the server just for the server to put the symlinks together.

nixos-discourse commented 2 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-2-4-and-what-s-next/16257/1

rickynils commented 2 years ago

For reference, here's the discussion that initiated the changed behavior of --max-jobs 0 in 2.4: https://github.com/NixOS/nix/issues/4442#issuecomment-758319324

rickynils commented 2 years ago

I agree with @maralorn that it should be able to instruct Nix both to never build anything locally (which is what --max-jobs 0 means in 2.4), and to only build derivations with preferLocalBuild locally (this issue). If we add a new option for the latter case, then the max-jobs option would still make sense for controlling how many preferLocalBuild jobs that can run in parallel.

fendor commented 2 years ago

Hello!

Is there an update on this front? We are upgrading some infrastructure from nix 2.3 to 2.4, and we are running into the issue (presumably) that preferLocalBuild isn't honoured in combination with --max--jobs 0. Our expected behaviour is: Build nothing locally except derivations marked by preferLocalBuild = true.

According to this issue (or at least how I read it here), it was an intentional change with not migration path, is that still the case?

Adding an option that signals that every derivation except those marked by perferLocalBuild = true sounds like a sensible solution to me.

rickynils commented 2 years ago

@fendor Can you detail the issues you have with preferLocalBuild derivations being sent to remote builders? It would be helpful to have clear use cases for this new option.

fendor commented 2 years ago

The details are very similar to the original issue: We have multiple build servers and not-that-beefy vm's that should not be used for executing CPU intensive tasks. To achieve that, we use max-jobs 0. If we had something like max-jobs 1 we might run into the risk of executing CPU intensive tasks once in a while, slowing down the overall build-process.

The issue now comes when we have cheap jobs, such as bundling a lot of symbolic links together. Afaict, all derivations need to be uploaded to a single builder (which is very unlikely to have all derivation locally available), a cheap operation is performed, and then we downloaded the result again. In this case, it would have been much quicker to perform the operation locally.

Does this clarify our use-case?

fendor commented 2 years ago

How about adding a flag --prefer-remote-builders that, when enabled, has the following semantics:

fendor commented 2 years ago

I am looking into reviving the original proposal from NixOS/nix#3810

TLDR;

Concretely, I want to:

However, I feel like this doesn't quite catch all the use-cases. In the table here:

max-jobs lp:local lp:remote-builder Outcome
0 x 0 Execute all builds remotely, fail if drv has preferLocalBuild
0 x 1 Execute all builds remotely, build preferLocalBuild drv remotely
auto 0 0 Execute builds remotely and locally, fail if drv has preferLocalBuild
auto 0 1 Execute builds remotely and locally, build preferLocalBuild remotely
auto 1 0 Execute builds remotely and locally, build preferLocalBuild locally
auto 1 1 Execute builds remotely and locally, build preferLocalBuild remotely and locally

With the following caption:

Implicit assumption: at least one remote builder is available, otherwise everything with remote only fails.

However, given these outcomes, I am actually missing the following:

Did I miss anything, or is that not expressible?

Ericson2314 commented 2 years ago

@fendor Perhaps if we had a "remote-preferred" (imaginary) feature we can fill out the table?

fendor commented 2 years ago

I like this idea, and shows that the idea generalises nicely.

However, the name remote-preferred is probably not ideal as it feels a bit analogous to local-preferred and what would that mean if a remote-server has this feature enabled?

Ericson2314 commented 2 years ago

I am not quite sure what you are asking? local-prefered and remote-prefered applied to machines are arbitrarily does indeed allow some silly combinations, but I am not sure how to prevent that given there are evidently so many opinions on what this feature ought to do!

fendor commented 2 years ago

Ok, I have filled out the behaviour matrix again :D

Assumptions, at least one builder with none of the two features enabled.

max-jobs local-preferred remote-preferred Outcome
0 x x Build remotely, fail if drv has preferLocalBuild
auto 0 0 Build remotely and locally, fail if drv has preferLocalBuild
auto 0 1 Build remotely, fail if drv has preferLocalBuild
auto 1 0 Build remotely and locally, but build preferLocalBuild locally
auto 1 1 Build remotely, but build preferLocalBuild locally

Covers my use-case, and in combination from above, seems to be satisfactory.

What I meant is basically bike-shedding about the name. remote-preferred feels like the opposite of local-preferred, but it is not, the latter is a special feature for the attribute preferLocalBuild while the former describes where derivations are built by default. remote-preferred meaning basically: "not here if possible", while local-preferred means "only here".

However, modulo the name, I think the feature is pretty nice, as it is now possible to specify that a remote server is only responsible for building drv that have a preferLocalBuild attribute. (I am not sure, though, whether this is easy to implement in the code)

Looking good, are we covering all cases now?

Ericson2314 commented 2 years ago

Hmm so the way the features normally work is that if a derivation has a feature, it must be built with the builder with that feature.

I was thinking both local-preferred and remote-preferred would be of that nature, so we could also (literally or figuratively, not sure) set the feature on the relevant builders, but vary whether what the derivations one require.

fendor commented 2 years ago

Ah, I see, I had a slight misunderstanding of how features work. Then I still don't like the names, but the idea is just as valid, even better, more predictable.

Ericson2314 commented 2 years ago

Sure. it might just be better to call them local and remote. Derivations can require locality or remoteness, and likewise builders and offer locality or remoteness.

fendor commented 2 years ago

Still feels weird, since if a derivation requires the feature remote it will still be built locally if the local machine has the remote feature, right? Just feels a bit off.

Ericson2314 commented 2 years ago

@fendor likewise is a derivation requires local and the remote machine has the local feature? Icky, but symmetric. No?

jsoo1 commented 1 year ago

We also have a similar situation. We already have a pseudo-system-feature for derivations we can try building on the low-powered coordinator. Perhaps another solution could be creating a mandatory-system-features setting to mirror the mandatoryFeatures build machine configuration?

ghost commented 10 months ago

In case it's useful to anybody, I've been carrying the attached patch for a while which does this for Nix>2.3:

diff --git a/src/libstore/build/derivation-goal.cc b/src/libstore/build/derivation-goal.cc
index df7d21e54..2d773a4df 100644
--- a/src/libstore/build/derivation-goal.cc
+++ b/src/libstore/build/derivation-goal.cc
@@ -1083,10 +1083,14 @@ HookReply DerivationGoal::tryBuildHook()

     try {

+        bool tryBuildLocally = worker.getNrLocalBuilds() < settings.maxBuildJobs;
+        if (settings.buildLocalOnlyIfPreferred && !parsedDrv->preferBuildLocally())
+            tryBuildLocally = false;
+
         /* Send the request to the hook. */
         worker.hook->sink
             << "try"
-            << (worker.getNrLocalBuilds() < settings.maxBuildJobs ? 1 : 0)
+            << (tryBuildLocally ? 1 : 0)
             << drv->platform
             << worker.store.printStorePath(drvPath)
             << parsedDrv->getRequiredSystemFeatures();
diff --git a/src/libstore/build/worker.cc b/src/libstore/build/worker.cc
index ee334d54a..83282d300 100644
--- a/src/libstore/build/worker.cc
+++ b/src/libstore/build/worker.cc
@@ -236,8 +236,11 @@ void Worker::waitForBuildSlot(GoalPtr goal)
 {
     debug("wait for build slot");
     bool isSubstitutionGoal = goal->jobCategory() == JobCategory::Substitution;
-    if ((!isSubstitutionGoal && getNrLocalBuilds() < settings.maxBuildJobs) ||
-        (isSubstitutionGoal && getNrSubstitutions() < settings.maxSubstitutionJobs))
+    int maxJobs = isSubstitutionGoal
+        ? settings.maxSubstitutionJobs
+        : settings.maxBuildJobs; // FIXME
+    int curJobs = isSubstitutionGoal ? getNrSubstitutions() : getNrLocalBuilds();
+    if (curJobs < maxJobs)
         wakeUp(goal); /* we can do it right away */
     else
         addToWeakGoals(wantingToBuild, goal);
diff --git a/src/libstore/globals.hh b/src/libstore/globals.hh
index 07f524858..590f7e5ea 100644
--- a/src/libstore/globals.hh
+++ b/src/libstore/globals.hh
@@ -159,6 +159,16 @@ public:
         )",
         {"build-max-jobs"}};

+    Setting<bool> buildLocalOnlyIfPreferred {
+        this, false,
+        "build-local-only-if-preferred",
+        R"(
+          If set to `true`, Nix will prevent local builds
+          (i.e. behave as if `max-jobs==0`) except for derivations
+          which have `preferLocalBuild` set to `true`.
+        )",
+        {"build-fallback"}};
+
     Setting<unsigned int> maxSubstitutionJobs{
         this, 16, "max-substitution-jobs",
         R"(
diff --git a/src/libstore/parsed-derivations.cc b/src/libstore/parsed-derivations.cc
index cc4a94fab..1bbef4ffc 100644
--- a/src/libstore/parsed-derivations.cc
+++ b/src/libstore/parsed-derivations.cc
@@ -105,9 +105,12 @@ bool ParsedDerivation::canBuildLocally(Store & localStore) const
         && !drv.isBuiltin())
         return false;

-    if (settings.maxBuildJobs.get() == 0
-        && !drv.isBuiltin())
-        return false;
+    if (!drv.isBuiltin()) {
+        if (settings.maxBuildJobs.get() == 0)
+            return false;
+        if (settings.buildLocalOnlyIfPreferred && preferBuildLocally())
+            return false;
+    }

     for (auto & feature : getRequiredSystemFeatures())
         if (!localStore.systemFeatures.get().count(feature)) return false;
@@ -117,7 +120,12 @@ bool ParsedDerivation::canBuildLocally(Store & localStore) const

 bool ParsedDerivation::willBuildLocally(Store & localStore) const
 {
-    return getBoolAttr("preferLocalBuild") && canBuildLocally(localStore);
+    return preferBuildLocally() && canBuildLocally(localStore);
+}
+
+bool ParsedDerivation::preferBuildLocally() const
+{
+    return getBoolAttr("preferLocalBuild") || drv.isBuiltin();
 }

 bool ParsedDerivation::substitutesAllowed() const
diff --git a/src/libstore/parsed-derivations.hh b/src/libstore/parsed-derivations.hh
index 71085a604..b3fde6981 100644
--- a/src/libstore/parsed-derivations.hh
+++ b/src/libstore/parsed-derivations.hh
@@ -37,6 +37,8 @@ public:

     bool willBuildLocally(Store & localStore) const;

+    bool preferBuildLocally() const;
+
     bool substitutesAllowed() const;

     bool useUidRange() const;
tomberek commented 10 months ago

In case it's useful to anybody, I've been carrying the attached patch for a while which does this for Nix>2.3:

This looks reasonable as a way to control the feature for this use-case. Can you make a PR so we can test/review/discuss?

edit: it might need to be marked as an unstable option