Open pikajude opened 6 years ago
As an aside, with the closer integration between nix and the "build hook", it'd be nice if the debug messages were more helpful than just decline :) (ssh failed, wrong arch, busy, trusted users not setup right, missing feature X)
Added some debug logs in https://github.com/NixOS/nix/pull/3425
I would definitely appreciate more logs about this topic!
-vvv shows something on Nix 2.3.3:
considering building on remote machine 'ssh://static-haskell-nix-ci'
hook reply is 'decline'
But this is still not enough for me. Why does it not work?
I'm building with
--max-jobs 0 -A pkgs.haskellPackages.hspec --argstr system i686-linux -vvv
and have in my --builders file:
static-haskell-nix-ci x86_64-linux - 4 - big-parallel
so perhaps it's that it doesn't want to build on x86_64-linux because I say i686-linux? No idea; it even fails if I replace the architecture in there.
I think the declining should definitely say the reason why.
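For anyone else staring at a bare 'decline': below is a rough Python sketch of the kind of check that could explain it. It is not Nix's actual code. It assumes the usual whitespace-separated builders line (URI, systems, SSH key, max jobs, speed factor, supported features, mandatory features) and the matching rule as I understand it: the requested system must be listed, every required feature must be offered, and every mandatory feature must be requested. The names Machine, parse_machine and decline_reasons are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    uri: str
    systems: set
    max_jobs: int
    supported: set
    mandatory: set


def _csv(value):
    # "-" marks an empty field in a builders line.
    return set() if value == "-" else set(value.split(","))


def parse_machine(line):
    # Pad with "-" so that omitted trailing fields read as empty.
    # Simplification: real Nix falls back to defaults (e.g. the local
    # system) for empty fields; this sketch just treats them as empty.
    f = line.split() + ["-"] * 7
    return Machine(
        uri=f[0],
        systems=_csv(f[1]),
        max_jobs=1 if f[3] == "-" else int(f[3]),
        supported=_csv(f[5]),
        mandatory=_csv(f[6]),
    )


def decline_reasons(machine, system, required_features):
    """Return human-readable reasons why this machine cannot take the build."""
    required = set(required_features)
    reasons = []
    if system not in machine.systems:
        reasons.append(
            f"wrong system: need {system}, machine offers {sorted(machine.systems)}"
        )
    missing = required - machine.supported - machine.mandatory
    if missing:
        reasons.append(f"missing features: {sorted(missing)}")
    unmet = machine.mandatory - required
    if unmet:
        reasons.append(f"mandatory features not requested: {sorted(unmet)}")
    return reasons
```

With the builders line from this comment, decline_reasons(parse_machine("static-haskell-nix-ci x86_64-linux - 4 - big-parallel"), "i686-linux", []) reports a system mismatch, which is the sort of one-line reason the log could carry instead of 'decline'.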
Edit for my memory: Workaround from the commandline:
nix-store -r --builders 'ssh://static-haskell-nix-ci x86_64-linux - - - big-parallel,kvm' --option builders-use-substitutes true --max-jobs 0
cc @arianvp since he got this error yesterday. Seems to be a problem with feature flags, architecture mismatch or lack of trusted-users permission.
This indeed cost me at least 2 days of braintwisting. I was missing a big-parallel feature in my machine and this was happening when I tried building the kernel.
Logging why something is declined would be very appreciated.
I have now moved the debug logs to this separate PR: https://github.com/NixOS/nix/pull/3586
This indeed cost me at least 2 days of braintwisting. I was missing a big-parallel feature in my machine and this was happening when I tried building the kernel.
Aha! This weekend I hit an issue where my build refused to compile LLVM on my enormous remote build machine, and since I couldn't figure out why I resorted to just waiting for it to compile locally on my laptop. If it's any consolation, your comment saved me a good bit of debugging :)
It looks like this particular error has a nice proposal for improvement in @bburdette's https://github.com/bburdette/nix-error-proposal/blob/master/proposal.md#error-example
Looking forward to the day when this scenario produces this instead!
Today, I've also stumbled over this issue. There is really no good way to debug this. :(
I've already donated to https://opencollective.com/nix-errors-enhancement :)
The C++ "none of the overloads of the function matches your call" errors I've been seeing a lot lately (:D :D :D) are a user interface. We might learn from them in e.g. sorting the builders by "distance" to the request, and perhaps only showing n closest matches.
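To make the "distance" idea concrete, here is a small Python sketch. It is purely illustrative and not taken from any Nix PR: the distance is simply the number of mismatches (one for a wrong system, plus one per missing feature), and the machine names and dict layout are hypothetical.

```python
def distance(machine, system, required_features):
    """Count the ways a machine fails to match a build request."""
    d = 0
    if system not in machine["systems"]:
        d += 1
    # One point per required feature the machine does not offer.
    d += len(set(required_features) - machine["features"])
    return d


def closest_builders(machines, system, required_features, n=3):
    """Return (name, distance) for the n machines closest to the request."""
    ranked = sorted(
        machines, key=lambda m: distance(m, system, required_features)
    )
    return [
        (m["name"], distance(m, system, required_features)) for m in ranked[:n]
    ]


# Hypothetical builders, for illustration only.
machines = [
    {"name": "small-box", "systems": {"x86_64-linux"}, "features": set()},
    {"name": "big-box", "systems": {"x86_64-linux"}, "features": {"big-parallel"}},
    {"name": "arm-box", "systems": {"aarch64-linux"}, "features": set()},
]

print(closest_builders(machines, "x86_64-linux", ["big-parallel"], n=2))
```

A distance of 0 means the machine matches; anything above 0 is a near miss worth showing to the user, closest first.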
PR #3897 helps with this.
I made a follow-up PR to add more details in #3927.
I marked this as stale due to inactivity. → More info
I hit this attempting to remote build a system configuration. Helping with the search-ability a bit:
nix build -f '<nixpkgs/nixos>' config.system.build.toplevel --max-jobs 0 --builders 'ssh://nix-builder x86_64-linux - 8 -'
results in
error: unable to start any build; either increase '--max-jobs' or enable remote builds
The fix is to add big-parallel explicitly as mentioned above:
nix build -f '<nixpkgs/nixos>' config.system.build.toplevel --max-jobs 0 --builders 'ssh://nix-builder x86_64-linux - 8 - big-parallel'
Still an issue. Took me half an hour to figure out what was wrong here, with the relevant line buried in thousands of lines of logs. I should not have to crank up the verbosity so much to get that error report. The culprit turned out to be:
ignoring the client-specified setting 'builders', because it is a restricted setting and you are not a trusted user
IMHO this particular case should not trigger a log entry that only shows up on high verbosity levels, but a hard error: I was clearly using this tool wrong.
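Until that happens, a stopgap is to filter the verbose output for the handful of lines that matter. The Python sketch below only does substring matching on messages quoted in this thread; the hint texts are my own reading of the comments above, not official diagnostics, and the function name explain is made up.

```python
# Log fragments quoted in this thread, mapped to a likely cause.
KNOWN_HINTS = {
    "hook reply is 'decline'":
        "build hook declined: check the builder's system and features",
    "because it is a restricted setting and you are not a trusted user":
        "a client-specified setting was ignored: check trusted-users",
    "unable to start any build":
        "no builder accepted the build: check features such as big-parallel",
}


def explain(log_lines):
    """Return the hints whose log fragment appears in the given lines."""
    found = []
    for line in log_lines:
        for needle, hint in KNOWN_HINTS.items():
            if needle in line and hint not in found:
                found.append(hint)
    return found
```

Feeding it the lines captured from a -vvv run returns each matching hint once, so the relevant message no longer has to be dug out of thousands of lines by hand.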
Derivation in question is a fetchgit derivation. In nix 1.11, when a remote builder refused to build a derivation, or nix didn't bother asking the remote builder, there would be a debug message along the lines of hook reply is "decline". No such message is printed anymore, making it a lot harder to figure out what's going wrong: