Open stuser81 opened 1 year ago
Note: On my laptop I ran sudo systemctl stop nix-daemon.service and sudo systemctl start nix-daemon.service but I'm not sure if that was needed.
If you change the nix.conf file you need to restart the daemon.
builders = ssh://192.168.1.80 - - 10 makes sure 10 cores are used on the desktop.
The 10 there is for maxJobs. I would recommend using https://search.nixos.org/options?channel=unstable&from=0&size=50&sort=relevance&type=packages&query=nix.buildMachine to avoid confusion.
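For anyone else tripping over the positional fields, here is a rough annotated sketch of a full builders line, following the field order given in the Nix manual's section on distributed builds. The values are the ones that appear later in this thread; the comments are my own reading rather than anything Nix prints, so please check them against the manual for your Nix version.

```
# /etc/nix/nix.conf on the client (sketch)
# Fields in order; "-" means "use the default":
#   1. store URI            ssh://192.168.1.80
#   2. system type(s)       x86_64-linux
#   3. SSH identity file    -
#   4. max parallel jobs    10   (number of simultaneous builds, not CPU cores)
#   5. speed factor         2
#   6. supported features   benchmark,big-parallel,kvm,nixos-test
#   7. mandatory features   -
#   8. public host key      -
builders = ssh://192.168.1.80 x86_64-linux - 10 2 benchmark,big-parallel,kvm,nixos-test - -
```

As far as I understand it, how many cores each individual build may use is governed separately by the cores setting on the build machine.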
Modifying ~/.config/nix/nix.conf on my laptop wasn't causing the remote builder to get registered at all. (This is possibly a separate bug altogether.)
No, it is not. The builders are read by the daemon, which only reads /etc/nix/nix.conf.
This is unexpected because as we already saw, cache.nixos.org should already have everything needed. (I have not modified the default substituters on the desktop NixOS machine.) It seems that builders-use-substitutes = true is not working properly.
There is likely something else happening, too. Is there any log indicating that the substitution failed? What are the substituters on the build machine? Is the command working as expected if you run it directly on the build machine?
What are the substituters on the build machine? Is the command working as expected if you run it directly on the build machine?
@SuperSandro2000 I have not changed the default substituters on the build machine (it's still cache.nixos.org there). Yes, nix-shell -p ghc works fine on the desktop build machine (just like it did on the laptop before I made the nix.conf changes).
I did a bit more digging:
Then I noticed interesting things:
- If I run nix-shell -p ghc on the desktop build machine, subsequent nix-shell -p ghc on the laptop works fine and copies the GHC from the desktop to the laptop because it's already available on the desktop (it's the same GHC since we're on the same channel on both machines now).
- If I run nix-shell -p ghc on the laptop (without doing anything on the desktop build machine), the desktop build machine starts building things - instead of grabbing things from cache.nixos.org. This is the problem identified in my original post. Any ideas what could be causing this?
Here is a wild guess (take everything below with a grain of salt because it's just a guess):
- builders-use-substitutes behavior: "In practical terms, this means that remote hosts will fetch as many build dependencies as possible from their own substitutes (e.g, from cache.nixos.org), instead of waiting for this host to upload them all."
- ... cache.nixos.org, the idea being it's adding a new "competitor" to the laptop.
- Why not make the desktop build machine always try to grab things from cache.nixos.org?
Does this guess have any merit?
- Why not make the desktop build machine always try to grab things from cache.nixos.org?
I recently misconfigured my substituters setting and, because of that, built everything on remote builders, which in fact downloaded the derivations from cache.nixos.org.
- Any ideas what could be causing this?
Not really. How are you installing Nix? Are you using the installer? Can you try it with a NixOS machine?
Can you try it with a NixOS machine?
@SuperSandro2000 I just tried. I noticed that on NixOS, nix.settings.substituters = ["ssh://192.168.1.80"] actually results in https://cache.nixos.org/ getting appended to the end automatically (as you can see with nix show-config). This does not happen with my Nix multi-user install on Debian.
So I suspect you were hitting https://cache.nixos.org/ through the laptop machine, not the desktop build machine.
So I believe we've been comparing apples and oranges.
Is there any way to get NixOS to not do this strange automatic appending? Related: https://github.com/NixOS/nixpkgs/issues/158356 It seems people want this automatic appending, which feels strange to me. What about people who also want to offload the cache downloading to the build machine?
I finally got it to work. These are the main changes I made since last time:
- Added nix.settings.trusted-users = ["user"]; to the build server's configuration.nix. This appears to be important for reasons covered here: https://github.com/NixOS/nix/issues/2789
- Added nix.extraOptions = "builders-use-substitutes = true" also to the build server, but I have no idea if (kinda doubt) this did anything useful. Mentioning this for completeness.
- Added require-sigs = false and changed builders in /etc/nix/nix.conf:
build-users-group = nixbld
extra-experimental-features = nix-command flakes
builders-use-substitutes = true
trusted-substituters = ssh://192.168.1.80
require-sigs = false
substituters = ssh://192.168.1.80
max-jobs = 0
builders = ssh://192.168.1.80 x86_64-linux - 10 2 benchmark,big-parallel,kvm,nixos-test - -
Things started working after that. (I can't guarantee I did nothing else important but I doubt it.) I don't think the root vs. regular user thing was the real issue. The real issue was maybe that I didn't have any trusted-users before (which would have been nix.settings.trusted-users = ["root"]; while I was still accessing root) but I'm too lazy to verify it now. Also require-sigs = false avoided some warnings about untrusted substituters, which was possibly very important. (The builders change was likely only a minor change to fix an error about big-parallel being missing for one of the packages.)
TODO room for improvement: Replace require-sigs = false with something that only trusts the cache.nixos.org public key (as the NixOS build server, or any Nix installation, does by default. It somehow makes sense that downstream machines, the laptop client machine in my case, also need to trust it if they want packages from it).
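One possible shape for that TODO, as an untested sketch: drop require-sigs = false on the laptop and list the cache.nixos.org signing key explicitly. This rests on an assumption I have not verified in this setup, namely that paths the desktop fetched from cache.nixos.org keep their signatures when the laptop copies them over SSH. Paths the desktop built itself carry no cache.nixos.org signature and would presumably still be rejected by the laptop's substituter.

```
# /etc/nix/nix.conf on the laptop (sketch, untested in this setup)
substituters = ssh://192.168.1.80
# true is already the default; written out only to make the intent explicit.
require-sigs = true
# The publicly documented cache.nixos.org signing key.
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
```

As with the other nix.conf changes in this thread, the daemon would need a restart to pick this up.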
Worth noting:
1. When the client machine shows you tons of lines like
building '/nix/store/zsydnl4207d2vaa9n8kzksccqvv37npq-foo-3.0.0.drv' on 'ssh://192.168.1.80'...
it will not display if the build server is actually building it or if it's downloading from a substituter (e.g. cache.nixos.org). To find out which, I kept bmon (network monitor) and top (CPU monitor) open at the same time on the build server. I was indeed noticing downloads (no top activity) when I was grabbing something cached online and real building (top activity and build machine fan noise) when building something custom.
2. The inability of NixOS clients to leave cache.nixos.org out of substituters (which I can do with Nix on Debian) remains a real issue - but it's a separate issue.
Yep, just started doing remote builds again recently and I'm seeing this again too.
edit: I don't mean to be too whiny, but after years, it's disappointing how little confidence I have in many scenarios around remote building.
And now I can't tell (after adding a regular "trusted-user") if it's working or not because it's still copying sources up. It would be great if there was a way to do a remote build with NIX_STORE=ssh-ng:// that acted like a local build and didn't do any copying.
EDIT: this is probably the result of me copying derivations to the remote, and the fact that some of my config uses IFD?
I'm still not able to get the remote builder to pull packages from the cache directly; it always goes through my local machine. Are there any exact steps, if anybody got it right?
EDIT: Turns out I didn't have substituters set up on the remote build machine; adding a nix-channel and updating fixed it.
I see this as well. This makes remote builds longer than building locally :s I've checked my local and remote configs and everything looks sensible to me, I have no idea why I have to push stuff from my machine :s It doesn't help that the nix-daemon is very quiet. Is there any flag to make it more verbose (--help is not that helpful either)?
Also the nix client says:
copying /nix/store/.... to ssh://builder
which is a bit ambiguous since you don't know if it's copying from cache or locally. bandwhich showed me that my machine was uploading to the builder.
I have two machines on my LAN:
My goals:
- ... cache.nixos.org when possible.
This is what happens on the laptop without remote building setup (i.e. without any ~/.config/nix/nix.conf and without any modifications to /etc/nix/nix.conf). I have truncated the output, but in fact everything is already available on cache.nixos.org so it gets downloaded from there.
Let's now set up remote building:
- ... ssh root@192.168.1.80 without password (using the normal SSH key stuff).
- ... /etc/nix/nix.conf (only the first line was there by default):
builders-use-substitutes = true asks the desktop to grab from cache.nixos.org if possible, max-jobs = 0 makes sure no building takes place on the laptop, and builders = ssh://192.168.1.80 - - 10 makes sure 10 cores are used on the desktop.
Note: On my laptop I ran sudo systemctl stop nix-daemon.service and sudo systemctl start nix-daemon.service but I'm not sure if that was needed. However, I noticed I indeed had to modify /etc/nix/nix.conf on my laptop. Modifying ~/.config/nix/nix.conf on my laptop wasn't causing the remote builder to get registered at all. (This is possibly a separate bug altogether.)
The problem:
Check out what happens now. As you can see, it's building things on the desktop machine! This is unexpected because as we already saw, cache.nixos.org should already have everything needed. (I have not modified the default substituters on the desktop NixOS machine.) It seems that builders-use-substitutes = true is not working properly.