NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

`error: null pointer cast to ref` when using nix-daemon from 2.0 #1926

Closed. Opened by knedlsepp 6 years ago; closed 3 months ago.

knedlsepp commented 6 years ago

I have a single-user installation of nixStable2 from nixos-17.09 for a non-root user A, with a nix-daemon running as user A. As user B, I then run source ~A/.nix-profile/etc/profile.d/nix-daemon.sh and try to run a nix-env command. I get the following error:

error: null pointer cast to ref
error: unexpected EOF reading a line

This setup used to work with nix-1.11.

dtzWill commented 6 years ago

I'm not sure this is a supported configuration, but if there isn't one already, adding options to control/specify socket permissions seems reasonable.

I think here we just don't produce a nice error message in this case, I'll poke at reproducing this in a bit.

shlevy commented 6 years ago

This should be supported.

dtzWill commented 6 years ago

Hmm well easy answer of socket permissions didn't pan out ("hooray" that such problems are nicely handled, though :)).

And unfortunately I'm unable to reproduce.

One observation: Where does nix-env come from when executed by user B?

Consider using a proper multi-user installation? Sorry I don't have more helpful advice!

If you're determined to make this work, maybe provide some more information regarding where the failure occurs? Might be useful to run nix-daemon with strace and post the output generated when the error occurs?

No promises we'll get your setup working, but more information might help in case someone decides they want to try to investigate :).

dtzWill commented 6 years ago

> This should be supported.

Really? Okay :innocent: . Using single-user install from other users seemed unusual, but I'll take your word for it. Wonder where the null pointer cast happens? Hopefully we can get it in a stacktrace or something...

knedlsepp commented 6 years ago

I'm using nix to get reproducible builds in a scientific environment on an HPC at our company. Been using nix for quite some time at home now and I'm fed up with the "environment modules" in the typical HPC setup. A multi-user installation is no option sadly, but I'm quite determined to get a working setup using nix 2.0, since features such as "nix search", "nix run" and using builtins.fetchGit without the sha256-checksum are quite the selling points. I'll try to give more information on this topic asap.

edolstra commented 6 years ago

What nix-env command fails?

knedlsepp commented 6 years ago

It seems all commands that trigger builds are affected. When a package is already in the store, nix-env -iA PKG works, but not if it needs to be built. When the package can be downloaded from the binary cache, it also works just fine. E.g.:

nix-shell -p "(with import <nixpkgs> {}; pythonPackages.matplotlib.overrideAttrs(o: { doCheck=true;}))" --debug

yields:

...
building of '/nix/store/ni7nfwmb3vya4i75fq601qqw5gvfha24-python2.7-matplotlib-2.0.2.drv': woken up
building of '/nix/store/ni7nfwmb3vya4i75fq601qqw5gvfha24-python2.7-matplotlib-2.0.2.drv': trying to build
locking path '/nix/store/jijq8chkgfhgmnafl3b0bsgr9nllchn9-python2.7-matplotlib-2.0.2'
lock acquired on '/nix/store/jijq8chkgfhgmnafl3b0bsgr9nllchn9-python2.7-matplotlib-2.0.2.lock'
removing invalid path '/nix/store/jijq8chkgfhgmnafl3b0bsgr9nllchn9-python2.7-matplotlib-2.0.2'
starting build hook '/nix/store/r46l7p43ky5vln2959dgmzc359k7csaw-nix-2.0/libexec/nix/build-remote'
error: null pointer cast to ref
building of '/nix/store/fli86msvg9y3brh4zqkh7wwc6lzlffig-stdenv.drv': goal destroyed
building of '/nix/store/nq559zxavahn5dsp0jmg0q8q2jkcq1jc-bash-4.4-p12.drv': goal destroyed
building of '/nix/store/2kzzy6hfxbx52c3kpxr9jp3v59wdgch5-bash-4.4-p12.drv': goal destroyed
substitution of '/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh': goal destroyed
lock released on '/nix/store/jijq8chkgfhgmnafl3b0bsgr9nllchn9-python2.7-matplotlib-2.0.2.lock'
building of '/nix/store/ni7nfwmb3vya4i75fq601qqw5gvfha24-python2.7-matplotlib-2.0.2.drv': goal destroyed
killing process 7456
error: unexpected EOF reading a line

jcumming commented 6 years ago

Just ran into this. A quick revert back to nix-1.11.16 is a workaround for now.

dtzWill commented 6 years ago

A quick review of this issue suggests (correct me if I'm wrong, folks) that nobody who is willing and able to fix this has been able to reproduce it. If you're running into this, any details you can share or other help reproducing it would be appreciated! We would like to avoid users having to revert, but that's very hard to do if we can't reproduce the problem.

Sorry you're having issues, please help us reproduce if you can! :)

jcumming commented 6 years ago

Here is my configuration:

Then building anything results in:

$ nix-shell -p hello
these derivations will be built:
  /tmp//store/cb2h62kxrqkzm8a8zyg314c25xpa420i-hello-2.10.tar.gz.drv
  /tmp//store/0xq1vbwr3gfb05fi72j2q3i9d023sl3f-hello-2.10.drv
building '/tmp//store/cb2h62kxrqkzm8a8zyg314c25xpa420i-hello-2.10.tar.gz.drv'...

trying http://tarballs.nixos.org/sha256/0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  708k  100  708k    0     0  1561k      0 --:--:-- --:--:-- --:--:-- 1561k
error: null pointer cast to ref
error: unexpected EOF reading a line

What additional information can I get for you?

ghost commented 6 years ago

I'm hitting this as well. My configuration is the same as @jcumming but without the nix.conf for the nix-daemon and the hydra.

edolstra commented 6 years ago

I couldn't reproduce this. I ran

NIX_STATE_DIR=~/my-nix/nix/var nix-daemon --allowed-users '@users' --store ~/my-nix/

using Nix master, and

NIX_REMOTE=daemon NIX_STATE_DIR=/home/eelco/my-nix/nix/var nix-shell -p hello

using Nix 1.11.16.

knedlsepp commented 6 years ago

I think I found something by digging a little deeper: it's NIX_REMOTE=daemon nix-daemon that's causing the failure. A NIX_REMOTE="" nix-daemon seems to work for me. This, however, worked with 1.11.16 without unsetting NIX_REMOTE. In my setup I used to source ~A/.nix-profile/etc/profile.d/nix-daemon.sh (which in turn sets NIX_REMOTE=daemon) not only for the user running nix-shell, but also for the user running the nix-daemon.
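
A minimal sketch of starting the daemon with NIX_REMOTE emptied, as described in this comment. It is only an illustration: the function name with_empty_nix_remote is hypothetical, and it assumes nix-daemon is on the PATH of the user who runs the daemon.

```shell
# Hypothetical helper (not part of Nix): run a command with NIX_REMOTE
# forced to the empty string, even if the calling shell exported
# NIX_REMOTE=daemon by sourcing nix-daemon.sh.
with_empty_nix_remote() {
    env NIX_REMOTE= "$@"
}

# Intended use for the daemon user (assumes nix-daemon is on PATH):
#   with_empty_nix_remote nix-daemon
```

Because the override is applied only to the child process, the calling shell keeps whatever NIX_REMOTE value it had, so client commands run from the same shell would still go through the daemon.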

knedlsepp commented 6 years ago

Reading through: https://github.com/NixOS/nix/blob/master/scripts/nix-profile-daemon.sh.in#L7-L9 I'm actually not sure if this script is supposed to work in a non-root setup, since it implicitly assumes that nix-daemon is running as root, but as it used to work fine on 1.11, I'd still love for nix-2.0 to be able to run the nix-daemon even if I source nix-daemon.sh.
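
One possible way to handle a non-root daemon user from the sourcing side is to decide per user whether NIX_REMOTE=daemon should be exported. The sketch below is hypothetical and is not the content of nix-profile-daemon.sh.in; the function name nix_remote_for and the use of write access to the Nix database directory as the test are assumptions made for illustration.

```shell
# Hypothetical helper: print the NIX_REMOTE value a user should get.
# Assumption: whoever can write to the Nix database directory is the
# user running nix-daemon and should access the store directly;
# everyone else should go through the daemon.
nix_remote_for() {
    db_dir=$1
    if [ -w "$db_dir" ]; then
        printf '%s\n' ""
    else
        printf '%s\n' "daemon"
    fi
}

# Example (the path is an assumption; adjust it to the actual state dir):
#   NIX_REMOTE=$(nix_remote_for /nix/var/nix/db); export NIX_REMOTE
```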

unode commented 5 years ago

Hi all, I have pretty much the same setup as @jcumming minus the hydra setup. In my case I get the error with the same user running nix-daemon as long as NIX_REMOTE="daemon". If I unset NIX_REMOTE it works but bypasses nix-daemon entirely. The same error happens from other users where NIX_REMOTE is also set to "daemon".
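
Since the behaviour described here differs between NIX_REMOTE being unset, empty, and set to "daemon", it may help to record which state a shell is in when reporting results. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper for bug reports: describe the NIX_REMOTE state of
# the current shell, distinguishing "unset" from "set but empty".
describe_nix_remote() {
    if [ -z "${NIX_REMOTE+x}" ]; then
        echo "NIX_REMOTE is unset"
    elif [ -z "$NIX_REMOTE" ]; then
        echo "NIX_REMOTE is set but empty"
    else
        echo "NIX_REMOTE=$NIX_REMOTE"
    fi
}
```

Running it in both the shell that starts nix-daemon and the shell that runs the client command shows whether the two sides see the same value.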

Not sure if this is related but using high verbosity, nix/build-remote is mentioned just before the error. I tried calling build-remote directly and I get:

% /share/nix/store/g9mf64cmj1bw0sm2l3d7p252y1q3m5fv-nix-2.0.4/libexec/nix/build-remote       
@nix {"action":"msg","level":0,"msg":"\u001b[31;1merror:\u001b[0m called without required arguments\nTry '/share/nix/store/g9mf64cmj1bw0sm2l3d7p252y1q3m5fv-nix-2.0.4/libexec/nix/build-remote --help' for more information."}

trying to run the suggested command gives me:

% /share/nix/store/g9mf64cmj1bw0sm2l3d7p252y1q3m5fv-nix-2.0.4/libexec/nix/build-remote --help    
@nix {"action":"msg","level":0,"msg":"\u001b[31;1merror:\u001b[0m stoll"}

Could this be causing problems?

Also on an unrelated note, in some situations when I run htop installed through nix, it fails to resolve user/group IDs (network db) to names (and defaults to displaying only the uid/gid). Could this be happening as well in nix causing allowed_groups=@group to fail to resolve (guessing here)?

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

roberth commented 3 months ago

Closing as stale after 5 years.

These versions are ancient by now, and the issue has probably been fixed, or the bug died with the versions it inhabited. Nonetheless, please open a new issue if you encounter something like this again.