NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Evaluating nixpkgs takes a huge amount of RAM #7308

Closed tobiasBora closed 1 year ago

tobiasBora commented 1 year ago

Describe the bug

All of the following commands take a lot of RAM (around 1 GB judging from htop, although time reports about 500 MB) and a lot of time (about 10 minutes, and even longer once the system starts to swap because the RAM is full…) to evaluate on a Raspberry Pi 3B, even when they are a no-op (i.e. when the current system is already running this version):

$ nixos-rebuild switch --no-flake
$ nixos-rebuild switch --flake .
$ nix flake lock --update-input …

As far as I can see, this is not noticeable on my main laptop… no idea why.

I'm also confused: sometimes nix run nixpkgs#… takes a really long time, and sometimes it takes no time at all.

$ nix run nixpkgs#time # takes a lot of time
$ nix run nixpkgs#hello # takes no time (maybe nixpkgs#time has done some caching?)

Steps To Reproduce

  1. Start NixOS on the Raspberry Pi 3 (you may need to add a swap file, or it will freeze when no more RAM is available)
  2. Run something like this twice (tested on )
    # env time -v sudo nixos-rebuild switch --flake github:cwi-foosball/foosball#foosballrasp

I also tried the non-flake version and got similar results:

# /nix/store/jw4jjw6ml5vymjw0yhqg1i9dln12g9k4-time-1.9/bin/time -v sudo nixos-rebuild switch -I nixos-config=configuration.nix -I "nixpkgs=/nix/store/7mffl2hq695yjvgh18vgrpqqn9cr2i1f-source" --no-flake
building Nix...
building the system configuration...
activating the configuration...
setting up /etc...
reloading user units for pi...
setting up tmpfiles
    Command being timed: "sudo nixos-rebuild switch -I nixos-config=configuration.nix -I nixpkgs=/nix/store/7mffl2hq695yjvgh18vgrpqqn9cr2i1f-source --no-flake"
    User time (seconds): 96.84
    System time (seconds): 54.42
    Percent of CPU this job got: 30%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 8:22.45
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 504280
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 97086
    Minor (reclaiming a frame) page faults: 873927
    Voluntary context switches: 173974
    Involuntary context switches: 278897
    Swaps: 0
    File system inputs: 2584120
    File system outputs: 428560
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
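
As a cross-check of the figures above, here is a minimal sketch that recomputes two of them from the raw numbers. It assumes that GNU time derives "Percent of CPU" as (user + system) / elapsed and that the resident set size is reported in KiB; the function names cpu_percent and kib_to_mib are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers; all input values are copied from the time -v output above.

# CPU utilisation: (user + system) / elapsed wall-clock time, truncated to an integer.
cpu_percent() {
  # $1 = user seconds, $2 = system seconds, $3 = elapsed seconds
  awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN { printf "%d\n", (u + s) / e * 100 }'
}

# Convert a size in KiB to MiB, rounded to the nearest integer.
kib_to_mib() {
  awk -v k="$1" 'BEGIN { printf "%.0f\n", k / 1024 }'
}

# 8:22.45 elapsed is 8 * 60 + 22.45 = 502.45 seconds.
cpu_percent 96.84 54.42 502.45   # matches the reported 30%
kib_to_mib 504280                # maximum resident set size, about 492 MiB
```

A CPU share of about 30%, together with roughly 97,000 major page faults, would be consistent with the process spending much of its wall-clock time waiting on I/O (swap or storage) rather than computing, although the output alone does not prove that.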

Expected behavior

I expect nix to take close to no RAM/time, especially for a no-op operation.

nix-env --version output

2.11.0

roberth commented 1 year ago

I'm also confused: sometimes nix run nixpkgs#… takes a really long time, and sometimes it takes no time at all.

$ nix run nixpkgs#time # takes a lot of time
$ nix run nixpkgs#hello # takes no time (maybe nixpkgs#time has done some caching?)

I can think of some effects.

  1. The mutable flake cache. nixpkgs is a mutable flakeref. It may have been downloaded the first time and cached on the second.

  2. Instantiation. Nix creates a bunch of .drv files. If they already exist, this process may be faster.

  3. Evaluation cache. This is basically per installable, and it doesn't cache any dependencies. If we make nixpkgs take its dependencies from itself, this PR might help: https://github.com/NixOS/nix/pull/4511
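
One rough way to see whether effects like these are in play is to run the same command twice and compare the wall-clock times. The sketch below is only an illustration: time_twice is a hypothetical helper, it measures with one-second resolution, and it cannot tell the three effects apart; it only shows whether a second run benefits from something the first run left behind.

```shell
#!/bin/sh
# Hypothetical helper: run the given command twice and report the elapsed
# wall-clock seconds and exit status of each run.
time_twice() {
  for attempt in 1 2; do
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "run $attempt: $((end - start))s (exit $status)"
  done
}

# Example with a harmless command; replace it with the command under test.
time_twice true
```

For the case discussed here, the call would be, for example, time_twice nix run nixpkgs#hello. If the second run is much faster than the first, some form of caching is probably involved; if both runs are equally slow, the time is likely going into evaluation that is not cached.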

I expect nix to take close to no RAM/time, especially for a no-op operation.

(2) and (3) may be improved, perhaps.

Improving memory usage is by no means easy. By default, a large number of derivations must be evaluated, and cache invalidation is hard ;). Nix expressions do tend to hold on to more values and thunks than you might expect, but these are usually required for computations that could be done but won't be. It'd be interesting to see how this could be improved by significantly changing the interpreter, but this would be a research project, not an easy fix. That said, we've accepted a number of performance improvements since the 2.4 release (thanks pennae!), and perhaps some fresh eyes (or more of the same eyes) could find more such incremental improvements.

tobiasBora commented 1 year ago

Thanks a lot for the answer. I also did some tests to see whether the problem comes from flakes or from nix itself: when I run nixos-rebuild switch twice on a system without flakes, it also takes about 10 minutes both times, even with no changes. So I guess it takes time to evaluate all the NixOS modules, even if nearly all of them are disabled. I don't know if it would be possible somehow to avoid evaluating useless modules.

roberth commented 1 year ago

if it would be possible somehow to avoid evaluating useless modules.

There's https://github.com/NixOS/rfcs/pull/22, which I think is also good for actual modularity. There's also potential for a workaround, although that would tend to make the already complicated module system even more complicated.

roberth commented 1 year ago

https://github.com/NixOS/nix/issues/8621 is a duplicate of this, but I'm closing this one, as most of the conversation here is about specific observations that do not help with solving the problem. Any further conversation can continue in the fresh issue.