NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Flake inputs fetched and unpacked despite outputs being a cache hit #9570

Open pwaller opened 10 months ago

pwaller commented 10 months ago

Is your feature request related to a problem? Please describe.

Flake input sources are fetched and unpacked even if they are unneeded. If you have lots of large flake input sources, this becomes a big bottleneck and resource consumer (wall time, CPU time, disk I/O, disk storage, network bandwidth and GitHub API calls) when fetching from a cache.

Describe the solution you'd like

Building a flake output which is a cache hit should not require fetching the input sources.

Describe alternatives you've considered

Additional context

Consider the following flake:

{
  # Needed for runCommandNoCC.
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  # Arbitrary old nixpkgs commit you're unlikely to have the sources for in your /nix/store directory.
  inputs.arbitrarySources.url = "github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37";
  outputs = { nixpkgs, arbitrarySources, ... }: {
    packages.x86_64-linux.default = nixpkgs.legacyPackages.x86_64-linux.runCommandNoCC "test" {} ''
      echo ${arbitrarySources}
      touch $out
    '';
  };
}

Note: arbitrarySources uses nixos/nixpkgs as an input, but if the default package is built or available via a substituter, the sources are no longer required (arbitrarySources is not a runtime dependency).

Expectation:

Problem:

Reproduction (whole block can be pasted including parentheses, runs in subshell with tracing switched on):

(
set -x
mkdir -p issue-9570 && cd issue-9570
cat > flake.nix <<'EOF'
{
  # Needed for runCommandNoCC.
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  # Arbitrary old nixpkgs commit you're unlikely to have the sources for in your /nix/store directory.
  inputs.arbitrarySources.url = "github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37";
  outputs = { nixpkgs, arbitrarySources, ... }: {
    packages.x86_64-linux.default = nixpkgs.legacyPackages.x86_64-linux.runCommandNoCC "test" {} ''
      echo ${arbitrarySources}
      touch $out
    '';
  };
}
EOF
# 1. First build fetches sources
time nix build path:.
# 2. Second build is a no-op (expected behaviour, also expected even if flake input source is missing)
time nix build path:.
# 3. Delete the sources (and the test.drv) from the store, working around known deletion issues by passing the referrers closure via stdin with keep-derivations disabled.
nix-store -q --referrers-closure $(nix flake archive --json path:. | jq -r .inputs.arbitrarySources.path) | nix-store --option keep-derivations false --delete --stdin
# 4. Third build should also be a no-op, since the output is still in the store; instead it fetches and unpacks the sources again (this is the problem).
time nix build path:.
# 5. Put state back how it was (roughly), by deleting the sources again so the reproducer can be run repeatedly; note that this also shows that arbitrarySources has been fetched again.
nix-store -q --referrers-closure $(nix flake archive --json path:. | jq -r .inputs.arbitrarySources.path) | nix-store --option keep-derivations false --delete --stdin
)

Reproduction output

+ mkdir -p issue-9570
+ cd issue-9570
+ cat
+ nix build path:.

real    0m4.789s
user    0m0.702s
sys 0m2.973s
+ nix build path:.

real    0m0.182s
user    0m0.098s
sys 0m0.058s
+ nix-store --option keep-derivations false --delete --stdin
++ nix flake archive --json path:.
++ jq -r .inputs.arbitrarySources.path
+ nix-store -q --referrers-closure /nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source
finding garbage collector roots...
deleting '/nix/store/bza7cl6wfgbbr47cbj7g1wxaq86lyxzm-test.drv'
deleting '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source'
deleting unused links...
note: currently hard linking saves 12503.42 MiB
2 store paths deleted, 117.31 MiB freed
+ nix build path:.

real    0m4.769s
user    0m0.698s
sys 0m2.936s
+ nix-store --option keep-derivations false --delete --stdin
++ nix flake archive --json path:.
++ jq -r .inputs.arbitrarySources.path
+ nix-store -q --referrers-closure /nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source
finding garbage collector roots...
deleting '/nix/store/bza7cl6wfgbbr47cbj7g1wxaq86lyxzm-test.drv'
deleting '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source'
deleting unused links...
note: currently hard linking saves 12503.42 MiB
2 store paths deleted, 117.31 MiB freed

What I expect to see

The above shows the following times to run nix build:

1. real 0m4.815s # OK
2. real 0m0.169s # OK
3. real 0m4.785s # Bad

I expect in the latter case that (3) should take as long as (2), not (1); the lost time in (3) is spent fetching and unpacking the sources.

nix build path:. --debug --verbose shows that the whole of nixpkgs is being unpacked in this scenario. It's not being downloaded again because the gzipped sources are also a cache hit (and not deleted by nix-store --delete). In my real-world scenario, where the package can be fetched from a substituter, I see eval hang while all the flake input sources are fetched and unpacked, which means waiting multiple minutes and consuming substantial resources.

I note that I've used nixpkgs as a stand-in here; I do not expect fixing the issue I've described to improve typical uses of nixpkgs very much, because those would actually involve eval'ing nixpkgs, whereas the scenario I describe only uses the flake inputs as a src attribute to mkDerivation; in this case, the sources are only necessary if the derivation needs to be built.

Priorities

Add :+1: to issues you find important.

pwaller commented 10 months ago

Attached a debug-verbose log: debug-verbose.log

There are two nixpkgs in the store:

  1. /nix/store/akd7khgf3bxk6ribvcigwq1adi9g8zi4-source (this is used for runCommandNoCC; I expect these sources do need to be present).
  2. /nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source (this is inputs.arbitrarySources).

In the debug log, if the gzip is in the cache, we see, e.g.

using cache entry '{"name":"source","type":"file","url":"https://github.com/nixos/nixpkgs/archive/49e5e473182a44fd0cd9048e4a3a99ba1d47da37.tar.gz"}' -> '{"etag":"W/\"4c8866242df8aefa9e4d2f8c5cf54a7572504d57978f35d43b63554d6520e402\"","url":"https://codeload.github.com/NixOS/nixpkgs/tar.gz/49e5e473182a44fd0cd9048e4a3a99ba1d47da37"}', '/nix/store/wkwfbih24zqq6w853wfqz1xf6r1isvy6-source'
performing daemon worker op: 7
locking path '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source'
lock acquired on '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source.lock'
lock released on '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source.lock'
performing daemon worker op: 26
checking access to '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source/flake.nix'
evaluating file '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source/flake.nix'

This indicates that something wants to eval the flake.nix for arbitrarySources.

Introducing inputs.arbitrarySources.flake = false; does not appear to help: the debug log still says evaluating file '/nix/store/ijny9v749dpicbcgvx6iwxk317dzsybs-source/flake.nix'.

(In my real scenario, I use inputs.*.flake = false for the majority of my inputs, as well).

pwaller commented 10 months ago

I've tested #6530. Unfortunately, that appears to make the situation worse: it takes 14s instead of 4s to pull in the sources (where it shouldn't need them at all).

Key elements from the log show that flake.nix is still being evaluated from arbitrarySources even though I've set .flake = false;.

Key log output from that branch (575902bcbf57b8208ee1ed6544fed3887f3860e6).

evaluating file '«github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37»/flake.nix'
copying '«github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37»/' to the store...

The reproducer in this ticket still reproduces the problem on that branch.

pwaller commented 10 months ago

I've found a mistake in the above analysis for the source tree abstraction: it turns out call-flake.nix uses the flake lock file to determine if node.flake is false. I had set arbitrarySources.flake = false; but this had not propagated to the lock file. Deleting the lock file and recreating it had the desired effect. This doesn't appear to help inputs that are flakes, though: those still try to evaluate flake.nix here, where I believe what is needed is merely a path.

https://github.com/NixOS/nix/blob/c8458bd731eb1c74159bebe459ea00165e056b65/src/libexpr/flake/call-flake.nix#L21

I see that "${arbitrarySources}" is evaluated by rendering arbitrarySources.outPath into a string, where the outPath comes from here:

https://github.com/NixOS/nix/blob/c8458bd731eb1c74159bebe459ea00165e056b65/src/libexpr/flake/call-flake.nix#L58

The missing primitive is one that would enable evaluating this outPath for the purposes of string interpolation without importing the flake.nix (via outputs -> flake.outputs -> import (outPath + "/flake.nix")). I'm not aware of whether such a primitive currently exists in the language. This would be needed so that the flake can behave both as flakes currently do (with all of their outputs defined on them), but also string interpolate without requiring that they get fetched.

pwaller commented 10 months ago

~I'm able to get the behaviour I want with the following flake - using fetchTree directly on the flake.lock json.~

{
  # Needed for runCommandNoCC.
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  # Arbitrary old nixpkgs commit you're unlikely to have the sources for in your /nix/store directory.
  inputs.arbitrarySources.url = "github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37";
  outputs = { nixpkgs, ... }: let
    flakeLock = builtins.fromJSON (builtins.readFile ./flake.lock);
    arbitrarySourcesOutPath = fetchTree flakeLock.nodes.arbitrarySources.locked;
  in {
    packages.x86_64-linux.default = nixpkgs.legacyPackages.x86_64-linux.runCommandNoCC "test" {} ''
      echo ${arbitrarySourcesOutPath}
      touch $out
    '';
  };
}

~This allows me to get the arbitrarySourcesOutPath; building this for a second time does not require the sources present.~

Edit: After further experimentation with the above, I'm confused. This does still appear to unpack the sources, so I must have been mistaken when I wrote this previously; at least, I can't reproduce that result now. If the sources are missing but the build output is present, I still see the sources being fetched.

However, I have noticed that the #6530 branch (Source tree abstraction) still copies the sources to the store with the above flake:

copying '«github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37»/' to the store

The performance is quite a bit worse than the master branch. It takes 14 seconds to copy the nixpkgs to the store, where the pre-#6530 code took 4 seconds.

perf record shows that 76% of the time (10s) is spent in nix::SourceAccessor::dumpPath -> nix::SourceAccessor::lstat -> nix::GitInputAccessor::lookup -> git_tree_entry_bypath.

For the above case I would hope that it wouldn't materialise the tree at all when the default package has been built.

pwaller commented 10 months ago

I think maybe I misunderstood something about how fetchTree works. I see from the recent documentation (#9258) that fetchTree fetches the requested tree when it's called. I had thought/hoped/assumed it would work as fetchFromGitHub and friends do: the sources there aren't required until they're used.

Thinking aloud: Presumably the mechanism behind fetchFromGitHub lazily fetching packages works because the fetching of a source lives in a separate derivation; and something using the path of the fetched source then gets a dependency on the separate derivation. And only if it's necessary to run the build does the source get fetched. These concepts aren't applicable for the fetchTree builtin, which fetches during evaluation, not while derivations get built?
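The distinction described above can be sketched roughly as follows. This is an illustration only, not code from the thread: it assumes nixpkgs is reachable as `<nixpkgs>`, that `builtins.fetchTree` is enabled in the Nix version in use (it may require an experimental feature), and it uses `lib.fakeHash` as a placeholder that would have to be replaced with the real hash before the fetch could succeed.

```nix
# Sketch only: contrasts an evaluation-time fetch with a build-time fetch.
# Assumes <nixpkgs> is on NIX_PATH; the hash below is a placeholder.
let
  pkgs = import <nixpkgs> { };
  rev = "49e5e473182a44fd0cd9048e4a3a99ba1d47da37";

  # Evaluation time: as soon as this value is forced, the evaluator itself
  # fetches and unpacks the tree, before any derivation is built.
  # It is deliberately left unused here, so laziness means it is never forced.
  evalTimeSrc = builtins.fetchTree {
    type = "github";
    owner = "nixos";
    repo = "nixpkgs";
    inherit rev;
  };

  # Build time: this is an ordinary fixed-output derivation. Evaluating it
  # only computes a store path; the download happens when it is realised.
  buildTimeSrc = pkgs.fetchFromGitHub {
    owner = "nixos";
    repo = "nixpkgs";
    inherit rev;
    hash = pkgs.lib.fakeHash; # placeholder, replace with the real hash
  };
in
pkgs.runCommand "test" { } ''
  echo ${buildTimeSrc}
  touch $out
''
```

Interpolating `buildTimeSrc` only records a dependency on the fetcher derivation, so if the `test` output is already present locally or in a substituter, nothing should need to realise the source. Interpolating `evalTimeSrc` instead would force the fetch during evaluation, which is the behaviour observed in this issue.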

So I take it then that I need to use a non-builtin fetchTree primitive if I want to get the effect I'm after.

tomberek commented 10 months ago

Likely related:

pwaller commented 10 months ago

Thanks for the link @tomberek, that sounds like exactly what I want, assuming the credentials issue can be sorted out.

I've actually been able to work around this for my use case, but I have to invoke 'nix flake archive' in order to populate the store (/substituters) with the sources.

After that, I can switch to the nixpkgs fetchers and feed them with the information from the lockfile via builtins.fromJSON. This gives a much much much improved experience.
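A minimal sketch of what such a workaround might look like, based on the reproducer flake. It is not the author's actual flake and rests on several assumptions: that the lock file node is keyed `arbitrarySources` and is of type `github` (so `locked` carries `owner`, `repo`, `rev` and `narHash`), that the locked `narHash` matches the hash `fetchFromGitHub` computes for the unpacked tree, and that `runCommand` is an acceptable replacement for `runCommandNoCC`.

```nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  # Kept as an input only so that flake.lock pins it and
  # `nix flake archive` can copy it to the store or a substituter.
  inputs.arbitrarySources = {
    url = "github:nixos/nixpkgs/49e5e473182a44fd0cd9048e4a3a99ba1d47da37";
    flake = false;
  };
  # Note: arbitrarySources is intentionally not bound in the argument pattern.
  outputs = { nixpkgs, ... }: let
    pkgs = nixpkgs.legacyPackages.x86_64-linux;
    lock = builtins.fromJSON (builtins.readFile ./flake.lock);
    # Assumption: the node name equals the input name and has type "github".
    locked = lock.nodes.arbitrarySources.locked;
    # Assumption: narHash (SRI format) is valid as the fetcher's output hash.
    arbitrarySourcesSrc = pkgs.fetchFromGitHub {
      inherit (locked) owner repo rev;
      hash = locked.narHash;
    };
  in {
    packages.x86_64-linux.default = pkgs.runCommand "test" {} ''
      echo ${arbitrarySourcesSrc}
      touch $out
    '';
  };
}
```

Because the source is now a fixed-output derivation rather than a value fetched during evaluation, a cache hit on the default package should not require the sources to be present. Whether Nix still touches the unused input during evaluation may depend on the Nix version.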

nixos-discourse commented 9 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-disable-automatic-unpacking-of-nix-flake-inputs/36911/2

lf- commented 8 months ago

Hi. The current correct way to do this in the case of build dependencies is to do it the same as nixpkgs and use a fixed output derivation.

Realistically you probably want to move those particular build-time-only sources out of flakes altogether and use something like niv, npins, or gridlock for a second lock file for those specific things (if using niv or npins, throw out the Nix code they give you because it uses builtin fetchers, and just use their JSON). See https://jade.fyi/blog/flakes-arent-real/