Open edolstra opened 5 years ago
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/flakes-without-git-copies-entire-tree-to-nix-store/10743/2
Still important.
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/my-painpoints-with-flakes/9750/20
It would be nice if Nix could take advantage of the filesystem's native CoW functionality (if present) in order to speed up copying. We discussed this briefly in #offtopic:nixos.org.
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/is-it-possible-to-make-a-flake-that-has-no-source-tree/16037/2
I just hit this. My first attempt at a workaround was to remove the self arg from outputs (as without self I can't access the source tree at all). It turns out that makes it copy the tree and then throw an error about how the outputs function doesn't take a self arg.
Lazily copying the flake only when outPath is evaluated would be ideal, but being able to just drop the self arg to suppress the copy would be a great first step.
@L-as Taking advantage of CoW would be nice but it's not doable on macOS where the Nix store lives on a separate volume (separate volume group even).
Also for context, in my case the flake was not in a git repo, it was just in a folder. Copying to the nix store is unacceptable because the folder contains multiple git repos along with all their build artifacts. Copying a git repo to the Nix store at least would avoid copying untracked files, but in my case it had hundreds of thousands of files and multiple gigabytes of data to look at and copy.
@lilyball @L-as Taking advantage of CoW doesn't work on Linux either due to a VFS limitation: https://github.com/NixOS/nix/issues/5513
One thing I'd like to understand in this issue is why a local flake can't be evaluated "directly" just like the old default.nix-style file evaluation.
I know copying has benefits for hermetic evaluation and such but I don't need that, like, at all.
Sure, remote flakes should be copied to the Nix store and that's really great functionality but I see no point whatsoever in doing the same for local flakes that are already in the FS and not expected to change without the user's knowledge.
@Atemu from what I've read, it's to help enforce hermetic evaluation and avoid impurities. Presumably it also has advantages for code simplicity, because you don't need to write something separate for local flakes.
I agree it's not great UX for those of us who use flakes just to keep track of a dev shell, of course :)
it's to help enforce hermetic evaluation and avoid impurities.
And that's great but I don't see any point in hermetic eval on local files.
it's not great UX for those of us who use flakes just to keep track of a dev shell
It's also bad UX for anyone working on nix-built projects.
Correct me if I'm wrong here, but if I'm hacking on Nixpkgs to solve some bug in NixOS with dirty trees (because obviously, I'm hacking), Nix copies the entire 313MiB Nixpkgs checkout to the Nix store every time I eval.
Not only does that take quite a while (even on an SSD it's multiple seconds) but it also causes unnecessary writes. After 70 Nixpkgs evals, you've exhausted the expected daily writes to an SSD. That can't be good for endurance.
Is it just me or is that insane?
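The write estimate in the comment above can be checked with a short shell sketch. The helper name total_copy_mib is hypothetical; the 313 MiB checkout size and the 70 evals are the figures quoted in the comment, and the sketch assumes every eval copies the whole tree.

```shell
# Hypothetical helper: total MiB written if every eval copies the whole tree.
total_copy_mib() {
  local tree_mib=$1 evals=$2
  echo $(( tree_mib * evals ))
}

# Figures from the comment above: a 313 MiB checkout, 70 evals.
total_copy_mib 313 70   # prints 21910 (MiB), i.e. roughly 21.4 GiB
```

That is about 21.4 GiB of writes, the amount the comment treats as a drive's expected daily writes.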
but I don't see any point in hermetic eval on local files.
You might not realize you're using local files, accidentally sneak in state, and then be surprised when it doesn't evaluate in deployment (and be all "wait, isn't nix supposed to prevent this?"). Even with fully local files, I'd expect things still to work if I move my directory to a new computer from a restored backup. While I've personally learned when and where local state might happen, it's still a safety net that I consider nice to have.
Of course, giant copies for the tiniest delta is way too much of a cost to incur for that, but this is why we're here - to make sure that flakes don't blow up SSDs all over the place when they finally become non-experimental ;)
You might not realize you're using local files, accidentally sneak in state, and then be surprised when it doesn't evaluate in deployment (and be all "wait, isn't nix supposed to prevent this?").
I don't understand what you mean by that.
How is copying the accidentally added state over to the Nix store first and then evaling it any better than just evaling it directly?
Even with fully local files, I'd expect things still to work if I move my directory to a new computer from a restored backup. While I've personally learned when and where local state might happen, it's still a safety net that I consider nice to have.
How is the location of the directory related to any of this? A direct eval of the same state of a directory in another location will have the same result. How should copying improve anything?
IIRC files that are tracked with git already (and changed) are being staged and then copied to the store. I can see how this ensures that at least the files are tracked and marked as updated (by staging them). I also kind of agree that I think this is the wrong solution to the problem, or perhaps a solution in search of a problem? Most of the time it is very expensive to copy my working directory into the store.
Since I can see why that feature is useful, I'd argue that it should be configurable if you want your flake repos to be copied to the store or not. As far as I know, the hash of the path that is added to the store is also currently used for the eval caching.
Perhaps the current implementation is a nice PoC of what a more proper hermetic eval could look like and what it gives us in terms of capabilities (caching, ...).
I can see how this ensures that at least the files are tracked and marked as updated (by staging them).
That sounds like a sound reason but I can't see how that wouldn't just be possible with direct eval too.
@edolstra could we get some insight from you here?
It's definitely possible, just more work as described in the initial post:
However, we also need to ensure that it's not possible to access untracked files (i.e. we need to check every file against git ls-files).
Nix already has a "eval may only access these store paths" logic, but no "may only access tracked files of this Git checkout" logic yet, so using the former was the simplest solution I assume.
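As a rough illustration of the "check every file against git ls-files" idea, here is a minimal shell sketch. The helper name is_listed is hypothetical, and it only tests membership in a newline-separated list read from stdin. It is not how Nix implements or would implement the check, and it ignores paths that contain newlines.

```shell
# Hypothetical helper: succeed only if the given path appears as a whole line
# in the newline-separated list of tracked files read from stdin
# (for example the output of `git ls-files`).
is_listed() {
  grep -Fxq -- "$1"
}

# Assumed usage inside a Git checkout:
#   git ls-files | is_listed flake.nix && echo "tracked"
```

Matching whole lines with fixed strings (-F -x) avoids treating a path such as flake as a match for flake.nix.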
Another important point @rnhmjoj mentioned in Discourse is security. A user can easily unknowingly expose private/secret information globally on a system by building a local flake.
However, we also need to ensure that it's not possible to access untracked files (i.e. we need to check every file against git ls-files)
I guess this could also be implemented by creating a shallow copy of the flake directory (by creating a forest of symlinks to the original source tree rather than really copying it). That could already make things notably faster (not entirely free, but cheap-enough in most cases), and might be simpler to implement.
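The symlink-forest idea above could be sketched in shell roughly as follows. The helper name link_forest and the example paths are hypothetical; the sketch only builds the forest from a list of relative paths on stdin and says nothing about how Nix itself would consume such a tree.

```shell
# Hypothetical helper: mirror SRC into DST as a forest of symlinks, creating
# one link per relative path read from stdin (one path per line).
link_forest() {
  local src dst rel
  src=$(cd "$1" && pwd) || return 1   # absolute path of the source root
  dst=$2
  mkdir -p "$dst" || return 1
  while IFS= read -r rel; do
    mkdir -p "$dst/$(dirname "$rel")" || return 1
    ln -s "$src/$rel" "$dst/$rel" || return 1
  done
}

# Assumed usage: expose only the files Git tracks.
#   git -C ~/project ls-files | link_forest ~/project /tmp/project-view
```

Only metadata is written (directories and links), so the cost scales with the number of files rather than their size, and anything missing from the list is simply absent from the view.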
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/is-nix-2-4-significantly-slower/16218/3
Linking here yesterday's chat, which I think is relevant.
tl;dr: IMHO nix develop should be the exception: impure and not-in-nix-store by default. Just like nix-shell.
I believe local impure flakes are also very useful if you're constantly editing a file that is in your repo but you don't want the flake to be reevaluated and the file constantly copied to the store. Eventually, when you're done, you can just drop the --local flag (which could imply --impure as well, I guess) or something. One use case I have in mind is having a flake where a python package is in scope but is also editable. I can drop into an appropriate shell with nix develop but it requires the path for the package, which is a relative path. Of course, for hermetic evaluation you want the whole package source to be copied to the store, but I want to edit the package as I develop and use it at the same time. This means I either have to hardcode the path of my local package to the absolute path in my system, which makes things very non-reproducible, or I have to update the flake inputs constantly, which keeps copying the thing to the store.
What @bmabsout just said precisely outlines the whole reason I still haven't adopted flakes yet.
Hermetic eval is very useful for general building etc. but not when I'm in the middle of hacking on things. Nix flakes need to offer the same speed and convenience that e.g. nixos-rebuild -I nixpkgs=... -I nixos-config=... provides.
I also just encountered this problem. We store a large amount of data (several TBs) with git annex (see our project at https://github.com/umd-lhcb/lhcb-ntuples-gen). Today we just annexed ~100 GB of new data (so the local repo size grows to around 100 GB, without downloading any other previously annexed data) and it took a whopping 8 minutes to finish a nix develop command, without any changes in flake.nix.
If I'm reading correctly, reverting back to a nix-shell based approach with flake-compat would mitigate our problem until the lazy copy lands. Is that right?
git-annex and LFS are an interesting case here. Should large files be available in flake eval?
@yipengsun Yes, that's correct.
Even outside nix develop, this seems highly problematic. Couldn't you make use of Git's information to detect what has changed? We have a Merkle tree of the files after all.
@L-as This behavior being problematic is why this issue exists...
git-annex and LFS are an interesting case here. Should large files be available in flake eval?
I'd say no unless the flake itself is used as an output. We currently have no such use case, but it would be nice to have something like a .flakeignore file to explicitly forbid copying files in certain paths.
I did a bit more investigation, and found out the slowness of nix develop was due to us accidentally adding large files directly to git, and copying these files took a long time.
Also, I tried to set up a minimal flake repo to test the availability of the annexed files:
flake.nix:
{
  description = "test";
  inputs = {
    nixpkgs.url = "nixpkgs/nixpkgs-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };
  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs { inherit system; };
      in
      {
        devShell = pkgs.mkShell {
          name = "test-git-annex";
          buildInputs = with pkgs; [
            git-annex
          ];
        };
      }
    );
}
I generated a large file (~100 MB) with dd, then first added it with git annex add and ran nix develop.
After that, I inspected the /nix/store:
❯ ls -l
total 18
-r--r--r-- 3 root root 1001 Dec 31 1969 flake.lock
-r--r--r-- 4 root root 538 Dec 31 1969 flake.nix
lrwxrwxrwx 2 root root 202 Dec 31 1969 my_big_file.bin -> .git/annex/objects/05/12/SHA256E-s104857600--f6e654508eac102f1efecae5248ca66ea5193d5edf86c895843188d06deff947.bin/SHA256E-s104857600--f6e654508eac102f1efecae5248ca66ea5193d5edf86c895843188d06deff947.bin
The symbolic link is broken because, well, files inside .git folders are not copied over, which is to be expected.
I then tried to unlock the file (see here for more info) with git annex unlock, then git commit. Now the store looks like this:
❯ ls -la
total 16583
dr-xr-xr-x 2 root root 5 Dec 31 1969 .
drwxrwxr-t 6826 root nixbld 30212 Feb 26 00:18 ..
-r--r--r-- 4 root root 1001 Dec 31 1969 flake.lock
-r--r--r-- 5 root root 538 Dec 31 1969 flake.nix
-r--r--r-- 2 root root 104 Dec 31 1969 my_big_file.bin
And now my_big_file.bin is a git-annex pointer file:
/annex/objects/SHA256E-s104857600--f6e654508eac102f1efecae5248ca66ea5193d5edf86c895843188d06deff947.bin
To conclude, I think annexed files will NEVER be available for flake eval.
I think it would be similar for git-lfs: the files are not available for flake eval, because one of the main goals of both git-annex and git-lfs is to NOT add large files directly to git, and only the git part of the flake gets copied.
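To tell whether a file that ended up in the store is real content or just a pointer like the one shown above, a small shell check can help. This is a heuristic sketch: the helper name is_annex_pointer is hypothetical, and it assumes that pointer files start with the /annex/objects/ prefix seen in the example.

```shell
# Hypothetical helper: succeed if FILE looks like a git-annex pointer file,
# i.e. its first 15 bytes are "/annex/objects/" as in the example above.
is_annex_pointer() {
  local prefix
  # Read only the prefix so large binary files are not scanned in full;
  # NUL bytes are dropped because shell variables cannot hold them.
  prefix=$(head -c 15 "$1" 2>/dev/null | tr -d '\0')
  [ "$prefix" = "/annex/objects/" ]
}

# Assumed usage:
#   is_annex_pointer my_big_file.bin && echo "pointer only, content not present"
```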
Still trying to work around this issue, I just found out that all nix subcommands feature an --impure flag.
What if, when running under nix develop --impure, nix resolves paths such as ./. to the local flake directory instead of copying it to the store and resolving it to there?
That would solve all problems here, and still be predictable because we're passing --impure explicitly.
Of course, this behaviour should be the same for all other nix commands...
WDYT?
Edit: The feature requested below has been implemented in https://github.com/NixOS/nix/pull/6530/commits/cbade16f9ef1e06b40b379863556157b6222a13b. See also: https://github.com/NixOS/nix/pull/6530#issuecomment-1262580303.
A way to greatly improve the flake developer experience is to allow evaluation from local sources not only for the main flake, but also for arbitrary local inputs.
Very often, local Nix development is spread out over multiple flakes. Examples:
In these cases, the quickest way to evaluate changes in a library flake via a client flake is:
nix eval/build client-flake#output --override-input my-lib /dev/my-lib
But even if the client flake is evaluated from the local source, as proposed by this issue, the library flake would still be copied to the store.
In addition to fixing this issue, add a flag like --local-input <flake input path> to enable the same local evaluation mode for flake inputs that have a local source.
Example:
nix build --local-input my-nixos-modules .#homeserver.vm
That's already supported with --override-input. I usually do, for example:
nix flake check --override-input 'poetry2nix' ../poetry2nix/
However it's a valid use case for lazy moving to store indeed.
Their point is that it shouldn't be moved to the store lazily but that it shouldn't be moved at all.
When I'm hacking on something, I want Nix to evaluate the flakes as they are on-disk (just like the nix- tools do), not the current state of the on-disk git repo copied to the Nix store.
Yes, that's not an idealistic pure hermetic evaluation. I don't need or want that when I'm hacking; I need quick feedback cycles.
When actually deploying things productively (i.e. nixos switch or boot), then I want hermetic eval and don't mind actually committing my stuff or waiting for a copy to be made etc. (Though even there I might just want to quickly deploy the state on-disk to a test profile.)
@yaji, --override-input is indeed much more convenient than the methods I described. I've updated my post.
I don't need or want that when I'm hacking
Or when using git-crypt or similar systems. In this case I introduce an impurity into the local checkout (decrypting the secrets), but I'm totally fine with it because I obviously don't want anyone else to reproduce my secrets. Copying the local checkout to the Nix store is also no good because it would expose the secrets to everyone on the system.
My initial expectation for Nix flakes was to take the environment (NIX_PATH, channels, etc.) out of the picture, making packages/NixOS configurations self-contained (in a file/directory). They actually go further than that by tying the evaluation to version control, which makes sense in most cases, but it has several unintended consequences, as this issue shows. I don't know how feasible it is to implement this mode of evaluation, but it seems a necessity. Personally I'll never be able to move my configurations to flakes without this.
Here is a simple approach that I have been using for a while to avoid copying the whole project to the nix store every time I commit a change to the project and want to enter a development shell.
First create a new directory in the project, called .nix for example, and add this directory to the .gitignore of the main project.
Then, create hard links of the outer project's flake files into the .nix folder and initialize a new git repository there. Now every time you want a simple development shell for hacking, you can cd into the .nix directory and run nix develop in there (bringing the necessary tools into the environment) and just cd .. back to the main project and do the development from there. Depending on the project and the files referenced in flake.nix, some other files like Cargo.toml or requirements.txt might also be required to be hard-linked into the .nix directory.
.
├── Cargo.toml
├── Cargo.lock
├── flake.nix
├── flake.lock
├── .nix
│   ├── .git
│   ├── Cargo.toml
│   ├── Cargo.lock
│   ├── flake.nix
│   └── flake.lock
├── .gitignore
├── .git
├── src
│   ├── foo
│   └── bar
│       └── baz
I don't often change my project's flake.nix files across branches, but you could easily set up a post-checkout hook in your main project's .git/hooks directory with the following content for example:
#!/nix/store/dnd.../bin/bash
ln -f flake.nix .nix/flake.nix
Now every time you change to a branch that has a different flake.nix, the flake.nix hard link in your .nix directory also gets updated. A similar procedure can be done for other types of hooks.
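The linking step of this workaround can be wrapped in a small shell helper. This is a sketch under assumptions: the name link_into_nix_dir is hypothetical, the files are expected to sit in the project root (no subdirectories), and .nix is expected to be listed in .gitignore and to hold its own Git repository, as described above.

```shell
# Hypothetical helper: hard-link the given files from the current directory
# into ./.nix, replacing any links that already exist there.
link_into_nix_dir() {
  local f
  mkdir -p .nix || return 1
  for f in "$@"; do
    ln -f "$f" ".nix/$f" || return 1
  done
}

# Assumed usage from the project root:
#   link_into_nix_dir flake.nix flake.lock Cargo.toml Cargo.lock
#   cd .nix && nix develop
#   cd ..    # back in the main project, inside the development shell
```

A hard link follows in-place edits, but it goes stale when a tool replaces the file with a new one, which is presumably why the post-checkout hook re-links after a branch switch.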
Is there any way to let nix build -f . hello support git submodules? Currently nix build -f . hello does not have access to files in a submodule.
Regarding https://github.com/NixOS/nix/issues/3121#issuecomment-1120151921, here's my workaround:
src = builtins.path {
  path = ./.;
  name = "something";
  # Filter out nix files to avoid unnecessary rebuilds
  filter = (path: type: builtins.match ".*[.]nix" (builtins.baseNameOf path) == null);
};
This way I only pass this modified src to derivations and avoid constant rebuilds when hacking on flake.nix.
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/tweag-nix-dev-update-31/19481/1
Relevant link from the discourse post above: https://github.com/edolstra/nix/tree/lazy-trees
@yajo I can't seem to find out how https://github.com/NixOS/nix/issues/3121#issuecomment-1122322745 is actually used in practice. I have a flake in a folder (not a git repository) and there does not seem to be a way to apply this workaround to prevent the folder from being copied into the nix store for every nix run command (the flake does not use any sources in the directory).
If you really need a fix now, you could try the experimental branch https://github.com/NixOS/nix/pull/6530
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/setting-up-a-new-project-with-nix-use-niv-or-flakes/22427/4
Could someone edit the original post and add a link to https://github.com/NixOS/nix/pull/6530? It's a bit hard to find otherwise.
You might not realize you're using local files, accidentally sneak in state,
The current situation solves half of this issue: files known to git (even only via git add -N) are made available in full, not only their staged content. So one can still commit and get a build failure. It is still slow.
So either:
git show HEAD:file for nix files? (might not be useful if nix code isn't predominant)
If we go with the second option, we will need to change the copy to match what will be evaluated:
There should be at least a warning that the whole directory contents are going to be put into /nix/store.
I was surprised when nix develop ended up eating my drive space.
As of now, given that Nix flakes still work by copying to the store (as far as I'm aware), is there any way to make Nix do the copying with hardlinks instead?
Hardlinks wouldn't work, the nix store needs to be read-only immutable files and hardlinks mean permissions are shared and editing one file edits both. If your filesystem supports copy-on-write then that should help, but it won't work if your nix store is on a separate volume (though hardlinks wouldn't work in that case either).
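The difference between a hard link and a copy-on-write copy can be seen with a short shell sketch. The helper name cow_copy is hypothetical, and the sketch assumes GNU coreutils cp, whose --reflink=auto option shares data blocks when the filesystem supports it and otherwise falls back to an ordinary copy.

```shell
# Hypothetical helper: copy SRC to DST, using a reflink (CoW clone) when the
# filesystem supports it and a plain copy otherwise.
# Assumes GNU coreutils cp (for the --reflink option).
cow_copy() {
  cp -a --reflink=auto -- "$1" "$2"
}

# Assumed usage:
#   cow_copy ./big-file /some/other/dir/big-file
# Unlike `ln ./big-file ...`, the result is a separate inode, so its
# permissions and later edits are independent of the original.
```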
In newer versions of Linux, you can reflink across VFS barriers as long as it's the same superblock. Btrfs for example supports this but ZFS does not.
Though I think the big problem with copying is mostly metadata and the associated random access, not the content.
Currently flakes are evaluated from the Nix store, so when using a local flake, it's first copied to the store. This means that
is a lot slower than the non-flake alternative
Ideally, we would copy the flake to the store only when its outPath attribute is evaluated. However, we also need to ensure that it's not possible to access untracked files (i.e. we need to check every file against git ls-files).