NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Impure derivations #520

Open copumpkin opened 9 years ago

copumpkin commented 9 years ago

This might reveal a deep misunderstanding on my part, but as far as I can tell, nix fundamentally divides its derivations into "fixed-output" and "deterministic build", based on the presence/absence of outputHash. I'm wondering if there could be a third type of fundamental building block which could allow limited but trackable nondeterministic behavior. The main example I can think of right now is the new fetchTarball builtin, which has its own magic caching strategy, but you could imagine wanting to pull the latest git revision of something using fetchgit and the like. If you use fetchgit as a fixed-output derivation, you can't always get the latest version. If you have it "lie" and pretend not to be a fixed-output derivation, nix will only ever do the work once and not bother refreshing itself.

If nix supported this third type of derivation, I could imagine something like:

{
  fetchTarball = url: builtins.nondetDerivation {
    builder = ./fetchtarball.sh; # contains the actual download logic
    inherit url;
    cachingStrategy = "hourly"; # Perhaps it could take frequency specifiers like this, which would tell nix to incorporate evaluation time into the store hash, or possibly a more flexible mechanism that I haven't yet thought of
  };
}
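
One possible reading of "incorporate evaluation time into the store hash", as a sketch only. builtins.nondetDerivation above is hypothetical and does not exist in Nix; the snippet below uses the existing impure builtin builtins.currentTime instead, and the names bucketSeconds and cacheKey, as well as the example URL, are made up for illustration.

```nix
# Sketch only: models an "hourly" caching strategy by folding a time
# bucket into a hash. `bucketSeconds` and `cacheKey` are hypothetical names.
let
  bucketSeconds = 3600; # one hour

  # builtins.currentTime is impure: it is the Unix time at evaluation,
  # and it is not available in pure evaluation mode.
  bucket = builtins.currentTime / bucketSeconds; # integer division

  # The key changes once per bucket, so anything derived from it would be
  # considered stale (and refetched) at most once per hour.
  cacheKey = url:
    builtins.hashString "sha256" "${url}@${toString bucket}";
in
  cacheKey "https://example.org/source.tar.gz"
```

Two evaluations within the same hour produce the same key; the first evaluation in the next hour produces a different one. How such a key would actually feed into a store path is left open here, as it is in the proposal.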

Of course, it should be possible for you to take an expression and figure out all sources of nondeterminism in it (much like how this source downloader works) so as to better trust the evaluation.

Another possible feature of interest could be the notion of a nondetDerivation optionally (it's not possible with all sources of nondeterminism, but is obviously desirable) emitting some sort of an "anchor" allowing one to tie the nondeterministic evaluation down to something deterministic. Think how ruby's Gemfile ties itself down to Gemfile.lock (but we'd obviously provide hashes), and how when you fetch a git ref you can "lock it down" by resolving that ref to a hash. Another example is how the NixOS channel mechanism resolves the top-level redirect to a precise channel revision. Such an anchor file could then be maintained as a way to lock down nondeterminism to get reproducible system states, but you could also selectively (or in bulk) update the locked things (much like nix-channel --update) to get newer versions.
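
A minimal sketch of the anchor idea for the git case, assuming the anchor is just a small attribute set recorded next to the expression. The names source, anchor and pinned, the example URL, and the placeholder revision are all illustrative and not part of any existing mechanism.

```nix
# Sketch only: a "moving target" plus an anchor that pins it down.
let
  # The nondeterministic part: a ref whose target changes over time.
  source = {
    url = "https://example.org/repo.git"; # placeholder URL
    ref = "master";
  };

  # The anchor: what `ref` resolved to when it was last updated.
  # In the proposal this would be emitted by the nondeterministic build.
  anchor = {
    rev = "<resolved-commit-hash>"; # placeholder, not a real revision
  };

  # Evaluation only ever sees the anchored form, so it stays reproducible
  # until the anchor is deliberately refreshed.
  pinned = source // anchor;
in
  pinned
```

In later Nix versions an attribute set of this shape (url, ref, rev) is what builtins.fetchGit accepts, so updating the anchor and re-evaluating would play the role that Gemfile.lock plays for a Gemfile.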

A last example is just how magic path references in nix copy things into the store for you. We could retain the built-in syntax, but translate the syntax into implicit invocations of the same nondetDerivation primitive.

Is this too weird? I'm just trying to think of a principled way to track my nondeterminism, and possibly to unify the channel world into pure nix.

TBC: I'm not proposing adding more nondeterminism to the system. Just want to be able to track/unify the existing stuff better.

copumpkin commented 9 years ago

cc @edolstra @shlevy

copumpkin commented 9 years ago

To make things even weirder, hydra could use this for its job specification with nondeterministic calls to fetchgit and fetchsvn.

copumpkin commented 9 years ago

Does nobody have any comments? I can flesh out the idea more if it would help. I think it could be a pretty cool way to manage the (limited but often necessary) pieces of mutable state in a Nix-based system.

vcunat commented 9 years ago

To sum up, these derivations would:

Do I get this right?

Current status of code generators?

I'm certain there are already general tools that prefetch latest source and update hashes in *.nix files – currently I don't see a distinct advantage in having this built in. For example, @MarcWeber has these REGION AUTO UPDATE things IIRC, and there may be others. Putting the nondeterministic part into a separate tool seems easier to update exactly those things you want and let others locked down (shell-scripting your most common use cases).

Ericson2314 commented 8 years ago

I talked about some similar things in my somewhat-recent fetchgitLocal PR: https://github.com/NixOS/nixpkgs/pull/10176#issuecomment-146610542. I think the interplay between the two derivations (a trick that predates my PR to be fair) is like the "anchoring" you mention.

Ericson2314 commented 8 years ago

From these issues with my new fetchgitLocal (https://github.com/NixOS/nixpkgs/issues/10873), I am starting to think we need non-deterministic packages which run under the current user, to generalize things like putting private directories in the store.

copumpkin commented 8 years ago

I'll probably see if I can drum up some interest about this (and flesh out my proposal) at NixCon in Berlin. @Ericson2314, will you be there?

Ericson2314 commented 8 years ago

That would be great! Unfortunately, school will keep me away from NixCon, but let me know how it goes.

copumpkin commented 8 years ago

I've been tinkering with this recently, and might be able to put up a PR for a hypothetical implementation (subject to lots of implementation and design feedback) in the next week or so, if I get some time.

Edit: turned out to be more complicated than expected :(

copumpkin commented 8 years ago

Tagging https://github.com/NixOS/nix/issues/904 for posterity.

shlevy commented 7 years ago

@edolstra I'm considering working on this. Is there any chance I can get some assurance of a timely review and/or permission to merge myself before I put a large amount of work in?

copumpkin commented 7 years ago

I posted this in another ticket:

part of the reason I'm so interested in #520 is that I think that could be a cool model for channels as well as packages. The main properties I want out of a nondeterministic derivation are the ability to (somehow, programmatically) define how often I want it to update, and (most of the time) give myself a way of pinning to a particular version. Think of Ruby's Gemfile and Gemfile.lock distinction: Gemfile (on some level) defines an update policy (via bounds on package versions), and Gemfile.lock is an instantiation of that policy to exact versions that will be reproducible.

Think of what we want from channels:

1. I want to point to e.g. github.com/nixos/nixpkgs-channels/tree/nixpkgs-unstable (basically an update policy; I want to update at most as often as the branch updates).
2. The branch can be resolved to an exact hash for later reproducibility.
3. I want to know explicitly that somewhere in my (otherwise highly deterministic) Nix evaluation, a possibly nondeterministic "moving target" is involved, and be given the opportunity to lock it down to something that point 2 produces.

I don't know of a great UI for this, but here's one not-so-great one that might inspire other ideas:

- When you write a nondeterministic derivation, you generate a UUID and paste it into the expression source.
- Any evaluation of that nondeterministic derivation will get added to a top-level list of sources of nondeterminism in your expression, indexed by the associated UUID, and it's very clear when you evaluate an expression that your nondeterminism is included (so like when the top-level list of things to build and things to download from cache is printed, it could include a third category for these).
- Any build of a nondeterministic derivation gets a sandbox that allows network access.
- The interface could (at first at least) basically be one that gives you a little "shim" to decide what to feed into a fixed-output derivation. That is, nondeterministic derivation = deterministic FO derivation + "decide (and record) which version to download". That would accommodate many common cases of git hashes and the like.
- Nix maintains a central registry on your machine of current resolved UUIDs, and lets you request that a particular UUID be updated (this is the equivalent of nix-channel --update).

Then this mechanism can be used for channels, Hydra sources (don't have to make VCS into a first-class notion in Hydra anymore), packages that have sensible update semantics, and so on.
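
The registry part of this can be modelled as a pure Nix expression, which may make the shape of the idea easier to discuss. This is only a sketch under assumptions: the registry attribute set, the resolve and update helpers, the UUID, and the placeholder revisions are all hypothetical, and nothing like this exists in Nix today.

```nix
# Sketch only: a pure model of a "registry of resolved UUIDs".
let
  # Hypothetical registry: UUID -> what the source last resolved to.
  registry = {
    "3f1c2a9e-0000-4000-8000-000000000001" = {
      rev = "<resolved-rev-placeholder>";
    };
  };

  # A nondeterministic source carries its UUID and its update policy
  # (here: "follow this branch").
  nixpkgsSource = {
    uuid = "3f1c2a9e-0000-4000-8000-000000000001";
    url = "https://github.com/nixos/nixpkgs-channels";
    ref = "nixpkgs-unstable";
  };

  # The "shim": decide what to feed into the deterministic fetch by
  # looking the UUID up, and fail loudly if it was never resolved.
  resolve = src:
    if builtins.hasAttr src.uuid registry
    then src // registry.${src.uuid}
    else throw "source ${src.uuid} has not been resolved yet";

  # Updating one UUID yields a new registry; all other entries stay pinned.
  update = uuid: rev: registry // { ${uuid} = { inherit rev; }; };
in
  {
    pinned = resolve nixpkgsSource;
    afterUpdate = update nixpkgsSource.uuid "<newer-rev-placeholder>";
  }
```

Keeping resolve separate from update mirrors the split described above: evaluation only reads the registry, and changing an entry is an explicit, per-UUID action.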

I realize this is still pretty sketchy and probably doesn't belong in this ticket, but I do think something in this direction would be a killer feature, allowing us to unify the deterministic Nix world with changing surroundings in a relatively painless manner.

chris-martin commented 6 years ago

So Shea told me about fetchgit today and it seems rather upsetting. It seems convenient sometimes, but is there going to be a config option or CLI flag or something to turn determinism back on? When I run a build, how will I be able to tell whether it's a deterministic one or one with unpinned fetches?

copumpkin commented 6 years ago

Yeah, there's --pure as of a couple of days ago, I think. It should turn off all sources of impurity.

shlevy commented 6 years ago

Internally at Target we expose fetchGit through an interface that enforces specifying either a revision or a tag (we map tags to tags/${tag} in the ref and they're only trusted for internal repos our team controls)
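
A wrapper of that kind could look roughly like the following. This is a guess at the shape, not the interface actually used at Target: the argument names are assumptions, and the tags/${tag} mapping is copied from the comment above (whether that ref form resolves may depend on the Nix version).

```nix
# Sketch only: a fetchGit wrapper that refuses unpinned fetches.
# The argument names `rev` and `tag` are assumptions for illustration.
{ url, rev ? null, tag ? null }:

# Reject calls that give neither a revision nor a tag, so a bare
# "latest commit on the default branch" fetch cannot slip through.
assert rev != null || tag != null;

builtins.fetchGit (
  if rev != null
  # An exact revision is fully pinned.
  then { inherit url rev; }
  # A tag is only as trustworthy as the repository that owns it.
  else { inherit url; ref = "tags/${tag}"; }
)
```

Saved as a file, it would be called along the lines of `import ./fetch-git-pinned.nix { url = "..."; rev = "..."; }`, where the file name is again made up.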

edolstra commented 6 years ago

The motivation why fetchGit doesn't require a hash is that file system access doesn't require a hash either. So evaluation was already impure at that level (you could edit a Nix expression and get a different result).

Ericson2314 commented 6 years ago

@edolstra does --pure affect filesystem access? (E.g. only paths already in the store, etc.)

chris-martin commented 6 years ago

It seems like --pure maybe should also prevent accessing file paths that are outside of some designated root directory.

copumpkin commented 6 years ago

Why yes, my build does rely on /run/keys, why do you ask?

shlevy commented 6 years ago

--pure disallows filesystem access (except possibly in store). #1816 would reallow it if you know the hash in advance.

Nadrieril commented 6 years ago

Don't __impure derivations (https://github.com/NixOS/nix/commit/647291cd6c7559f68d49a5cdd907c2fd580790b1) resolve most of the issues here? For @copumpkin's grand idea (which I find super cool), we could allow channels to point to an impure nix derivation instead of a URL. Then we can reuse the channels mechanism, and in particular rollbacks, for impure derivations. And that would only need a relatively small change to nix.

deliciouslytyped commented 5 years ago

Is there any hope of seeing __impure merged into the main branch any time soon?

Ericson2314 commented 4 years ago

@deliciouslytyped ca derivations make __impure a lot better, so we should wait for that.

Ericson2314 commented 3 years ago

> ca derivations make __impure a lot better, so we should wait for that.

And now we have them! (https://github.com/NixOS/nix/issues/4087) So let's resurrect this. Should be quite easy, actually.

Ericson2314 commented 3 years ago

Looking at https://github.com/edolstra/nix/commit/690e06b58e19020d69c9fe8bd2d06b45c14f65b5, here are some notes:

So let's just wait for https://github.com/NixOS/nix/pull/4056 to land, and then we basically "do it again" for this!

CC @regnat

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

tomberek commented 3 years ago

still interested

stale[bot] commented 2 years ago

I marked this as stale due to inactivity.

MagicRB commented 2 years ago

Still interested

tomberek commented 2 years ago

Still interested

Does https://github.com/NixOS/nix/pull/6227 resolve your use-case?

MagicRB commented 2 years ago

Wow that was a fast response 😆 and yes it does! I want to use them in Hydra actually. Thanks!

Ericson2314 commented 2 years ago

Let's repurpose this to be a tracking issue for the now-merged unstable feature!

MagicRB commented 2 years ago

Let's! I'll start playing with impure drvs soon enough. If I hit any issues I'll report back here.

Ericson2314 commented 2 years ago

I don't have perms to edit the issue or change its title, but impure-derivations is the name of the experimental feature added in the PR @tomberek linked.

melvyn2 commented 2 years ago

I'm not able to use impure derivations at all:

nix-build: src/nix-build/nix-build.cc:594: void main_nix_build(int, char**): Assertion `maybeOutputPath' failed.

This happens with any derivation that has __impure = true, so it is easily reproducible with the example in #6227:

{ pkgs ? import <nixpkgs> {}, ... }:
pkgs.stdenv.mkDerivation {
  name = "impure";
  __impure = true; # marks this derivation as impure
  #outputHashAlgo = "sha256"; # optional, default is sha256
  #outputHashMode = "recursive"; # optional, default is recursive
  buildCommand = "date > $out";
}

bryanhonof commented 2 years ago

@melvyn2 I also get that error when running nix-build. But if I use the new cli, nix build --impure ..., it does seem to work?

/tmp/tmp.TMbuOx5fGy 
❯ cat default.nix 
{ pkgs ? import <nixpkgs> { } }:
pkgs.stdenv.mkDerivation {
  name = "impure";
  __impure = true;
  buildCommand = "date > $out";
}

/tmp/tmp.TMbuOx5fGy 
❯ nix build --impure --file default.nix

/tmp/tmp.TMbuOx5fGy 
❯ cat result 
Wed Aug  3 10:03:03 UTC 2022

/tmp/tmp.TMbuOx5fGy 
❯ nix-build 
this derivation will be built:
  /nix/store/2ylp1hynhl3902kjzii9ynvby9ljizwp-impure.drv
resolved derivation: '/nix/store/2ylp1hynhl3902kjzii9ynvby9ljizwp-impure.drv' -> '/nix/store/sm5kqqpsr9v7hk7hdxmhl4kxnd2mc3a6-impure.drv'...
building '/nix/store/sm5kqqpsr9v7hk7hdxmhl4kxnd2mc3a6-impure.drv'...
nix-build: src/nix-build/nix-build.cc:594: void main_nix_build(int, char**): Assertion `maybeOutputPath' failed.
Aborted (core dumped)

physics-enthusiast commented 5 months ago

Would these kinds of impure derivations be permitted in flakes pure eval mode?

tomberek commented 1 month ago

> Would these kinds of impure derivations be permitted in flakes pure eval mode?

Yes, this would be safe because the outPath is not deterministic and the eval itself is not impure, only the build-phase.