copumpkin opened 8 years ago
I'd prefer to use substitutes instead of sub-channels, i.e. solve 3. and thereby bypass 2.ii.
Different people will be interested in different autogenerated parts. A least intrusive mode might just abort during evaluation if the autogenerated data isn't present locally (with a useful message); users could then regenerate it explicitly for particular nixpkgs revisions, and/or there could be hooks for doing so during `nix-channel --update`. Of course, the regeneration would be implemented as just a Nix build, and typically simply substituted.
For my personal usage, it's acceptable for nix to fetch not-too-big data during evaluation but not to auto-run some complex generator.
If @shlevy's work is hard to rebase or review, then yes I would do the abort-generate thing in a heartbeat to unstick this.
Note that unless we are OK with pegging a specific version of hackage/npm/whatever (and updating that manually when we want to update it), we'll have to go beyond just improving IFD and abandon the idea that Nix *evaluation* is deterministic (while keeping Nix *building* deterministic). One possible way to do that is to have some way to specify non-deterministic inputs to the top-level eval: Hydra would extract those as inputs, command-line Nix would fetch the latest, and arbitrary callers could pass in fixed revs or fetch the latest as they wish.
(I have some ideas on how to do that well, if I get a pre-approval from @edolstra I can get started on that after the perl stuff is done)
I was proposing to peg to an exact version specifically, since I don't think we should abandon determinism (except in my #520 thing, which should behave quite differently). My ideal would be as follows:
To clarify, I'd then see the periodic updates that @peti makes to haskellPackages today as either bumping the fixed-output derivation that produces the input to cabal/hackage2nix (updating the git rev and sha256), bumping the source code for cabal/hackage2nix, or possibly just changing some of the overrides. That would turn the massive diffs into tiny diffs that explain in fairly minimal terms what changed.
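For illustration only (the repo and field values below are placeholders, not the actual nixpkgs mechanism), such a fixed-output pin could look like:

```nix
# Hypothetical pin: an update to haskellPackages becomes a two-line diff
# here (bump rev, bump sha256) instead of a huge regenerated one.
{ fetchFromGitHub }:
fetchFromGitHub {
  owner = "commercialhaskell";
  repo = "all-cabal-hashes";   # snapshot of the Hackage index
  rev = "<pinned-rev>";        # bumped on each update
  sha256 = "<pinned-sha256>";  # bumped alongside rev
}
```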
Please note that we have a concrete use case in Nixpkgs master today that can serve as an example of the problem we need to solve. Users can generate Nix expressions for any Haskell package on the fly using the `callHackage` function. For example:
    $ nix-shell -p 'haskellPackages.ghcWithPackages (p: [
        (p.callHackage "hsdns" "1.6.1" {})
      ])' --run "ghc-pkg list hsdns"
    /nix/store/zp1j6fz2nk7g07qvizx6lzym6lnhn7l2-ghc-8.0.1/lib/ghc-8.0.1/package.conf.d
        hsdns-1.6.1
The expression used to build that `hsdns` library is generated automatically and imported at evaluation time:
    $ cat /nix/store/gz6qlvwdhm1b9v64i3xkzhzfd0g6r3qb-cabal2nix-hsdns-1.6.1/default.nix
    { mkDerivation, adns, base, containers, network, stdenv }:
    mkDerivation {
      pname = "hsdns";
      version = "1.6.1";
      sha256 = "64c1475d7625733c9fafe804ae809d459156f6a96a922adf99e5d8e02553c368";
      libraryHaskellDepends = [ base containers network ];
      librarySystemDepends = [ adns ];
      homepage = "http://github.com/peti/hsdns";
      description = "Asynchronous DNS Resolver";
      license = stdenv.lib.licenses.lgpl3;
    }
This feature allows us to support, basically, all of Hackage without having to check all of Hackage into the Nixpkgs repository. The only drawback is that -- as of now -- Hydra won't build a single binary that depends on this feature.
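The actual `callHackage` implementation lives in nixpkgs; the following is only a rough sketch (with illustrative names) of the IFD pattern it relies on, namely running cabal2nix at evaluation time and importing the result:

```nix
# Rough sketch, NOT the real nixpkgs code: generate an expression at
# evaluation time and import it (import-from-derivation).
{ runCommand, cabal2nix, haskellPackages }:
let
  # This derivation is built *while evaluating*, which is exactly what
  # restricted mode and the Hydra evaluator trip over.
  generatedExpr = runCommand "cabal2nix-hsdns-1.6.1" { } ''
    ${cabal2nix}/bin/cabal2nix cabal://hsdns-1.6.1 > $out
  '';
in
  haskellPackages.callPackage (import generatedExpr) { }
```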
[aside question]
@peti Does `callHackage` look at the `stack.yaml` file of the package? I have just tried:
    nix-shell -p 'haskellPackages.ghcWithPackages (p: [ (p.callHackage "language-puppet" "1.3" {}) ])' --run "ghc-pkg list language-puppet"

and get:

    Setup: Encountered missing dependencies:
    http-client ==0.5.*, servant ==0.8.*, servant-client ==0.8.*
This is correct because these deps are not available in Stackage nightly yet, but they are in Hackage. (PS: building `language-puppet-1.2` works flawlessly, which is quite impressive.)
@PierreR, I'd rather not get into discussions here that are off-topic for the issue since I'm worried it might derail the thread.
@peti AFAIK, Hydra will build such packages. The Hydra evaluator does not prevent import-from-derivation, in fact it's used by the RPM/Debian closure generation functions. However, it's probably not a good idea to use this "feature", since the build (including its dependencies) will be done by the evaluator rather than the queue runner.
Sounds like the next step is sending such derivations to the queue runner?
@edolstra: it's been reported NOT to work due to restricted mode: https://github.com/NixOS/nixpkgs/issues/16130#issuecomment-226784880
The problem I encountered (https://github.com/NixOS/nixpkgs/issues/15480) is that it doesn't allow network access during evaluation, and therefore cannot retrieve the repo with hashes.
I'd really like to see this worked out. From reading around the associated issues, the most egregious problem with restricted mode is that while network access is allowed in fixed-output derivations used normally, it is not allowed in fixed-output derivations that are being imported, right?
I consider removing this restriction priority number 1 here, because it has no impact on purity or any other such downside.
(@peti I fixed the typos if that was what was confusing — sorry there were so many in the first place.)
I reacted with confusion because I felt your summary of the situation did not represent very well what had been discussed in the thread.
@peti Oh! Well, I wrote my summary because I didn't see one so far, and this was the best I could come up with. Is the restricted-mode issue that prevents `callHackage` binaries from being built something different?
Putting in a big 👍 for this. opam2nix is probably not yet in a state to be merged in nixpkgs proper, but this issue is one of my big worries - reworking all of my code to live inside nixpkgs and manage it there instead of in its own tree is a big switch to make, and it's not clear how I'd maintain both going forwards.
I assume any big self-contained work on bulk-importing third party language dependencies could benefit from this approach, instead of having to be managed in-tree.
Garbage collection seems to delete import-from-derivation sources. :confused:
Are there any workarounds applicable right now or do we have to wait for a fix in Nix?
@michalrus They get gc'd during an evaluation?
@shlevy, no, no. I have a few `import (import (import …`s in a Haskell project (built by Cabal in a `nix-shell`) and also, in `configuration.nix` of that developer notebook:
    {
      nix.gc = {
        automatic = true;
        dates = "daily";
        options = "--delete-older-than 30d";
      };
    }
After I added this auto GC, each morning when starting nix-shell for that project, I have to redownload all Nixpkgs versions that it's pinning (they got GC'd during the night) and some other sources that we add by `callCabal2nix (fetchFromGitHub …)`.
That `nix-shell` does create its own GC root, being called like `nix-shell --add-root /home/m/thatProject/dist/nix/shell.drv --indirect --pure --run $cabalCommand`, which is noticeable, because Nix is only re-downloading sources, and not rebuilding the deps. So the final deps in binary form are indeed cached. I also set `gc-keep-outputs = true` in `/etc/nix/nix.conf`.
This feels a bit awful. :C What if I want to go for a semi-vacation to a place where there’s no or very limited Internet, and I forget to turn off that auto GC the day before? :sob:
Please, these are real use cases, however grotesque they might seem. :pray:
I see. There's currently no easy way to do this, sadly; you'll need to manually lift up all the imported derivations and add them as roots somewhere. It would be a good feature, though.
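A minimal sketch of that manual lifting, with illustrative names (`generated` is a derivation whose output gets imported): referencing the derivation from something that already has a GC root — here an environment variable of a `mkShell` — keeps the imported source in the rooted closure.

```nix
# Sketch only: lift an import-from-derivation source into a rooted closure.
{ pkgs ? import <nixpkgs> { } }:
let
  # the derivation whose output we import (IFD)
  generated = pkgs.runCommand "generated.nix" { } ''
    echo '{ answer = 42; }' > $out
  '';
  value = import generated;  # forces the build during evaluation
in
pkgs.mkShell {
  # referencing `generated` here pulls it into the shell's closure, so a
  # GC root on the shell (e.g. via --add-root) also protects the
  # generated file
  GENERATED_SOURCE = generated;
  shellHook = ''echo answer is ${toString value.answer}'';
}
```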
Okay, thank you for this idea. :cry:
I also make heavy use of import-from-derivation, so this would be very nice to solve/fix (I hadn't noticed this myself since I garbage collect so infrequently).
> There's no easy way currently to do this, sadly, you'll need to manually lift up all the imported derivations to add them as roots somewhere. Would be a good feature though.
It seems like there are two conflicting desires here:
For example, we can use `import (runCommand ...)` to generate a `.nix` file and import it. The resulting Nix value will not depend on that `runCommand` invocation, i.e. it will not affect any subsequent hashes. This is useful if multiple `runCommand` derivations produce the same output.
I personally use this to check for the latest commit in a git repo: I put `builtins.currentTime` in the environment of the `runCommand` derivation so that a new derivation gets built each time. Of course, most of the time there have been no new commits, so the generated `.nix` file will stay the same, and any derivations which use that git repo will have the same hash as before and be taken from the cache.

If imported values did depend on the derivations that generated them, then my projects would keep getting rebuilt all the time, despite having no changes, due to the `currentTime` propagating through the hashes.
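The latest-commit trick described above might look roughly like this (the repo URL and names are illustrative, and this assumes the build is allowed network access):

```nix
# Sketch of the pattern: the runCommand is rebuilt on every evaluation
# (builtins.currentTime changes), but the *imported value* -- and hence
# all downstream hashes -- only changes when the repo actually moves.
{ pkgs ? import <nixpkgs> { } }:
let
  latestRev = pkgs.runCommand "latest-rev.nix" {
    # a changing env var forces a fresh build on each evaluation
    now = toString builtins.currentTime;
  } ''
    rev=$(${pkgs.git}/bin/git ls-remote https://example.com/repo.git HEAD | cut -f1)
    echo "\"$rev\"" > $out
  '';
in
  # downstream derivations depend only on the contents of latest-rev.nix
  import latestRev
```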
This is the issue @michalrus has just described. Note that there are two parts to consider: surviving garbage collection when a resulting derivation is referenced by a GC root; and getting garbage collected when those derivations aren't being kept any more.
I don't know too much about the GC mechanism, but I wouldn't want old derivations to build up in the store due to "stale" auto-generated GC roots keeping them alive.
For example, if my user profile depends on a package built using this latest-git-revision function, and then I update to a new profile and GC the old one, I would want that old `.nix` revision and its dependencies (e.g. old versions of git, etc.) to get GCed at the same time.
@Warbo ++
@shlevy, it also seems that neither `overrideCabal` nor `callCabal2nix` causes its inputs to survive GC. E.g. if I have:

    haskellPackages.override {
      overrides = self: super: {
        cryptonite = haskell.lib.overrideCabal (super.cryptonite) (…);
      };
    }

… then `super.cryptonite` will be GC'd.
Similarly for:

    haskellPackages.override {
      overrides = self: super: {
        steeloverseer = self.callCabal2nix "steeloverseer" sources.steeloverseer {};
      };
    }

… here, after GC, `sources.steeloverseer` won't be re-downloaded (because I add everything in `sources.*` to a GC root), but all build deps of SteelOverseer will be re-downloaded. :sob:
Perhaps IFD could "infect" .drvs derived from them, without changing the output hashes, somehow?
@shlevy I think that's a slightly different sort of infection that I also wouldn't mind contracting 😄
But the same mechanism could be used.
I like the look of #1052. Note that it can be useful to generate-and-import more than just derivations, e.g. we might generate an attrset of names/hashes from some other tool (cabal, npm, etc.), or we might have a script which checks for some condition and outputs a boolean; etc.
The ability to "poison" arbitrary values would presumably require trickier changes in Nix. Maybe an easier approach would be to only support attrsets: this would include derivations "for free", and for other values we can just wrap them up, e.g. to get a boolean we could wrap it up as `{ my-condition = true; }` and the import might "poison" this set to get something like `{ my-condition = true; auto-generated-poison-attrs = { imported-from-dependencies = [ /nix/store/... ]; ... }; }`.
Re the GC issue above: here is the fix for this problem.
    { pkgs, ... }:
    {
      # This function makes a copy of a package and adds a `.nix_runtime_deps_references`
      # file containing links to other packages.
      # It can be used to prevent garbage collection of packages that were
      # generated/downloaded during import-from-derivation (IFD), making them last
      # as long as the imported package does.
      # See https://github.com/NixOS/nix/issues/954#issuecomment-365281661 for more.
      # See https://stackoverflow.com/questions/34769296/build-versus-runtime-dependencies-in-nix
      # for how runtime dependencies work.
      # Example:
      # {
      #   environment.systemPackages =
      #     let
      #       packageSrc = fetchFromGitHub { owner = ..., .... };
      #       package = import "${packageSrc}/release.nix";
      #     in
      #     [
      #       package
      #     ];
      # }
      # Here `package` will not be garbage-collected on the next
      # `sudo nix-collect-garbage -d`, but `packageSrc` will be.
      # But with
      # {
      #   environment.systemPackages =
      #     let
      #       packageSrc = fetchFromGitHub { owner = ..., .... };
      #       package = import "${packageSrc}/release.nix";
      #       improvedPackage = addAsRuntimeDeps [packageSrc] package;
      #     in
      #     [
      #       improvedPackage
      #     ];
      # }
      # neither `package` nor `packageSrc` will be garbage-collected.
      # TODO:
      #   make addAsRuntimeDeps composable with itself by appending links if the drv
      #   already contains nix_runtime_deps_references,
      #   e.g. addAsRuntimeDeps [src2] (addAsRuntimeDeps [src1] drv)
      addAsRuntimeDeps = deps: drv:
        let
          fileWithLinks = pkgs.writeText "fileWithLinks" (
            pkgs.lib.concatMapStringsSep "\n" toString deps + "\n"
          );
          # strip the 32-character hash and the dash from the store path's base name
          drvName = drv: builtins.unsafeDiscardStringContext (
            pkgs.lib.substring 33
              (pkgs.lib.stringLength (builtins.baseNameOf drv))
              (builtins.baseNameOf drv)
          );
        in
        pkgs.runCommand (drvName drv) { } ''
          ${pkgs.coreutils}/bin/mkdir -p $out
          ${pkgs.coreutils}/bin/cp ${fileWithLinks} $out/.nix_runtime_deps_references
          ${pkgs.rsync}/bin/rsync -a ${drv}/ $out/
        '';
    }
Usage example:

    # it's utils, not lib, because nixpkgs lib doesn't depend on pkgs
    pkgs: pkgsOld:
    let
      callUtil = file: import file { inherit pkgs; };
    in
      (callUtil ./addAsRuntimeDeps.nix)

    nixpkgs = {
      overlays = [
        (import ../utils/overlay.nix)
      ];
    };
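For the composability TODO noted in the comments above, one possible approach is to append to any `.nix_runtime_deps_references` the input already carries instead of overwriting it. This is an untested sketch, and `addAsRuntimeDeps'` is a hypothetical name:

```nix
# Hypothetical, untested sketch addressing the TODO above: keep links from
# a previous application and append the new ones, so that
# addAsRuntimeDeps' [src2] (addAsRuntimeDeps' [src1] drv) keeps both sets.
{ pkgs }:
{
  addAsRuntimeDeps' = deps: drv:
    let
      newLinks = pkgs.writeText "newLinks" (
        pkgs.lib.concatMapStringsSep "\n" toString deps + "\n"
      );
    in
    pkgs.runCommand "${drv.name}-with-deps" { } ''
      mkdir -p $out
      # copy the package, including any .nix_runtime_deps_references it
      # already has from an earlier addAsRuntimeDeps' application
      ${pkgs.rsync}/bin/rsync -a ${drv}/ $out/
      # then append the new links rather than overwriting the file
      cat ${newLinks} >> $out/.nix_runtime_deps_references
    '';
}
```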
Given the emphasis of the original issue on Nixpkgs, I think it is worth mentioning https://github.com/NixOS/rfcs/pull/109
See the whole thread at https://github.com/NixOS/nixpkgs/issues/16130 plus the ensuing IRC discussion.
As a quick recap: I and several other people see IFD as a pretty important prerequisite for Nix's continued growth, but there are a few technical issues standing in its way right now that we don't have clean answers for.
What we'd like to be able to do:
What hinders it right now:

- […] `stdenv` or similar during evaluation?
- […] `nixpkgs` evaluation involves, and whether a rebuild will be necessary? Definitely, but we could write tooling...?
- […] `stdenv` itself depends on one of these things? The darwin stdenv uses llvm, which depends on python. It currently doesn't use any pythonPackages, but who knows what might happen in the future?