NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Redefine outputs in terms of language-level "package", not necessarily store-level derivation (RFC-92, and multi-drv packages, docs) #6507

Open roberth opened 2 years ago

roberth commented 2 years ago

Describe the problem

Goals

Currently, packages and derivations are often the same thing, but the lack of a definition and distinction between the two can not continue since RFC 92 (computed derivations, outputOf) and the conflation has unnecessarily made the concept of a multi-derivation package ill-formed / "unthinkable".

Nix doesn't really have a notion of "package". The term is only mentioned in a few places in the code, and only defined in the context of buildenv (i.e. legacy nix-env). This only relates to the usage of derivations in a profile, and therefore does not conflict with a definition of "package".

Nixpkgs on the other hand is all about packages, but it does not define precisely what a package is.

I propose the following definition:

A package is an attribute set with the following attributes:

Notably absent from the definition of a package:

Steps To Reproduce

  1. Define a package where one output comes from a different derivation. You may want to do this to keep derivation dependencies to a minimum (e.g. doc https://github.com/NixOS/nixpkgs/pull/172103 where it would be more desirable for the non-doc outputs not to depend on texinfo).
  2. Be confused about what drvPath should be.
  3. Install the package and note that the output from the non-drvPath output wasn't included.
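A minimal sketch of steps 1 and 2, for illustration only. The helper `mkToy`, the names `hello` and `hello-doc`, and the fixed `system` string are assumptions made for this example and are not taken from Nixpkgs. The per-output attributes are written as plain store path strings, which is the shape discussed for the proposed definition rather than what `builtins.derivation` returns today.

```nix
let
  # Hypothetical toy builder; only evaluation matters for this sketch.
  mkToy = name: derivation {
    inherit name;
    system = "x86_64-linux";  # assumption: any fixed system string will do
    builder = "/bin/sh";
    args = [ "-c" "echo ${name} > $out" ];
  };

  mainDrv = mkToy "hello";      # builds the program
  docDrv = mkToy "hello-doc";   # builds the docs separately
in
{
  type = "derivation";
  name = "hello";
  outputs = [ "out" "doc" ];

  # Step 1: each output comes from a different derivation.
  out = mainDrv.outPath;
  doc = docDrv.outPath;
  outPath = mainDrv.outPath;

  # Step 2: only one drvPath fits here, so docDrv is not represented.
  drvPath = mainDrv.drvPath;
}
```

Anything that consumes only `drvPath` has no way to learn about `docDrv` from this attribute set, which is the gap step 3 describes.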

Expected behavior

Nix represents packages by their outputs and metadata, not the drvPath implementation detail.

nix-env --version output

2.8. Or 2.x really. I would appreciate a major version increase for the (subtle) change in behavior.

Additional context

I came across this problem again in Nixpkgs today and figured I had to share my thoughts. I guess I should turn it into an RFC? I can do that later if y'all agree that we need something like this.

There's also https://github.com/NixOS/nixpkgs/issues/172008 which is really a different problem, but depends on this issue, as this issue defines the interface that current and future Nixpkgs must implement.

I can't change the bug label on this issue. It's really a design issue rather than a bug, so another label would be more fitting. Can I have more permissions on this repo?

vcunat commented 2 years ago

Documentation is quite a frequent problem, I think. Well texinfo above is quite cheap, but you commonly have bigger tools like pandoc. If these docs don't get split into separate derivations, we get more prone to huge "unexpected" rebuilds. So far we often just don't build expensive docs, as many people are used to online resources.

roberth commented 1 year ago

Found this note at DerivedPathBuilt

/*
[...]
 * Note that does mean a derived store paths evaluates to multiple
 * opaque paths, which is sort of icky as expressions are supposed to
 * evaluate to single values. Perhaps this should have just a single
 * output name.
 */
struct DerivedPathBuilt {

Seems like a change worth implementing for this issue. The notion of a single derivation with its outputs seems appropriate at the store level, but up from there, built paths are what matter and there's no reason to tie them to their derivation. By making the suggested improvement, it seems that we get a bit closer to multi-drv packages.

Ericson2314 commented 1 year ago

@roberth in the later RFC 92 patches I do indeed make a SingleDerivedPath so only the last step (baz) of a foo^bar^baz chain is potentially multiple paths. So we can consider using that more. On the other hand the wildcard ^* means we cannot get rid of the multiple one completely (if the DRV is unbuilt yet we cannot resolve the *).

roberth commented 1 year ago

@Ericson2314 I would consider ^ to be more of a "power user" thing, because packages should use outputOf to hide it. I don't think foo^* is useful within the Nix language; at least not with the way we currently represent outputs in the package attrset. This doesn't seem to be a great loss, because the package expression could presumably force the inner derivation to provide all requested outputs, returning an empty directory for output if the inner derivation determines that some output isn't useful. So I don't think ^ affects usability too much. It just makes the CLI-level logic slightly more complicated in a few cases, but nothing too crazy.

I've updated the issue title and description to clarify the goal of the issue.

Ericson2314 commented 1 year ago

@roberth Oh sure, what I just mean is that the comment you referenced above is quite likely one I wrote! :) And the SingleDerivedPath is close to its resolution.

You might want to take a look at https://github.com/NixOS/nix/issues/7261. I agree ^ should not be needed by regular users / we should make computed derivations not need to be used differently. I have just been trying to wrap my head around the plumbing (which is subtle enough!) before we get to the porcelain.

In a way, this issue here could be a joint effort between the Nixpkgs Architecture team and Nix team because of the cross-cutting concerns involved.

Ericson2314 commented 1 year ago

https://github.com/NixOS/nix/issues/7467 somewhat relates to this.

Ericson2314 commented 1 year ago

So currently in the docs we have "store derivation" and "derivation", and what I like about this is it completely decouples the logic:

blaggacao commented 1 year ago

Notably, the proposal [to define a "package" data type that nests derivation(s)] would solve laziness of meta (and passthru). And obsolete nixpkgs's recently added lazyDerivation.

That means, you can then peek, for example, at meta.description without almost certainly risking evaluation of drvPath, as well. In a heavy IFD case and short of any proposed solutions to IFD, that's a heck of a cost.

That means recovering metadata from packages finally gets the competitive pricing it deserves that is more closely related with its true production cost.

roberth commented 1 year ago

meta.outputsToInstall has already set a precedent for the expression-level package to be different from the derivation, but to a "lesser" degree. This issue can be seen as a suggestion to lean into that distinction and make it more useful.

fricklerhandwerk commented 1 year ago

I can't tell why this would have to be a Nix language concept. What precludes Nixpkgs from making that abstraction on top of derivation?

roberth commented 1 year ago

What precludes Nixpkgs from making that abstraction on top of derivation?

The CLI uses drvPath instead of outputs and its corresponding attributes.

I can't tell why this would have to be a Nix language concept.

The language itself remains unchanged. derivation, or even better derivationStrict will keep supporting the output-related parts of these attribute sets.

Ericson2314 commented 1 year ago

@fricklerhandwerk Also check out https://github.com/NixOS/nix/issues/ There is a tension between these two things:

  1. The low level store path installables ought to be as explicit as possible. Stuff like .drv punning is bad for programmatic usage like cat paths | xargs nix blah where we want the same behavior on every store object.

  2. The high level installables we want to be ergonomic for typing, even if it makes their usage more complex.

The easiest way to resolve this tension is probably to cut the cord between them: different idioms for different levels of abstraction, disjoint terminology (package vs derivation).

nrdxp commented 1 year ago

I can't tell why this would have to be a Nix language concept. What precludes Nixpkgs from making that abstraction on top of derivation?

Probably not strictly necessary, but it may be useful for this to be a language level concept. It would essentially encode the same knowledge as a profile, but a posteriori. Since a "useful package" is really the final aim of why we use Nix in the first place, it makes sense for it to be at the root.

It may also have the effect of making our current terminology more intuitive. We have derivations, but what are they derived from exactly? A language level package construct makes the answer tangibly obvious.

Ericson2314 commented 1 year ago

We have derivations, but what are they derived from exactly?

The derivation describes how something is derived, but yes it is an awkward term because "derivation" sounds like we are looking back (posteriori again :)) yet we have to make the plan before we build the thing and then wonder where it comes from. "Recipes" might have been a better word.

Regardless, yes whatever we come up with for the language level should definitely be something we teach and think about.

fricklerhandwerk commented 1 year ago

I agree with the general direction, but have both high-level as well as low-level questions.

  1. How does this (morally) differ from the outputs attribute of a flake?
  2. Why are the outputs in this proposal a list, but tests and devShell are separate attributes?

a "useful package" is really the final aim of why we use Nix in the first place

We may as well be using NixOS or home-manager for that purpose, which essentially piggy-backs on Nix plumbing as if it was a fancy library to do programming with files. To phrase it even more drastically:

  3. Why does Nix have to take care of packages at all, instead of focusing on providing a very clean interface to and implementation of the store and derivation mechanism – and let others (e.g. Nixpkgs) define their own notion of a package on top of derivations?

Maybe we just have to tell a better story around the different architectural layers of Nix to make the distinction clear in order to make these questions go away. There are actually three distinct use cases and mechanisms that build on top of each other. They each have their own, independent semantics combined under one name "Nix":

  1. Build management (Nix store, derivations)
  2. Configuration management (Nix language, flakes in the role of a composition mechanism)
  3. Package management (Nix shell, the notion of installables, flakes in the role of package declarations, ...)

The current narrative (in the manual for instance) barely suggests any distinction between these three, and even our informal way of speaking blurs the lines most of the time. Notably, the bottom layers are clearly independent from the ones above.

Maybe I'm confused about this proposal because it doesn't clearly state the place for which part this change is relevant. @roberth you now clarified that this is really about the package management CLI.

Maybe I'm a bit frustrated that the package management aspect, while it was originally one of the main motivations behind Nix, has historically and currently been covered really well by other tools building on top of the lower layers, and yet we still spend so much energy on centralising it within Nix itself. Maybe this is counterproductive, and I'll stop now.

roberth commented 1 year ago

  1. How does this (morally) differ from the outputs attribute of a flake?

The outputs of a package are about producing files, whereas the outputs of a flake are about providing arbitrary expression-level values.

2. Why are the outputs in this proposal a list, but tests and devShell are separate attributes?

It specifies the existing builtins.derivation behavior rather than a first principles redesign.

3. Why does Nix have to take care of packages at all, instead of focusing on providing a very clean interface to and implementation of the store and derivation mechanism

This project / repository already encompasses multiple layers, roughly represented by the internal libraries. I think you have correctly identified that the build aspects and package management aspects can and probably should be considered to be separate layers.

To what degree we want to act on this possible separation is a matter of practicality and priorities. I believe the other projects are already served quite well by Nix. A possible downside of spinning off the package management is a fragmented user experience. Without a standard for packages, we'd have to deal with incompatibilities and the confusion that it causes.

Maybe this is counterproductive, and I'll stop now.

This was a good conversation to have, but not something we should act on now. We can keep the idea in mind and consider (at times) whether it would apply well in practice.

Ericson2314 commented 1 year ago

@fricklerhandwerk So I think a practical thing we can do re layering is just worry about the store-only installables first. That is a big idea behind https://github.com/NixOS/rfcs/pull/134 for example -- the store only CLI should stand alone. And when in https://github.com/NixOS/nix/pull/7600#issuecomment-1384152027 @edolstra points out that means things might get "out of sync", I say "fine for now!".

fricklerhandwerk commented 1 year ago

The outputs of a package are about producing files, whereas the outputs of a flake are about providing arbitrary expression-level values.

This is a valuable insight that was non-obvious to me. It finally makes a meaningful distinction between flakes (for composing Nix expressions), derivations (units of computation, i.e. build tasks, in the Nix store; subsequently their outputs, i.e. derived store objects), and packages (a set of file[system object]s based on derivations, regardless of their Nix store representation).

This may be in fact a reasonable approach to re-tell the whole flakes story.

blaggacao commented 1 year ago

derivations (units of computation, i.e. build tasks, in the Nix store; subsequently their outputs, i.e. derived store objects)

It's a language level API into the file system.

Call it builtins.kubectl and it would be a language level API into a well known control plane.

They would be same in kind and kin, except file system objects are foreign to the latter.

mstone commented 1 year ago

+1 to the insightful comment above about flakes as a (more) reproducible medium of exchange for nix language expressions and the full range of values they compute.

Also, re the larger layering discussion above, I have a framing suggestion which is: while so far as I know, nix is most commonly used today to help manage the "hierarchy of abstraction of 'running software'" (my term from this note from 2014) at the "builds" and "packages" levels (to use Robert's terminology) via a uniform syntax, semantics, and interface that spans layers, I still find it clarifying to situate its value at those layers by framing it as a set of insights and a derived technology + interface that spans many more of these layers, both higher up, lower down, and "sideways", including:

(Thus: maybe the initial widening of the field of regard is good, and perhaps even more widening is (or can be, for some purposes here) better?)

7c6f434c commented 1 year ago

Since a "useful package" is really the final aim of why we use Nix in the first place

It depends. I care about isolated and predictable — and flexible — more than about «useful» beyond giving me the files of the package inside the outputs.

it makes sense for it to be at the root.

That needs better layering. I believe that if the question is a suitable override in a Nix expression, then weird corner cases in cross-compiled packages or strange toolchains will be fixed. The more is put into Nix itself, the more risk there is to miss some corner cases and make them hard to fix.

Ericson2314 commented 1 year ago

@7c6f434c It may sound like this issue could be baking in yet more policy into Nix, but I would like it to be about baking less policy into Nix. Our current { type = "derivation"; outPath = "...."; drvPath = "...."; } idiom rigidly corresponds to builtins.derivation in a way that I don't think is very good or useful. The goal would be to make it be a more flexible building block, but not a fancy user-facing thing unto itself.

7c6f434c commented 1 year ago
DavHau commented 1 year ago

Outputs should not be dynamic attrs

${output} for each output in outputs: store path string

This introduces an infinite recursion whenever the outputs are computed from other attributes of the package. For example, if the outputs are computed by looking at the source tree, and the source tree is fetched using the package's name & version.

Of course one could add another indirection to prevent that infinite recursion, and specify the name/version elsewhere but it feels like an unnecessary workaround.
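One way to reproduce this kind of recursion in plain Nix, as a sketch under stated assumptions: `fix` is a hand-written fixpoint helper, the version check stands in for "looking at the source tree", and the store paths are placeholder strings rather than real outputs.

```nix
let
  # Hypothetical fixpoint helper; `self` is the finished package attrset.
  fix = f: let self = f self; in self;

  pkg = fix (self:
    {
      name = "hello";
      version = "1.0";
      # Outputs computed from another attribute of the package itself.
      outputs = if self.version == "1.0" then [ "out" "doc" ] else [ "out" ];
    }
    # Dynamic ${output} attributes: the attribute names depend on
    # self.outputs, so building the set already requires the set.
    // builtins.listToAttrs (map
      (output: {
        name = output;
        value = "/nix/store/<placeholder>-${output}";  # placeholder string
      })
      self.outputs)
  );
in
pkg.name
```

Even the innocuous `pkg.name` fails with an infinite recursion error here, because the attribute names of the package cannot be determined without first evaluating the package.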

How about the following modifications to the proposed package attrset:

roberth commented 1 year ago

Non-dynamic outputs is a significant change

For example, if the outputs are computed by looking at the source tree, and the source tree is fetched using the package's name & version.

In a Nixpkgs context, iiuc this works: mkDerivation(finalAttrs: { outputs = f finalAttrs.src; src = g finalAttrs.name finalAttrs.version; }). Problems do arise if you take any of those attributes from higher fixpoints that use the mkDerivation result. So this is not quite as urgent as you suggest, but I agree that the less strict we can make the derivation- or package function, the better. Similar restrictions fundamentally apply to passthru as well, fwiw.
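A self-contained sketch of why the expression above can evaluate, assuming a toy `mkToyDerivation` that is not how Nixpkgs implements `mkDerivation`. The functions `f` and `g`, the `hasDocs` field, the example URL, and the placeholder store paths are all hypothetical. The key assumption is that `finalAttrs` is a fixpoint over the arguments only, so the dynamic output attributes are added after the knot is tied.

```nix
let
  # Hypothetical stand-ins for the `f` and `g` in the expression above.
  g = name: version: {
    url = "https://example.org/${name}-${version}.tar.gz";
    hasDocs = true;
  };
  f = src: if src.hasDocs then [ "out" "doc" ] else [ "out" ];

  # Toy replacement for mkDerivation: the fixpoint covers the arguments,
  # not the resulting package, so static attribute names suffice inside it.
  mkToyDerivation = fn:
    let
      finalAttrs = fn finalAttrs;
    in
    finalAttrs
    // builtins.listToAttrs (map
      (output: {
        name = output;
        value = "/nix/store/<placeholder>-${output}";  # placeholder string
      })
      finalAttrs.outputs);
in
(mkToyDerivation (finalAttrs: {
  name = "hello";
  version = "1.0";
  outputs = f finalAttrs.src;
  src = g finalAttrs.name finalAttrs.version;
})).outputs
```

This evaluates to a two-element list of output names. Replacing `finalAttrs.src` with an attribute taken from the result of `mkToyDerivation` would reintroduce the dependency on the dynamic attribute names.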

Your suggestion does align nicely with the general idea of the JSON guideline.

However, I'm concerned about the practical feasibility and cost/benefit of making this change. The changes to outputs and ${output} would have to be implemented in expressions for this to work. We can make additions to the package definition, but I don't think we should change it to the point where existing packages don't work anymore.

Wrapping outputs with their package attrs is slow (and currently buggy)

For performance, it would be great to get rid of the output selection logic that returns a whole new package. Rewrapping the output into a package is costly, because it has all the expectations of a proper package, making the cost of a correct implementation equal to overrideAttrs rather than //. Optimizing output selection is more of an expression / ecosystem change, as I don't think Nix would have to care much about what else is in an output except the string coercion attribute / attributes. However, this ecosystem change can be combined with the implementation of the new-style outputs.

roberth commented 11 months ago

meta and "passthru" strictness, data model, what should be top level?

For what it's worth, meta is wrong from a strictness perspective. pkg.meta makes the value of meta strict in pkg, where pkg is a non-trivial computation that may not even work.
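A small sketch of the strictness issue, with hypothetical names throughout: `expensive` stands in for a package body that may fail or be costly (for example because of IFD), and `drv` is an illustrative attribute name rather than an existing convention.

```nix
let
  # Stand-in for a computation that may not even work.
  expensive = throw "package body was evaluated";

  # meta is merged into the result of the computation, so reaching meta
  # forces the computation first.
  strictPkg = expensive // { meta = { description = "A toy package"; }; };

  # The package is an attribute set literal; the computation sits behind
  # an attribute and is only forced when that attribute is used.
  lazyPkg = {
    meta = { description = "A toy package"; };
    drv = expensive;
  };
in
{
  lazy = lazyPkg.meta.description;
  strict = (builtins.tryEval strictPkg.meta.description).success;
}
```

Here `lazy` evaluates to the description string, while `strict` is `false` because reading `meta` had to evaluate the failing package body first.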

It could be argued that a better representation of the (human) package concept is to start with meta, and perhaps have an instantiatable derivation inside of it, inside an attribute. However, this would be unpleasant to use. Perhaps the strictness issue is better fixed by making sure that the package attrset is always cheap to compute. This means recognizing typical use cases, providing a standard attribute for each use case, which is always allowed to exist, but may be null. This way, a package can always be defined with a function body that's an attribute set literal, without // and without dynamic attributes. Although it's not pretty, it works around the strictness problem rather well, solving such issues as

fricklerhandwerk commented 11 months ago

An attribute set for a package is reasonable, and that's also what the flake schema goes for. But all this is an issue for Nixpkgs, because that's where Nix-language-level metadata is currently lumped together with build configuration, and where a more scalable convention should be established first. There is little Nix can (or should, IMO) do about that.

The recently renewed design discussion around lockfiles could offer a place for statically displaying outputs (including metadata) as well, so they don't require computation to determine on the consumer's end. Then we don't have to artificially restrict expressive power in the package declaration itself.

roberth commented 11 months ago

Is this discussion in scope for NixOS/nix?

But all this is an issue for Nixpkgs, because that's where Nix-language-level metadata is currently lumped together with build configuration

Currently there's an implicit interface between Nix and Nixpkgs, and that's the topic of this issue. If I were to put this issue in Nixpkgs, I'd get the exact opposite response because there maintainers expect Nix to lead in such changes. Iirc if you ask Eelco he would say Nix defines the DSL and Nixpkgs just implements that, which would be consistent with my assumption about Nixpkgs maintainers.

Unless you create a repo nixpk for this issue - between nix and nixpkgs - I will keep collecting thoughts here. If you disagree, I kindly ask that you just ignore this issue. I have zero appetite for a meta-discussion that's going to be even less relevant than my notes, interspersed between here.

roberth commented 11 months ago

The recently renewed design discussion around lockfiles could offer a place for statically displaying outputs (including metadata) as well, so they don't require computation to determine on the consumer's end. Then we don't have to artificially restrict expressive power in the package declaration itself.

Wouldn't this create a need to update the lock file whenever the expressions change? I don't think using the lockfile as a cache is a good idea, and I'd be happy to explain that in a suitable issue/thread. IFD is one reason.

Another problem with this is a data model problem that I didn't illustrate at all. It's the problem of the linked comment. What if your update tooling needs to read meta while the package cannot be evaluated yet? Similarly, what about a package that can only be evaluated on a certain system? Or any system? Much of meta is not dependent on system whereas all instantiation is completely dependent on system. Nonetheless you have to pass a system (explicitly or implicitly by accessing an attr, if you're lucky to have one) in order to get meta. A lockfile based solution would still suffer from this problem.

Ericson2314 commented 11 months ago

Yeah sounds like a unit in RFC-140 speak is closer to a pair of a package function and a meta; much/all of the meta should not depend on parameters of the package function.

I do agree that even if some things are Nixpkgs only, and we have a Nixpkgs CLI to deal with those, we do need something else for Nix<--->Nixpkgs and that is what this issue deals with.

nixos-discourse commented 10 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flakey-profile-declarative-nix-profiles-as-flakes/35163/3

nixos-discourse commented 9 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-12-08-nix-team-meeting-minutes-110/36721/1

nixos-discourse commented 7 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-02-26-nix-team-meeting-minutes-128/40496/1