NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1
12.72k stars 1.51k forks source link

Overload `/` to mean path append; alternative to `path + "/" + "foo"` footgun #7301

Open roberth opened 1 year ago

roberth commented 1 year ago

Is your feature request related to a problem? Please describe.

In the Nix language, + associates in such a way that path + "/" + "foo" evaluates to path + "foo" for any path value path.

6530 pushes people towards this syntax, so perhaps it's worth considering to give them a concise alternative that doesn't lead to almost inexplicable malformed paths every now and then.

Describe the solution you'd like

Instead of

nix-repl> "a" / "b"
error: value is a string while a float was expected

nix-repl> ./. / "b" 
error: value is a string while a float was expected

evaluate to

nix-repl> "a" / "b"
"a/b"

nix-repl> ./. / "b" 
/home/user/src/nixpkgs/b    # assuming such a ./.

Describe alternatives you've considered

Add builtin fetchers to the string context, and add fixed-size holes to strings, so that "${fetchTree foo}/bar" does not immediately fetch the tree. Holes are a lot to ask though. Perhaps better phrased as a "rope" of strings and lazy tree thunks. A rope, or lazily concatenated string representation might not be a bad idea?

Additional context

infinisil commented 1 year ago

This is a bit problematic because foo / bar (this is a path concatenation of two variables) would then have a different meaning from foo/bar (this is a path created from two string components)

ncfavier commented 1 year ago

I wonder if something like ${foo}/${bar} could work?

infinisil commented 1 year ago

I should also point out the recently merged https://github.com/NixOS/nix/pull/5066, which actually supports almost this:

nix-repl> foo = "foo"

nix-repl> bar = "bar"

nix-repl> /${foo}/${bar}
/foo/bar

I feel like support for ${foo}/${bar} could also be implemented

roberth commented 11 months ago

We could at least do this in C++ actually. SourcePath's + operator freaks me out every now and then. / wouldn't.

@edolstra wdyt?

roberth commented 9 months ago

Discussed in meeting today

- `path / "string"` is a good idea (@Ericson2314, @edolstra, @roberth) - `"string" / "string"` probably good for consistency. - We want to use string for "abstract" paths such as those in NixOS configs - Slightly trickier `"1.0" / "2.0"` - Feels like a gimmick, but solves a real problem - @Ericson2314: let's talk to the tvix people as well
nixos-discourse commented 9 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-02-05-nix-team-meeting-minutes-121/39340/1

infinisil commented 9 months ago

See also the newish lib.path in Nixpkgs, which could be used for some inspiration: https://nixos.org/manual/nixpkgs/stable/#sec-functions-library-path

roberth commented 9 months ago

There could be a system to this:

I'll leave it at this, because Nixpkgs library design should probably be discussed over at nixpkgs instead.

Thanks for linking this @infinisil

bb010g commented 6 months ago

What are the benefits of introducing an overload for / compared to extending path interpolation? path / "string" becomes ${path}/string or ${path}/${"string"}; "string" / "string" becomes string/string or ${"string"}/${"string"}. The interpolation syntax more clearly reads as using paths to me. You also mention "1.0" / "2.0" above as tricky; 1.0/2.0, more confusing syntax in my opinion, already constructs a path in stable Nix, and ${"1.0"}/${"2.0"} is much harder to confuse for attempted numeric division.

roberth commented 5 months ago

What are the benefits of introducing an overload for / compared to extending path interpolation?

I'm not opposed to extending path interpolation, but I don't like that we can't tell beginners that ${} copies paths to the store anymore, as "${foo}/bar" copies, while ${foo}/bar does not. This is not a new problem (although ${} in paths is a fairly recent addition, fwiw).

"string" / "string" becomes string/string

I think you meant "string/string"; the latter is a path literal.

or ${"string"}/${"string"}.

This violates the property that a path literal produces a path value.

"1.0" / "2.0"

I don't think this is all that important. It's highly unlikely that someone would quote their numbers, so this is supposed to be representative of an expression a / b where the source of both a and b might be of the wrong type by accident. Divisions are rare, and divisions where both operands come from unreliable user input even more so. Even then, a mistake like this is easily prevented by using something like the module system.

It is fair to say that by allowing ${}/${} instead of allowing paths in /, we turn a bad value into an error, which is good, but I don't think this benefit is significant, because Nix is much more about path manipulation than numeric computation. Path operations can be confusing, especially for beginners, so providing a simpler alternative is more valuable than catching a very rare type error.

bb010g commented 5 months ago

@roberth Thanks for the clarification; I was trying to interpret your earlier meeting notes. It was not clear to me that a path append expression's result's type depends on its LHS's type.

To clarify, is this an accurate typing?

{-# LANGUAGE FunctionalDependencies #-}
class Div lhs rhs result | lhs -> result
  (/) :: lhs -> rhs -> result

instance Num a => Div a a a
instance Div String String String
-- explicitly not an instance:
-- instance Div String Path String
instance Div Path String Path
instance Div Path Path Path

The above presumes Path only includes absolute paths.

"1.0" / "2.0"

I don't think this is all that important.

I agree that this isn't important; I brought it up because your meeting notes did.

"string" / "string" becomes string/string

I think you meant "string/string"; the latter is a path literal.

or ${"string"}/${"string"}.

This violates the property that a path literal produces a path value.

These could be made to work if relative path values were introduced as a subset of path values, to complement the current absolute path values, but that'd break cases like toString foo/bar, which currently stringifies an absolute path value. In addition, you'd want some way (probably syntax) to anchor relative paths & make them absolute. A more realistic alternative would be to introduce relative path syntax, but I don't know what would look good for that.

Path operations can be confusing, especially for beginners, so providing a simpler alternative is more valuable than catching a very rare type error.

Agreed here. I think a better Nix would have a built-in URL type, handle path literals like ./foo as relative URLs, and offer easily accessible URL operations, but that's not the Nix we have. I think that quiet desire for a proper URL type can be seen in Nix's historical bare URL syntax, which parsed bare URLs into strings instead of rich URL values.

You can also see effective relative path values showing up again in lazy trees, where you have absolute virtual path values that are effectively relative paths paired with an absolute virtual path root. Certain operations can coerce those absolute virtual paths into absolute store paths.

Pure flakes in current Nix already work with non-store path values, where non-store paths can be used & manipulated as long as they stay in the Nix level (for the most part). Accidentally coercing one of these paths into a store path is often easier to miss or ignore because normally it only results in copying a file/tree from the flake you're working in, not potentially a full Nixpkgs source tree.

I think Nix needs to teach a consistent, understandable set of rules for both when paths are coerced to be store-relative and what file/tree is the store path root (i.e. the relevant child of /nix/store/) whenever that happens.

If we can guarantee that path appends via / never directly coerce paths to store paths, that's a good start. I don't think we can guarantee that path appends via / never effect the store path root of future coercions into store paths, as AFAIK lazy-trees doesn't over much control there.

I believe that path appends via path interpolations would also never directly coerce paths to store paths.

A reason I brought up extending path interpolation, besides avoiding confusion with numeric division, is that currently expressions like "${nixpkgs}/lib" may be used to reference paths under flakes. With current lazy-trees, AFAIK, that would coerce the nixpkgs / "lib" path string into a store-relative path, with nixpkgs / "lib" as the minimal store root. Any paths occurring in string values that are forced may be coerced to store paths. If we teach that both path appends via / and path interpolations never coerce their arguments or results to store paths, then a user can simply strip the double quotes from this expression to be safe. Here, nixpkgs / "lib" is about as compact, but the extended path interpolation is more appealing for an original expression like "${nixpkgs}/pkgs/by-name/${shard}/${name}/package.nix", which would become nixpkgs / "pkgs/by-name/${shard}/${name}/package.nix" using path interpolation. As long as nixpkgs is always a path value, I'd prefer to use path interpolation, because syntax highlighting can make obvious that those are (safe) path interpolations. If string interpolation is used, even on the RHS of a path append, I have to do a double take on a path-y string to make sure it's a relative path that won't coerce to a store path.

Alternatively, lazy-trees changes its design to make coercing to store paths a much more intentional operation, so Nix programmers don't have to worry as much about strings containing paths.

See also https://github.com/NixOS/nix/issues/7335, which has this quote:

Eelco has said that the Nix language isn't intended for path processing. So path values are meant to be used directly by users.

To add on to that, I think that as long as NixOS continues to provide modules that can't have their functionality fully disabled as Nix source files, like the majority of nixpkgs/nixos/profiles/, NixOS will continue to encourage the pattern imports = [ (modulesPath / "profiles/qemu-guest.nix") ];. NixOS currently ships this as modulesPath + "profiles/qemu-guest.nix", which without lazy-trees at least is equivalent to "${modulesPath}/profiles/qemu-guest.nix". We can teach beginners to switch to modulesPath / "profiles/qemu-guest.nix" instead, but then we also have to emphasize that ${modulesPath}/profiles/qemu-guest.nix is not equivalent (and is possibly invalid if modulesPath is not a path object), even if it looks like it should be equivalent. No matter what nixos-generate-config generates for hardware-config.nix, a beginner should reasonably be able to know what using all the above forms entail while looking at a screen or two of docs.

Having written all that, I'd like path interpolation to be semantically equivalent to using / for path append, I'd like to deprecate + on non-store path objects, and I'd like to introduce builtins.toStorePath. We already have ++ as a separate operator from +. If we can teach that foo/bar is equivalent to ./foo/bar, that foo/bar/${baz} is equivalent to ./foo/bar / baz, that foo/${bar}/baz is equivalent to ./foo / bar / "baz", and that ${foo}/bar is equivalent to foo / "bar", I think that'd work out well, and would almost mirror how you can (almost?) translate string interpolation into a bunch of + operations. I think this would work well with foo/${bar} or foo / bar erroring with a type error if bar is a path object. Absolute paths shouldn't be appended to absolute paths. By deprecating & warning on + with a non-store path object, accidental coercions to store paths are avoided. When non-store paths do need to be appended to or interpolated into strings, builtins.toStorePath can be used to explicitly ensure paths are in the store, or otherwise builtins.toString can be explicitly used.

This proposal means path literals starting with path interpolation may not result in path values, but I think that's reasonable for the improved consistency. Path operations will be more likely to "do the right thing", even in the face of lazy trees.

roberth commented 5 months ago

is this an accurate typing?

Not sure about this one:

instance Div Path Path Path

The rhs path is only meaningful if it's an absolute path (ie not a virtual path into e.g. a git repo), and even then, constructing absolute paths to things that won't exist is not a great habit, perhaps subjectively.

I believe that path appends via path interpolations would also never directly coerce paths to store paths.

Sounds about right, and my attempts with path values and derivation outputs (for good measure) failed. :+1:

My point was that they're easily confused with string interpolations though, which do copy.

nixpkgs / "pkgs/by-name/${shard}/${name}/package.nix"

I have to do a double take on a path-y string to make sure it's a relative path that won't coerce to a store path.

I think / should fail if the rhs has a string context. Otherwise it's too easy to do an accidental IFD anyway.

Currently the error is only reported after adding to the store, which is not optimal, but sufficient for you not to have doubts about existing code you're reading.

A lazy path strings solution (lazy-trees or better) could postpone the copying operation until after the error is detected.

Eelco has said that the Nix language isn't intended for path processing.

I was surprised to read this, because the context wasn't presented clearly. This is about implementing operations on paths and particularly path strings within the Nix language. IIRC Eelco's point was that Nix is a DSL and we shouldn't need to implement libraries for such basic functionality in it.

NixOS currently ships this as modulesPath + "profiles/qemu-guest.nix", which without lazy-trees at least is equivalent to "${modulesPath}/profiles/qemu-guest.nix".

This is an unsolved problem. Lazy trees will leak non-deterministic virtual path names in this case, which in my view is the primary reason why we can't just adopt that solution. Possible solutions I'm aware of

Having written all that

I think I agree with your conclusions, but I'm unsure about the value to users of toStorePath. I guess it wouldn't hurt to have it for completeness, but isn't its use case already covered by builtins.path?

bb010g commented 5 months ago
instance Div Path Path Path

The rhs path is only meaningful if it's an absolute path (ie not a virtual path into e.g. a git repo), and even then, constructing absolute paths to things that won't exist is not a great habit, perhaps subjectively.

Agreed; this case shouldn't be legal.

I believe that path appends via path interpolations would also never directly coerce paths to store paths.

Sounds about right, and my attempts with path values and derivation outputs (for good measure) failed. 👍

My point was that they're easily confused with string interpolations though, which do copy.

If string interpolations & string concatenations never copy, by making them eagerly fail when a non-store path or non-store string context is used, the confusion disappears.

Eelco has said that the Nix language isn't intended for path processing.

I was surprised to read this, because the context wasn't presented clearly. This is about implementing operations on paths and particularly path strings within the Nix language. IIRC Eelco's point was that Nix is a DSL and we shouldn't need to implement libraries for such basic functionality in it.

My bad there; thank you for clarifying. I strongly agree that your regular set of path operations should be sane at the Nix level, without Nix-coded libraries.

  • make the module system perform the transformation instead. This should be possible to do entirely in Nix. (1. split into virtual root path and relative path string without context, 2. maintain a mapping from virtual root path to numbered placeholders, 3. insert those placeholders into the module paths, so that they're just strings)

    • O(n) unless something is added to the language (nondeterministicComparePathRoot function? nondeterministicPathRootIdString? A new "path map" type?)
    • not transparent
    • slows down the module system if it's complicated on the expression side

I feel like this is a footgun that'll cause problems the moment one of those paths leaves a module system evaluation and starts being manipulated by some consuming flake that isn't using flake-parts.

  • allow path values as attribute names

    • only depends on lazy path values
    • order is non-deterministic, unless we require a source fingerprint (or disallow a bunch functions??)

We're in the non-deterministic (or effectively random) boat already with allowing the sorting of lists with lazy path values inside them.

  • allow strings (with context) in attribute names

    • requires a proper lazy path strings implementation
    • same concern about order nondeterminism

Given that we already allow strings (with context) in lists that you can sort, that's probably the most consistent.

I think I agree with your conclusions, but I'm unsure about the value to users of toStorePath. I guess it wouldn't hurt to have it for completeness, but isn't its use case already covered by builtins.path?

If I have a forced myVirtualPath and force builtins.path { path = myVirtualPath; }, I'd expect that result to be the same as myVirtualPath, because I used builtins.path as the identity function specialized to path values. A theoretical builtins.path { path = myVirtualPath; strict = true; } would cover builtins.toStorePath's use case, where strict means we're strict about "adding" the path to the store. We could then say builtins.toStorePath = path: builtins.path { inherit path; strict = true; };, or builtins.strictPath = path: builtins.path { inherit path; strict = true; }; if we like the "strict" naming.

I want a function that solely converts an existing path into an actual store-relative path, because I want people to use it, especially if we emit deprecation warnings constantly when people don't use either builtins.strictPath or builtins.path with strict = true;. strict = true; doesn't have to eagerly write files into the store, but it does have to eagerly produce the store path those files would be located at, similar to the unbuilt store paths nix eval --raw can return for installables. I don't think people will tolerate having to write builtins.map (path: builtins.path { inherit path; strict = true; }) ps well every time they want to use some processed paths. builtins.map builtins.strictPath ps feels much better, and is something you can more easily catch when skimming.

It's a similar argument to allowing ${fooData}/share to evaluate to the same string as fooData / "share". You can see "${fooData}/share" used instead of fooData + "/share" all over NixOS today, even though they're the same number of characters. Interpolations are intuitive syntax. We want doing the correct thing to feel good.