NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Built-in YAML support #4910

Open mweinelt opened 3 years ago

mweinelt commented 3 years ago

Is your feature request related to a problem? Please describe.

Currently, nixpkgs often renders YAML as JSON, and pkgs.formats.yaml justifies this with the comment

# YAML has been a strict superset of JSON since 1.2

which glosses over the fact that JSON is only a subset of YAML.

Specifically, we have no proper way to denote built-in and user-defined types, which are written as unquoted strings prefixed with two or one exclamation marks, respectively.

Describe the solution you'd like

A more complete mapping from Nix to YAML.

Describe alternatives you've considered

Rendering an attrset to JSON, then using remarshal's json2yaml to convert it to YAML, and then applying a regular expression to unquote strings that start with an exclamation mark.

Home Assistant uses such tagged values to substitute secrets and do other includes at runtime.

{ pkgs, ... }:
let
  configFile = pkgs.runCommand "configuration.yaml" { preferLocalBuild = true; } ''
    ${pkgs.remarshal}/bin/json2yaml -i ${configJSON} -o $out
    # Hack to support custom yaml objects,
    # i.e. secrets: https://www.home-assistant.io/docs/configuration/secrets/
    sed -i -e "s/'\!\([a-z_]\+\) \(.*\)'/\!\1 \2/;s/^\!\!/\!/;" $out
  '';
in
{}
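To make the underlying problem concrete, here is a minimal sketch of what happens before the sed step. The attribute name api_key is illustrative and not taken from the module above.

```nix
# A YAML tag written inside a Nix string is just string data to builtins.toJSON.
builtins.toJSON { api_key = "!secret api_key"; }
# evaluates to the JSON text {"api_key":"!secret api_key"}
```

Because the tag ends up inside a JSON string, a JSON-to-YAML converter has to emit it as a quoted string scalar rather than as a tag, which is why the workaround above strips the quotes afterwards.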

Additional context

https://en.wikipedia.org/wiki/YAML#Advanced_components

roberth commented 3 years ago

You would have to represent these advanced components in the Nix language somehow; perhaps:

{ type = "yaml-advanced-component"; value = "..."; }

With such a representation, nothing should stop you from converting that to proper YAML in a derivation.
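As a rough sketch of that idea, the snippet below renders a flat attrset to YAML in pure Nix. The constructor name tagged, the type string, and the extra tag attribute are assumptions made for illustration; they are not an existing nixpkgs or Nix interface, and nested structures are deliberately not handled.

```nix
let
  # Hypothetical constructor; the attribute names are illustrative only.
  tagged = tag: value: { type = "yaml-tagged-scalar"; inherit tag value; };

  isTagged = v: builtins.isAttrs v && (v.type or null) == "yaml-tagged-scalar";

  # Render one value: tagged values become `!tag <value>`, everything else
  # is emitted as JSON, which YAML 1.2 accepts as flow syntax.
  renderScalar = v:
    if isTagged v
    then "!${v.tag} ${builtins.toJSON v.value}"
    else builtins.toJSON v;

  # Render a flat attrset as a YAML block mapping, one key per line.
  renderFlat = attrs:
    builtins.concatStringsSep "\n"
      (map (k: "${builtins.toJSON k}: ${renderScalar attrs.${k}}")
        (builtins.attrNames attrs)) + "\n";
in
renderFlat {
  name = "Home";
  api_key = tagged "secret" "api_key";
}
# evaluates to a string containing these two lines:
#   "api_key": !secret "api_key"
#   "name": "Home"
```

A real implementation would need to recurse into lists and nested attrsets and could equally run inside a derivation, as suggested above.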

There's some concern about keeping Nix itself easy to bootstrap, so adding a yaml library dependency comes at a cost.

For TOML we only have a parsing function, so we don't need IFD to read those files into Nix values. The same might apply for YAML, but that's not the use case you're describing.
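For comparison, this is roughly what the existing TOML parsing function looks like in use. The TOML content is made up for the example.

```nix
# builtins.fromTOML parses TOML text during evaluation, so no derivation
# has to be built (and no IFD is involved) to obtain the Nix value.
builtins.fromTOML ''
  [package]
  name = "demo"
  version = "0.1.0"
''
# evaluates to { package = { name = "demo"; version = "0.1.0"; }; }
```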

stale[bot] commented 2 years ago

I marked this as stale due to inactivity.

mweinelt commented 2 years ago

I still care about this feature.

DavHau commented 2 years ago

I would be very happy to have a builtins.fromYAML. YAML is commonly used in other package ecosystems to express package metadata. This metadata could be converted to derivations via nix without IFD or any extra steps if nix had a parsing function. To give some examples:

ners commented 2 years ago

I'm happy to take ownership of this. I'll get started on Wednesday next week.

DavHau commented 2 years ago

What is the status of this, @ners? We just had another use case pop up at https://github.com/nix-community/dream2nix/issues/234

roberth commented 2 years ago

Adding a YAML parser effectively makes that specific YAML parser behavior part of the Nix language, because the Nix language should be reproducible.

Parsing YAML through IFD is actually a good thing for reproducibility, because it pins the implementation of the YAML parser to a specific version, preventing any repro bugs that will be caused by fixes in the YAML parser, switching to a different YAML library, YAML major version upgrades, etc.
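A minimal sketch of what parsing YAML through IFD can look like follows. The input path ./pubspec.lock is hypothetical, and the sketch assumes that remarshal's yaml2json accepts the same -i/-o flags as the json2yaml call earlier in this thread. The parser is only pinned to the extent that nixpkgs itself is pinned, which `<nixpkgs>` here is not.

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  # Hypothetical input file; any YAML document would do.
  yamlFile = ./pubspec.lock;

  # Derivation that converts the YAML file to JSON at build time.
  # The YAML parser is whatever this revision of nixpkgs ships in remarshal.
  jsonFile = pkgs.runCommand "pubspec.lock.json" { preferLocalBuild = true; } ''
    ${pkgs.remarshal}/bin/yaml2json -i ${yamlFile} -o $out
  '';
in
# Reading a build output during evaluation is what makes this IFD:
# the evaluator must build jsonFile before it can continue.
builtins.fromJSON (builtins.readFile jsonFile)
```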

DavHau commented 2 years ago

Another approach to solve the logic pinning issue could be: https://github.com/NixOS/nix/issues/1491

I think IFD is not a good option for parsing. See https://github.com/NixOS/nix/issues/1491#issuecomment-318440757

Aside from the issues mentioned there, the problem with IFD is that it significantly degrades UX. For example, if the package name is part of the YAML, and therefore read through IFD, simply listing your packages becomes an expensive operation. If a nix expression depends on IFD, it also loses the ability to be evaluated on arbitrary systems, which destroys a lot of the value of nix. Therefore nix flake show will crash by default if IFD is used.

roberth commented 2 years ago

👀 I wouldn't think of Nix as a project that prefers to sacrifice correctness and reliability for a little performance improvement.

For example, if the package name is part of the yaml, and therefore read through IFD, simply listing your packages will become an expensive operation.

Not taking the package name from IFD doesn't seem too hard. Of course the outputs will rely on it, but as long as the set of output names is also known statically, only the output attributes of the package need to be strict in the IFD. Assuming that the other attributes aren't taken from the IFD either, the package attrset can be returned without IFD, and all non-output attrs can be evaluated without IFD. Finally there's RFC92 outputOf which can make even the output attrs free of IFD.

If a nix expression depends on IFD, it also loses the ability to be evaluated on arbitrary systems, which destroys a lot of the value of nix.

It does complicate things a bit. Usually this is not a problem at all. When it is, remote builds can be used to solve it. If we don't consider that to be sufficient, we could add explicit cross-evaluation support to Nixpkgs. Similar to buildPlatform, we'd have evalPlatform (defaulting to buildPlatform), which produces pkgs.evalPackages, which is a Nixpkgs non-cross package set for evalPlatform.

Note that this is only for building. Listing packages and package metadata can be done without IFD.

roberth commented 2 years ago

I suppose a middle ground is to require the builtins.fromYAML calls to specify an implementation name and version number instead of a proper pin, so that future versions can be bug-for-bug compatible. (Sounds bad, right?, but that's exactly what we need)

DavHau commented 2 years ago

My interest comes from building compatibility layers between nix and existing package managers. Reading other serialization formats like JSON, TOML, and YAML is crucial for this, as other package managers usually store their data in these formats. YAML not being supported is a pain point.

I have tried IFD, and in my experience it complicates things a lot.

Sure, one can try to avoid reading the package name via IFD and instead use some regex. Doing the same with package metadata becomes a lot harder already.

Also, requiring users to set up remote builders in order to evaluate nix expressions is, I think, not the kind of UX we should aim for.

we could add explicit cross-evaluation support to Nixpkgs. Similar to buildPlatform, we'd have evalPlatform (defaulting to buildPlatform), which produces pkgs.evalPackages, which is a Nixpkgs non-cross package set for evalPlatform.

Sounds interesting. Maybe I misunderstand, but does this mean that there would be an evalPlatform parameter to my nix expression, with the expectation that the value of evalPlatform doesn't impact the evaluation result? This seems to be based on the assumption that the same tools executed on different evalPlatforms yield the same result. For a YAML parser this might be quite likely, but not for other tools. The mach-nix extractor, for example, is an IFD-based function that extracts metadata from a setup.py file using setuptools. Executing it on different platforms will lead to a different set of dependencies. Therefore, integrating a concept of evalPlatform sounds potentially dangerous and would break reproducible evaluation.

I suppose a middle ground is to require the builtins.fromYAML calls to specify an implementation name and version number instead of a proper pin, so that future versions can be bug-for-bug compatible. (Sounds bad, right?, but that's exactly what we need)

I like that idea. There might be an overhead of maintaining more than one implementation at the same time, but old versions could be deprecated after a while.

roberth commented 2 years ago

YAML is crucial

I don't want to diminish the importance of the ability to process YAML documents using the evaluator. I want us to make an informed decision and hopefully not sacrifice other important features like reproducibility.

old versions could be deprecated after a while.

Being able to build age-old packages on a current Nix installation is not something I think we should sacrifice after all these years.

evalPlatforms [...] potentially dangerous

Just like with cross, the builder is responsible for creating an output that is appropriate for hostPlatform. So yes, just like you can have package expressions that don't work for cross, you could have IFD that only works for same-platform evaluation.

No more dangerous than cross. I don't know. Maybe evalPlatform is over-engineering it, because most users will only eval for their own platform anyway. Evaluating for another platform without the intent to build for it is a little odd. Not completely useless, but not a necessity.

I have tried IFD, and in my experience it complicates things a lot.

The RFC 92 implementation has not been completed yet, so this may be a premature judgement.

Sure, one can try to avoid reading the package name via IFD and instead use some regex. Doing the same with package metadata becomes a lot harder already.

I agree regexes are problematic. What else do you need for nix flake show, besides the name of the package? Most of the actual meta metadata should be specified in a Nix expression anyway. Any computation that is not required for the package name and list of output names can be deferred with outputOf.

DavHau commented 2 years ago

Most of the actual meta metadata should be specified in a Nix expression anyway.

Some automatic nix packaging approaches benefit from not having to write (or generate) nix expressions. This on-the-fly translation increases eval complexity and might not be the first choice for repos like nixpkgs, but it allows using nix on existing software projects without intermediary steps. That is valuable, for example, for side-by-side usage with other package managers, where two sets of package expressions would otherwise have to be maintained. It also allows using nix as a standard interface to inspect arbitrary packages of other ecosystems.

What else do you need for nix flake show, besides the name of the package?

nix flake show only requires the package name AFAICT. There are other potential use cases, though, like inspecting the package license, homepage, dependencies, etc. These things can be of interest even without the intention to build the package. Alternatively, this information could be provided in other ways, too. One could provide a separate tool for inspection. It probably just becomes less integrated with native nix tooling and less reusable inside other nix expressions. For everything else, I agree, it can be deferred via IFD/RFC 92.

wmertens commented 2 years ago

Actually, @arcnmx implemented a subset of YAML here: https://github.com/arcnmx/nixexprs/blob/f90f7c0a758f2142a448b9e59cc5d6f768b9275b/lib/from-yaml.nix

Maybe it will prove sufficient to parse those yaml lock files.

domenkozar commented 1 year ago

If there's one thing we learned from @grahamc's talk at NixCon, it would be to implement builtins.fromYAML to bridge the gap between what the industry uses and what can be fed into Nix.

NaN-git commented 1 year ago

I suppose a middle ground is to require the builtins.fromYAML calls to specify an implementation name and version number instead of a proper pin, so that future versions can be bug-for-bug compatible. (Sounds bad, right?, but that's exactly what we need)

I would be happy if this could be discussed in #7340. At the moment I have just added builtins.fromYAML without any versioning.

FlafyDev commented 8 months ago

This will be useful for Dart / Flutter packaging in Nixpkgs. Currently we either use IFD or convert the pubspec.lock file from YAML to JSON and use lib.importJSON.
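A minimal sketch of the second workaround, assuming the lock file has already been converted to JSON by some external step and checked in as pubspec.lock.json (a hypothetical file name). The field names follow the usual pubspec.lock layout and are likewise an assumption here.

```nix
{ lib }:
let
  # Hypothetical file: pubspec.lock converted to JSON ahead of time and
  # stored next to the package expression.
  lock = lib.importJSON ./pubspec.lock.json;
in
# Assumes a top-level `packages` map whose entries carry a `version` field;
# the result maps each package name to its locked version.
lib.mapAttrs (_name: pkg: pkg.version) lock.packages
```

This avoids IFD, at the cost of keeping a converted copy of the lock file in sync by hand.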