Native module system - Githubissues

thufschmitt commented 11 months ago

Is your feature request related to a problem? Please describe.

The merging system of Nickel is awesome. It doesn't natively solve the exact use-case of NixOS modules because there's one thing that's missing: closed record contracts (only accepting the fields that are defined) that can be extended through merging, and that are only resolved at the very end (I remember some old discussion with @yannham where the system worked a bit like that, but the semantics changed in the meantime – globally for the better, and it doesn't anymore).

An example of what can be done with NixOS modules but doesn't contract-check in Nickel atm:

({
  bar,
  foo = bar + 1
} | { foo | Number }) &
({
  bar = 2,
} | { bar | Number })

This is, however, incredibly useful when writing complex configurations as it allows them to be extended in a transparent way.

Describe the solution you'd like

Some nice way of expressing the above. I'm not sure what would be a good solution, I left some suggestions below

Describe alternatives you've considered

Encoded module system. I have a fun PoC here. That kinda works, but with the costs of an encoding (in particular, it can't work at all with the lsp, which makes it very sad).
Don't use closed contracts, but blindly accept everything in the records. Something like
```
{
  bar,
  foo | Number = bar + 1
} &
{
  bar | Number = 2,
}
```
That works and is simple, but since it relies on open record fields it will silently accept a mistyped field name.
Use open contracts like above, but find a way to filter out invalid fields after the fact. Maybe with some builtin that would tell which fields don't have a contract attached to error out on them
A different merging operation that merges the contracts and delays the contract checking. Feels very ad-hoc and error-prone a priori, but who knows

Additional context Add any other context or screenshots about the feature request here.

yannham commented 11 months ago

I would like to make sure that I understand the semantics precisely. In your example, you already know that you need a bar field to be defined somewhere, that's why you have to write a bar without definition in the first operand of the merge. Thus, would the following be a solution?

({
  bar,
  foo = bar + 1
} | { bar, foo | Number }) &
({
  bar = 2,
} | { bar | Number })

Or even:

({
  bar,
  foo = bar + 1
} | { bar | Number, foo | Number }) &
({
  bar = 2,
})

It sounds like bar is part of the interface of the first module, and that you statically know that. You're filling an input. That's why I'm not sure to get the "can be extended through merging" part. By the way, are those contracts enforced at a record field, rather than on free-standing expression, by any chance, as in:

{
  some_module | {foo | Number} = {
    bar,
    foo = bar + 1,
  }
}
& {
  some_module | {bar | Number} = {
    bar = 2,
  }
}

In which case the situation is different, because then the contract annotation propagate to all subsequent merged values (while it doesn't for free-standing contract annotations, which don't have any notion or knowledge about merging).

Otherwise, you might want to extend your record freely with new values, but then I'm not sure to get what is the criterion to distinguish between a "good extension" and a "bad accidental extension" that would differentiate this idea from just having an open record contract.

thufschmitt commented 11 months ago

To give you something closer to the “X“ of my XY problem, here's (very simplified) what we have right now:

let builders = {
  Derivation = {
    name | String,
    buildCommand = {
      executable | String,
      args | Array String,
    },
    env | { _ : String } = {},
  },
  ShellBuilder =
    Derivation
    & {
      buildScript | String,
      buildCommand.executable = "bash",
      buildCommand.args = ["-c", buildScript],
    },
}
in
{
  name = "foo",
  buildScript = "echo $MESSAGE > $out",
  env.MESSAGE = "Hello",
}
& builders.ShellBuilder

This is pretty nice (though missing completion because of #1499), but the issue is that there's nothing to check that my final derivation will only use meaningful fields. I can write enw.MESSAGE = "Hello", and the evaluator will be happy with that (until the builder complains that $HELLO isn't defined). Likewise, I could (and I did the first time I wrote that snippet 😬) mistype buildCommand.args = ["-c", buildScript] as args = ["-c", buildScript] and get a confusing error about args not being defined whithin buildCommand rather than something telling me that I try to define a non-existent field.

With my module system prototype, I can rewrite the above as:

let modules = import "modules.ncl" in
let builders = {
  Derivation =
    {
      env = {}
    }
      | modules.Module
        {
          name | String,
          buildCommand
            | {
              executable | String,
              args | Array String,
            }
              env
            | { _ : String }
        },
  ShellBuilder =
    Derivation
    & {
      buildCommand.executable = "bash",
      buildCommand.args = ["-c", buildScript],
    }
      | modules.Module { buildScript | String },
}
in
modules.fix
  (
    {
      name = "foo",
      buildScript = "echo $MESSAGE > $out",
      env.MESSAGE = "Hello",
    }
    & builders.ShellBuilder
  )

Which is more verbose, but has the advantage of actually being able to complain if I type enw.MESSAGE instead of env.MESSAGE or args instead of buildCommand.args.

Maybe there's already a “native” way of doing what I want, but I couldn't figure out what it is.

(I realise my example was confusing. I generally don't care that much about reusing values defined somewhere else, but I care a lot about defining values declared somewhere else)

thufschmitt commented 11 months ago

I just realize that I could partially fix that by using a contract application rather than merging when defining my derivation at the end ({...} | builders.ShellBuilder). That would ~probably~ work, but only for the toplevel thing (I still wouldn't get any checking from for the definition of ShellBuilder), and it feels like abusing contracts a bit (ShellBuilder isn't really a contract).

EDIT: The probably doesn't have its place here. I did test it, and it does work.

yannham commented 11 months ago

It seems that what you want, in the end, is to apply a contract to the fragment:

{
  name = "foo",
  buildScript = "echo $MESSAGE > $out",
  env.MESSAGE = "Hello",
}

One solution, which I think is somehow what you do in NixOS modules, and what @vkleen does in tf-ncl, is to properly separate the interface of your module (whatever it means), that is the inputs, from the rest. One possible encoding, for example, could be to write:

ShellBuilder =
  inputs | Derivation,
  outputs = {...}
...
{
  inputs = {
    name = "foo",
    buildScript = "echo $MESSAGE > $out",
    env.MESSAGE = "Hello",
  }
} & ShellBuilder

I used inputs and outputs, but you could have different conventions. The outputs could lie directly at the root, or the inputs (but this wouldn't solve your issue then). Composition and namespacing (if we merge several modules, all their inputs are merged - is this fine, or should we namespace and separate those inputs?) isn't totally trivial though.

In any case, in this example, because you're now merging inputs fields, the contract Derivation will propagate. Another variation is to introduce an additional field just to leverage, once again, the capability of fields to propagate their attached contract through merging (which you can't in your example, because you're merging top-level records):

{
  derivation = {
    name = "foo",
    buildScript = "echo $MESSAGE > $out",
    env.MESSAGE = "Hello",
  }
} & {
  derivation | Derivation = ShellBuilder,
}

(or something like that - whatever, the idea is that adding an indirection level derivation allows to attach a contract to it somewhere, and that contract will propagate through merging).

In the inputs/outputs example, I intentionally left the outputs unfilled. I must say I'm not sure what are the outputs of ShellBuilder. It does look like it could actually be a contract, because it's merely fixing some values of the derivation. Also, you define ShellBuilder as Derivation & {...}, so I would expect that {...} and ShellBuilder to intuitively have the hypothetical type Contract. Otherwise, shouldn't it be defined as shell_builder | Derivation = {...}, if it's a proper record?

Maybe it could be an interesting exercise to see if you can define, for the different entities you have in nickel-nix (contracts, builders, derivations, environment definitions), a meaningful notion of contracts/inputs/outputs. I feel like the simple everything is a record + recursive merging in Nickel has a tendency to make us mix everything, leading maybe to some confusion.

Everything that I wrote are mostly ideas or leads to follow, but I must admit we haven't figured out the whole story around "modules" yet. For now, we've stuck to not having a first-class notion, trying to get away with simple records, contracts and merging. But if this isn't sufficient, this will end up the same as in Nix: people having different encoding and implementation of a module system on top of core Nickel, and that's exactly what we want to avoid. In Eelco's gist on what's wrong about the Nix language, he says

It's worth noting that the Nix language is intended as a DSL for package and configuration management, but it has no notions of "packages" or "configurations".

The same could be said about Nickel: it's a DSL for configuration management, but it doesn't have a first-class notion of configuration (which would be modules, I think). If we can get away with not having more abstractions in Nickel, because there's a simple, efficient and canonical way to encode them in the current model, that's a huge win. But if not, we might consider having a proper notion of modules, interfaces and composition/extensions.

Radvendii commented 11 months ago

The main thing I got from our discussion today is that you can accomplish something like what we want with the following (using "config" and "options" by analogy to NixOS options):

let module = {
  options = {},
  config | options = {},
} in
let base = module & {
  options = {
    foo | Number
  },
  config = {
    foo
  }
} in
let ext= base & {
  options = {
    bar | Number
  },
  config = {
    foo = bar + 1
  }
} in
ext & {
  bar = 2,
}

This gives you the right contract applications in the right places, but...

problems with this approach include:

The LSP currently won't give you completion on field entries (this could maybe be solved down the line, but it's tricky because it requires evaluation of a bunch of merges)
It requires a config sub-record, rather than being able to set options at the top level. (NixOS Modules also have this, but they hack in support for top-level declarations)

I was also thinking about it afterward and it strikes me that if we override foo in the example above, we shouldn't actually need to set bar, but with this mechanism we do. This would look like

mod2 & {
  foo = 5
}

An example in nix package terms would be that for stdenv.mkDerivation, if you set name you don't have to set pname and version. Maybe we can get this behaviour by setting bar | optional and foo | default = bar + 1?

yannham commented 11 months ago

Hmm, it's true than in general, we don't have an easy way of expressing "at least one of those codependent fields must be set". But this is probably a bit more general that's the current issue, you might want to model that even in a standalone record contracts that doesn't have anything to do with modules.

thufschmitt commented 11 months ago

We discussed this yesterday with @yannham, @vkleen, @jneem and @Radvendii .

Long-story short: There's no real first-class support, but we can have a cheap-ish encoding with something very similar to NixOS modules:

let Module = {
  Options = {},
  config | Options = {},
}
in

let builders = {
  Derivation = {
    Options = {
      name | String,
      buildCommand = {
        executable | String,
        args | Array String,
      },
      env | { _ : String } = {},
    }
  } | Module,
  ShellBuilder =
    Derivation
    & {
      Options.buildScript | String,
      config = {
        buildScript,
        buildCommand.executable = "bash",
        buildCommand.args = ["-c", buildScript],
      }
    },
}
in
(
  {
    config.name = "foo",
    config.buildScript = "echo foo > $out"
  }
  & builders.ShellBuilder
).config

That doesn't play nicely with the LSP at the moment (for the completion at least), but there's a way forward (since it doesn't have to rely on custom contracts, just plain merging).

Maybe there could be a more integrated way in the future, but it's not clear yet what it would look like.

piegamesde commented 10 months ago

Just skimmed the discussion, but I'm a bit worried that an attempt to 1:1 port NixOS modules to Nickel will also reproduce their weaknesses. Nickel should be an opportunity to take a step back and find out which module system equivalent works the best. For example, the fact that one does not always need a config top-level attribute is a hack IMO and I'd like if we could get rid of it.

Similarly, a lack of auto completion should not necessarily be a reason to change the language, if the LSP could be improved instead.

yannham commented 10 months ago

I wholeheartedly agree with @piegamesde. I believe the goal is really not to port the NixOS module system, but to figure out a module system - or to figure out if we even need one - that should make NixOS-style modules reasonably simple to write. But not necessarily a 1:1 mapping, as it would be acceptable to abandon some Nix idiosyncrasies if it makes for a simpler and more universal system.

I think it just happens that the need for modules is first arising in the work on Nickel-Nix, so it's a good driving use-case. But modules should be useful and adapted to Terraform, Kubernetes, OS configuration, etc. In general any Nickel language feature, while it can be motivated by the Nix use-case, should be universal, and not just a port of what exists in Nix (that's what we tried to do with symbolic strings, for example).

tweag / nickel

Native module system #1505