tweag / nickel

Better configuration for less
https://nickel-lang.org/
MIT License
2.44k stars, 93 forks

Package management #1585

Open yannham opened 1 year ago

yannham commented 1 year ago

Nickel package management

A package manager (PM) is an indispensable part of the tooling of any programming language out there. By PM, we mean a way to distribute and depend on external Nickel code (aka libraries).

This need is already present in e.g. nickel-lang/organist where they need to distribute a common library (the current solution is a mix of relying on a Nix flake and a home-made lock mechanism, cf https://github.com/nickel-lang/organist/pull/63). The JSON schema to Nickel converter needs to distribute a base library as well, and currently has to inline this library in each generated contract to avoid bothering users with an additional setup (https://github.com/nickel-lang/json-schema-to-nickel/pull/29).

We also anticipate that many users will write or generate comprehensive contract suites for particular use-cases (Kubernetes, GitHub actions, etc.). Those contract suites need to be distributed to other users to be useful: once again, this requires a package manager of some sort.

Expected outcome

A working package manager is a basic, indispensable tool for Nickel adoption to continue growing.

### Tasks
- [ ] https://github.com/tweag/nickel/issues/1643
- [ ] https://github.com/tweag/nickel/issues/1644
- [ ] https://github.com/tweag/nickel/issues/1645

Technical aspects

There are several aspects to such a package manager:

Proposal

While not set in stone, the current plan of the Nickel team is to avoid re-implementing full package management from scratch. Each new language tends to rebuild its own, even though many of the features are language-agnostic (lockfiles, version selection, dependency resolution, project description, etc.). This holds especially for an interpreted language, which doesn't require the PM to be aware of a build system.

Our inclination is to follow the discussion of #329:

We propose to write small extensions to make Nickel understand the lockfile (or the equivalent notion) of several PMs, so that it can map names to local directories. All the package management part (installing, updating, etc.) would be handled by the external PM, although for the blessed one, we would provide a specific wrapper for it. Users would then simply write something like (the syntax is not part of the proposal, just for the sake of this example):

let pred_lib = import <json-schema-to-nickel:predicates> from nix in
# or
let pred_lib = import <json-schema-to-nickel:predicates> from npm in
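To illustrate the "map names to local directories" idea, here is a minimal sketch in Python. The lockfile schema, the store path, and the rule that a module name maps to `<module>.ncl` are all assumptions made up for this example; they are not the format of Nix, npm, or the eventual Nickel design.

```python
import json
from pathlib import PurePosixPath

# Hypothetical lockfile contents. The schema and the store path are
# assumptions for illustration, not any real package manager's format.
LOCKFILE_JSON = """
{
  "manager": "nix",
  "packages": {
    "json-schema-to-nickel": {"path": "/nix/store/abc123-json-schema-to-nickel"}
  }
}
"""


def load_package_map(lockfile_text):
    """Return a dict mapping package names to their local directories."""
    lock = json.loads(lockfile_text)
    return {
        name: PurePosixPath(entry["path"])
        for name, entry in lock["packages"].items()
    }


def resolve_import(package_map, spec):
    """Resolve a 'package:module' spec to a file under the package directory.

    Mapping the module name to '<module>.ncl' is an assumption of this sketch.
    """
    package, _, module = spec.partition(":")
    if package not in package_map:
        raise KeyError(f"package {package!r} is not in the lockfile")
    return package_map[package] / f"{module}.ncl"


package_map = load_package_map(LOCKFILE_JSON)
resolved = resolve_import(package_map, "json-schema-to-nickel:predicates")
print(resolved)
```

The point of the sketch is that the interpreter only needs the name-to-directory table; fetching and updating stay with the external PM.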

Mixed PMs

Things can get hairy if you need to depend on libraries from various PMs. It's probably not an issue (although it's a bit heavy) for a private project, but for libraries, it's more difficult. As a first step, we would require that a library distributed with PM XXX must ensure that all of its transitive dependencies use XXX as well.
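The single-PM requirement for libraries could be checked mechanically. The following is a rough sketch, assuming a hypothetical in-memory description of packages (each with a `pm` field and a list of `deps`); none of these names come from an existing tool.

```python
def mixed_pm_violations(packages, root):
    """Return the sorted names of transitive dependencies of `root`
    that are distributed with a different PM than `root` itself."""
    expected = packages[root]["pm"]
    seen = {root}
    stack = [root]
    violations = []
    while stack:
        name = stack.pop()
        for dep in packages[name]["deps"]:
            if dep in seen:
                continue  # already visited through another path
            seen.add(dep)
            stack.append(dep)
            if packages[dep]["pm"] != expected:
                violations.append(dep)
    return sorted(violations)


# Hypothetical package graph: `utils` comes from npm, the rest from Nix.
packages = {
    "my-lib": {"pm": "nix", "deps": ["contracts", "utils"]},
    "contracts": {"pm": "nix", "deps": ["utils"]},
    "utils": {"pm": "npm", "deps": []},
}

violations = mixed_pm_violations(packages, "my-lib")
print(violations)  # ['utils']
```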

Default PM

How to select the default, blessed PM? Here is a list of criteria:

Related:

aspiwack commented 1 year ago

An alternative that doesn't seem to be discussed here is to have a Nickel package repository, where packages are described in a sufficiently high-level format so that they can be translated to packages for a variety of package managers. What are the downsides to that approach?

yannham commented 1 year ago

It's an interesting middle ground. It would require a bit more design on our part (describing the high-level format and designing the "compilation" scheme from this format to each PM). It's possible that such a format is not even feasible (maybe mainstream package managers have different, incompatible version resolution policies given the same bounds? but I imagine we can always settle on a generic policy and encode it precisely when we "compile" the generic package to a specific PM). Funnily, while Nix needs xxx2nix tools, we would need nickel2xxx ones.

On the bright side, my main concern with the current approach is fragmentation for generic libraries that don't pertain to one specific ecosystem (I use cargo, but this generic-nickel-utils library is packaged with NPM). Having a central repo for those that can be used from mainstream PMs would solve the problem.

jneem commented 6 months ago

I had a stab at a prototype in #1903. As discussed in the original comment, it's in two parts.

The part that lives in nickel-lang-core contains just the package-to-name mapping and the syntax for importing from packages. I haven't yet addressed the question of describing a library and its exports -- the current prototype is just file-based: you write import "foo/bar.ncl"@my-package and it finds the file whose path is foo/bar.ncl from the root of my-package.

I was planning to use npm for the version resolution/package distribution part, but I got scared away by their ToS, which says that it's only for distributing things that are "compatible" with npm. So I wrote a small package fetcher and lockfile builder that only supports path and git dependencies.

In building this, I ran into a bunch of questions.

Interface questions

Version resolution question

What do we want to happen when multiple versions of a dependency are required (possibly coming from transitive dependencies)? There are at least three possibilities:

  1. Take all the requested versions.
  2. Merge "compatible" dependencies to a single version, and allow multiple incompatible versions.
  3. Insist that we find a single version that's compatible with everything.

I think alternative 3 is the most common (pip, npm, and yarn do it at least), but I also think many people would agree that it's annoying. Alternative 2 is what cargo does. I don't know anyone that does alternative 1; it would be super annoying in a nominally-typed language, but it might be feasible in nickel? It could lead to excessively duplicated dependencies, though.
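To make the three alternatives concrete, here is a small Python sketch. It is deliberately simplified: each dependent requests one exact version, and "compatible" is assumed to mean the caret-style rule (same leftmost non-zero component). A real resolver would pick from the set of published versions instead.

```python
def parse(version):
    """Turn '1.4.1' into the tuple (1, 4, 1)."""
    return tuple(int(part) for part in version.split("."))


def compat_key(version):
    """Caret-style compatibility class (an assumption of this sketch):
    versions are compatible when they share the leftmost non-zero
    component and everything before it."""
    major, minor, patch = version
    if major > 0:
        return (major,)
    if minor > 0:
        return (0, minor)
    return (0, 0, patch)


def take_all(requested):
    """Alternative 1: keep every distinct requested version."""
    return sorted(set(requested))


def merge_compatible(requested):
    """Alternative 2: one version (the highest) per compatibility class."""
    best = {}
    for version in requested:
        key = compat_key(version)
        if key not in best or version > best[key]:
            best[key] = version
    return sorted(best.values())


def single_version(requested):
    """Alternative 3: fail unless one version satisfies everybody."""
    merged = merge_compatible(requested)
    if len(merged) != 1:
        raise ValueError(f"incompatible versions requested: {merged}")
    return merged[0]


requested = [parse(v) for v in ["1.2.0", "1.4.1", "2.0.0", "1.2.0"]]
print(take_all(requested))          # [(1, 2, 0), (1, 4, 1), (2, 0, 0)]
print(merge_compatible(requested))  # [(1, 4, 1), (2, 0, 0)]
try:
    single_version(requested)
except ValueError as err:
    print("alternative 3 fails:", err)
```

With these inputs, alternative 1 keeps three copies, alternative 2 keeps two, and alternative 3 rejects the dependency set outright.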

These are not relevant for the current prototype (which doesn't do version resolution), but could have an impact on the choice of external package manager.

Package interfaces

Package-writers need some way to decide what is private and what is public, or else they'll have trouble making backwards-compatible updates. The current prototype (which allows importing arbitrary files from a package) is probably not the right behavior. I think I might change this to just bless lib.ncl as the entry point, as suggested above.
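The difference between the two behaviors can be sketched as follows. This is an illustration in Python, not the prototype's code: the function names and the `/pkgs/my-package` root are hypothetical, and the path-escape check is an addition of this sketch.

```python
from pathlib import PurePosixPath

ENTRY_POINT = "lib.ncl"  # the blessed entry point considered above


def resolve_any_file(package_root, relative_path):
    """File-based behavior: any file under the package root is importable,
    so every file is effectively part of the public interface."""
    rel = PurePosixPath(relative_path)
    # Guard added for this sketch: refuse paths that leave the package.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"path escapes the package root: {relative_path}")
    return PurePosixPath(package_root) / rel


def resolve_entry_point(package_root):
    """Entry-point behavior: only lib.ncl is reachable from outside, so
    other files stay private and can change between releases."""
    return PurePosixPath(package_root) / ENTRY_POINT


any_file = resolve_any_file("/pkgs/my-package", "foo/bar.ncl")
entry = resolve_entry_point("/pkgs/my-package")
print(any_file)  # /pkgs/my-package/foo/bar.ncl
print(entry)     # /pkgs/my-package/lib.ncl
```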

Package repository

If we don't find an external package manager to use, @ErinvanderVeen pointed out that clean has a reasonably lightweight approach: there's a central registry that stores an index but no packages. Each package must have its own gitlab repo, and the central repository just indexes them.
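An index-only registry of this kind can be very small. Here is a minimal sketch in Python, assuming a hypothetical index layout (package name, repository URL, and a version-to-revision table); the URL and revision identifiers are placeholders, not real packages or the actual format used by clean.

```python
# Hypothetical index: it stores where each package lives and which
# revision corresponds to each version, but no package contents.
INDEX = {
    "example-contracts": {
        "repo": "https://gitlab.example.org/someone/example-contracts.git",
        "versions": {
            "1.0.0": "rev-aaaa",  # placeholder revision identifiers
            "1.1.0": "rev-bbbb",
        },
    },
}


def locate(index, name, version):
    """Translate (name, version) into a git dependency that a fetcher
    supporting git dependencies could download."""
    entry = index[name]
    if version not in entry["versions"]:
        raise KeyError(f"{name} has no version {version} in the index")
    return {"git": entry["repo"], "rev": entry["versions"][version]}


dependency = locate(INDEX, "example-contracts", "1.1.0")
print(dependency)
```

The registry only answers "where is it and at which revision"; downloading is left to whatever fetches git dependencies.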

aspiwack commented 6 months ago

> If we don't find an external package manager to use, @ErinvanderVeen pointed out that clean has a reasonably lightweight approach: there's a central registry that stores an index but no packages. Each package must have its own gitlab repo, and the central repository just indexes them.

So does OCaml: https://github.com/ocaml/opam-repository

And, for that matter, so does Nixpkgs (though there's a bit more information there).

But as soon as you go there, you're asking questions about which package versions to make available, where to collect them etc…

I guess, though, that the answer can be: the latest until otherwise specified, and the lockfile gives you enough information to reconstruct the place. Though if you have many dependencies it's a bit of a hell to manage. In which case you probably want something more like Nixpkgs/Stackage, where you specify what I call a slice: a named reference to a single version of all the packages. And then you only need to update that one reference.
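A slice in this sense could look like the following Python sketch. The slice names, package names, and versions are invented for illustration, and the data layout is an assumption rather than how Nixpkgs or Stackage store their snapshots.

```python
# Hypothetical slices: each name pins one version of every package.
SLICES = {
    "2024-01": {"contracts": "1.0.0", "utils": "0.3.1"},
    "2024-06": {"contracts": "1.1.0", "utils": "0.4.0"},
}


def resolve_with_slice(slices, slice_name, dependencies):
    """Pin every dependency to the version recorded in the chosen slice."""
    pinned = slices[slice_name]
    missing = [dep for dep in dependencies if dep not in pinned]
    if missing:
        raise KeyError(f"not in slice {slice_name}: {missing}")
    return {dep: pinned[dep] for dep in dependencies}


# A project lists dependency names only; the slice is the single
# reference to update when upgrading.
before = resolve_with_slice(SLICES, "2024-01", ["contracts", "utils"])
after = resolve_with_slice(SLICES, "2024-06", ["contracts", "utils"])
print(before)  # {'contracts': '1.0.0', 'utils': '0.3.1'}
print(after)   # {'contracts': '1.1.0', 'utils': '0.4.0'}
```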

But if you do “latest version unless otherwise specified” you end up with difficult questions about transitive dependencies. Mmm :thinking: . Maybe the MVP doesn't need transitive dependency management? (though it's honestly not the best experience, says my experience with LaTeX on Nixpkgs).

jneem commented 6 months ago

> But as soon as you go there, you're asking questions about which package versions to make available, where to collect them etc…

I was thinking:

Regarding version resolution, it might not be too painful to do it properly, thanks to pubgrub, which is used by Python's uv package manager and possibly by future versions of cargo. The current prototype already allows transitive dependencies to refer to different versions of the same package.

aspiwack commented 6 months ago

> or were you thinking about version resolution among the versions in the index?)

Yes, I was thinking of version resolution. It's good that a library for a solver exists. But really, this is something you want to put real thought into. The problem with version resolution is not finding a solution to a bunch of constraints: it's maintaining these constraints, updating the dependencies, and other such UX concerns.