zopsicle / cp6t

Chloé’s Perl 6 tooling — superseded!
https://github.com/chloekek/raku-nix

Introduce sets #16

Open zopsicle opened 5 years ago

zopsicle commented 5 years ago

A nice way to guarantee stability and compatibility is immutable distribution sets.

zopsicle commented 5 years ago

Alright, I've drafted an idea:

Requirements

* The tooling will typically be driven by the CI job that builds distribution sets; see below.

Prior work

Nixpkgs (not to be confused with Nix, which is the tool that Nixpkgs is built on top of) is a similar system that is used for all sorts of packages, typically programs for workstations and servers.

Stackage is a similar system used for Haskell libraries.

cp6t already uses Nixpkgs to get the tools necessary to build Rakudo, such as gcc and glibc, and this has so far worked out great. Nix will be used for building distributions, since it satisfies the properties we want: tarballs with hashes, ability to override versions, global cache (e.g. https://cachix.org). In fact I already have some Nix code to build distributions: https://github.com/chloekek/cp6t/blob/master/perl6-on-nix/default.nix.

Sets

There will be the concept of sets. A set is a collection of distributions, each pinned at a specific version. There will be no more than one version of each distribution in the set. The set is also associated with a specific version of Rakudo.

Sets are append-only: once a set is created, new distributions can be added to it, but not deleted from it, and their versions cannot be changed.

New sets are created periodically, with newer versions of distributions.
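To make the append-only, one-pinned-version-per-distribution semantics concrete, here is a minimal sketch. It's Python purely for illustration (cp6t itself isn't written in Python), and every name in it is hypothetical:

```python
class DistributionSet:
    """A collection of distributions, each pinned at exactly one version,
    tied to a specific Rakudo version.

    Append-only: once a distribution is in the set, it can never be
    removed and its pinned version can never change."""

    def __init__(self, rakudo_version):
        self.rakudo_version = rakudo_version
        self._pins = {}  # distribution name -> pinned version

    def add(self, name, version):
        if name in self._pins:
            raise ValueError(
                f"{name} is already pinned at {self._pins[name]}; "
                "sets are append-only")
        self._pins[name] = version

    def version_of(self, name):
        return self._pins[name]

s = DistributionSet(rakudo_version="2019.03")
s.add("JSON::Fast", "0.10")
s.version_of("JSON::Fast")  # "0.10"
# s.add("JSON::Fast", "0.11") would raise ValueError
```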

Each set is stored in the repository as its own directory.

Development of a new set occurs in an unstable directory. Once the set is deemed comprehensive and all tests pass, it is matured into a numbered set and released.

cp6t-propose-set

The cp6t-propose-set program will replace the current cp6t-ecosystem program, and it will work as follows:

  1. Create the database, empty.
  2. Retrieve a list of archives on CPAN.
  3. Retrieve p6c and use git ls-remote to find the commit hashes, then construct tarball URLs.
  4. For each archive:
    1. Store the URL of the archive in the database.
    2. Use nix-prefetch-url to find the hash of the archive.
    3. Store the hash of the archive in the database.
    4. Read META6.json from the archive.
    5. Store metadata in the database.
    6. If there’s an earlier version of the distribution in the database, delete it.
  5. Sort the distributions topologically using dependency information.
  6. For each distribution:
    1. Generate a Nix expression.
    2. Build the Nix derivation.
    3. Upload the artifacts to the global cache.
    4. Run the tests.
    5. Store in the database that the distribution is successful.
  7. Generate a Nix expression for the entire set.

If at any point processing a distribution fails, store this in the database and continue with the next distribution.
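Step 5 is the most algorithmic part of the plan; it could be a plain Kahn's algorithm over the dependency names read from each META6.json (a sketch in Python for illustration; this is not the actual cp6t code):

```python
from collections import defaultdict, deque

def toposort(deps):
    """Order distributions so every distribution comes after its
    dependencies.  `deps` maps name -> set of dependency names, as
    would be read from each distribution's META6.json."""
    indegree = {name: 0 for name in deps}
    dependents = defaultdict(list)
    for name, ds in deps.items():
        for d in ds:
            if d in indegree:  # ignore dependencies outside the set
                indegree[name] += 1
                dependents[d].append(name)
    queue = deque(n for n, k in indegree.items() if k == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in dependents[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(deps):
        raise ValueError("dependency cycle detected")
    return order

toposort({"JSON::Fast": set(), "Cro": {"JSON::Fast"}})
# ["JSON::Fast", "Cro"]
```

Raising on a cycle (rather than building what we can) matches the keep-going failure handling above only partially; in practice the cycle members could instead be marked failed in the database.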

The program cp6t-propose-set can be invoked either at step 1 or at any later step, in which case it will use the existing database to continue.

CI will run cp6t-propose-set periodically and create pull requests on this repository. The pull request will include a detailed report of what went well and what went wrong.

Open questions

How to deal with p6c? It's a bit annoying since META.list contains so little information. But it's probably doable if we accept using the Git master of each package. :cold_sweat:

What format to use for the database? SQLite is annoying to use with Git. Perhaps JSON or a custom text-based format that can easily be merged.
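One merge-friendly option: one record per line, sorted by key, so two branches adding different distributions merge cleanly line-by-line in Git. A sketch (the tab-separated format and the function names are my own suggestion, nothing decided in this issue):

```python
def dump_db(records):
    """Serialize {url: hash} as sorted, tab-separated lines.
    Sorting keeps the file stable across runs, which keeps Git
    diffs and merges small."""
    return "".join(f"{url}\t{hash_}\n"
                   for url, hash_ in sorted(records.items()))

def load_db(text):
    """Parse the format written by dump_db back into a dict."""
    records = {}
    for line in text.splitlines():
        url, hash_ = line.split("\t")
        records[url] = hash_
    return records

db = {"https://example.org/Foo-1.0.tar.gz": "sha256-fakehash"}
assert load_db(dump_db(db)) == db
```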

Windows and macOS support? First of all Nix doesn't work on Windows. And I don't want to maintain these anyway since they're unfree operating systems and I don't want to install them since they cost money and contain malware. For now this will be Linux-only.

zopsicle commented 5 years ago

Development is going well regarding CPAN and p6c: I now have two subroutines with the same interface, one giving a seq of CPAN tarball URLs and one giving a seq of p6c tarball URLs.

The latter works by running git ls-remote on each Git repository mentioned in source-url in projects.json, taking the commit hash for HEAD, and then constructing a GitHub tarball URL. It doesn't yet work for Bitbucket and GitLab, but that is easy to add.
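That HEAD-to-tarball step could look roughly like this (a Python sketch, not the actual subroutines; it assumes GitHub's `/archive/<commit>.tar.gz` URL scheme and that the first column of `git ls-remote <url> HEAD` output is the commit hash):

```python
import re
import subprocess

def head_commit(repo_url):
    """Return the commit hash HEAD points to, via git ls-remote.
    Output lines look like '<hash>\tHEAD'; take the first column."""
    out = subprocess.run(
        ["git", "ls-remote", repo_url, "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()[0]

def github_tarball_url(repo_url, commit):
    """Construct a GitHub source tarball URL for a given commit."""
    m = re.match(r"https://github\.com/([^/]+)/([^/]+?)(?:\.git)?/?$",
                 repo_url)
    if m is None:
        raise ValueError(f"not a GitHub URL: {repo_url}")
    owner, repo = m.groups()
    return f"https://github.com/{owner}/{repo}/archive/{commit}.tar.gz"

github_tarball_url("https://github.com/chloekek/raku-nix", "abc123")
# "https://github.com/chloekek/raku-nix/archive/abc123.tar.gz"
```

Supporting Bitbucket and GitLab would then just be two more URL patterns in the same shape.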

zopsicle commented 5 years ago

Now that I can retrieve tarball URLs, I can start working on using nix-prefetch-url to download the tarballs and compute their hashes. Then we can create a file that maps tarball URLs to their hashes, and be certain that later downloads will result in the exact same file. :)
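That step could be wrapped like this (a sketch assuming nix-prefetch-url is on PATH; it prints the hash on the first stdout line, followed by the store path when --print-path is given, and the parsing helper here is my own):

```python
import subprocess

def parse_prefetch_output(stdout):
    """Extract the hash from nix-prefetch-url's stdout: the hash is
    the first line (a store path may follow with --print-path)."""
    return stdout.strip().splitlines()[0]

def prefetch_hash(url):
    """Download `url` with nix-prefetch-url and return its hash."""
    out = subprocess.run(
        ["nix-prefetch-url", url],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_prefetch_output(out)
```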

zopsicle commented 5 years ago

I now have two files, one with a list of archive URLs, and one with a hash for each archive URL.

zopsicle commented 5 years ago

I now have a file that contains META data from most of the distributions.