This repository hosts the code for the PureScript Registry and its affiliated projects. If you are new to the registry and are interested in its development, then the following should be helpful:
If you are interested in the products of the registry, then you should see:
Finally, as always, package documentation is hosted on Pursuit.
Below, you can see the original RFC for a PureScript registry as it was written once the Bower registry shut down.
PureScript needs a way to distribute packages. We used to rely on the Bower registry for that, but this is not possible anymore.
Here's a non-comprehensive list of desirable properties/goals that we'd generally like to achieve when talking about a PureScript registry, and that we think this design covers:
postinstall and other similar hooks.
This section has been informed by happenings, discussions, and various other readings on the internet, such as this one and this one.
Things we do not aim to achieve with this design:
Two main ideas here:
This repo will contain:
All of the above is about metadata, while the real data (i.e. the package tarballs) will live on various storage backends.
Manifest
A Manifest stores all the metadata (e.g. package name, version, dependencies, etc) for a specific version of a specific package.
Packages are expected to version in their sources a purs.json file that conforms to the Manifest schema to ensure forwards-compatibility with future schemas.
This means that new clients will be able to read old schemas, but not vice-versa.
Forward (rather than backward) compatibility is needed because package manifests are baked into the (immutable) package tarballs forever, which means that any client (especially old ones) should always be able to read that manifest.
This means that the only changes allowed to the schema are:
For more info about the different kinds of schema compatibility, see here
All the pre-registry packages will be grandfathered in, see here for details.
You can find some examples of the Manifest that has been generated for them in the examples folder.
Note: the Location schema includes support for packages that are not published from the root of the repository, by supplying the (optional) subdir field.
This means that a repository could potentially host several packages (commonly called a "monorepo").
The PureScript registry allows packages to specify a version for themselves and version ranges for their dependencies.
We use a restricted version of the SemVer spec which only allows versions with major, minor, and patch places (no build metadata or prerelease identifiers) and version ranges with the >= and < operators.
This decision keeps versions and version ranges easy to read, understand, and maintain over time.
Package versions always take the form X.Y.Z, representing major, minor, and patch places. All three places must be natural numbers. For example, in a manifest file:
{
  "name": "my-package",
  "version": "1.0.1"
}
If a package uses all three places (i.e. it begins with a non-zero number, such as 1.0.0), then:
MAJOR means values have been changed or removed, and represents a breaking change to the package.
MINOR means values have been added, but existing values are unchanged.
PATCH means the API is unchanged and there is no risk of breaking code.
If a package only uses two places (i.e. it begins with a zero, such as 0.1.0), then:
MAJOR is unused because it is zero.
MINOR means values have been changed or removed, and represents a breaking change to the package.
PATCH means values may have been added, but existing values are unchanged.
If a package uses only one place (i.e. it begins with two zeros, such as 0.0.1), then all changes are potentially breaking changes.
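The three cases above can be sketched as a small Python function. This is only an illustration of the rules as stated in this document, not code from the Registry; the names parse_version and is_breaking are made up for the example.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse an X.Y.Z version where all three places are natural numbers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a valid X.Y.Z version: {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)


def is_breaking(old: str, new: str) -> bool:
    """Return True if going from `old` to `new` is potentially breaking."""
    o, n = parse_version(old), parse_version(new)
    if n <= o:
        raise ValueError("the new version must be greater than the old one")
    if o[0] != n[0]:
        return True   # the major place changed: always breaking
    if o[0] > 0:
        return False  # X.Y.Z with X > 0: minor and patch bumps are safe
    if o[1] != n[1]:
        return True   # 0.Y.Z: the minor place signals breaking changes
    if o[1] > 0:
        return False  # 0.Y.Z with Y > 0: patch bumps are safe
    return True       # 0.0.Z: every change is potentially breaking
```

For instance, is_breaking("0.1.0", "0.2.0") reports a breaking change, while is_breaking("1.0.0", "1.1.0") does not.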
Version ranges are always of the form >=X.Y.Z <X.Y.Z, where both versions must be valid and the first version must be less than the second version.
When comparing versions, the major place takes precedence, then the minor place, and then the patch place. For example:
1.0.0 is greater than 0.12.0
0.1.0 is greater than 0.0.12
0.0.1 is greater than 0.0.0
All dependencies must take this form. For example, in a manifest file:
{
  "name": "my-package",
  "license": "MIT",
  "version": "1.0.1",
  "dependencies": {
    "aff": ">=1.0.0 <2.0.0",
    "prelude": ">=2.1.5 <2.1.6",
    "zmq": ">=0.1.0 <12.19.124"
  }
}
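To make the version and range rules concrete, here is a minimal Python sketch that parses versions and ranges and checks whether a version falls inside a range. It follows only what is described in this document (three numeric places, a >= lower bound, a < upper bound, lower bound below upper bound); the function names are illustrative and not part of any Registry API.

```python
import re
from typing import Tuple

Version = Tuple[int, int, int]

_VERSION = r"([0-9]+)\.([0-9]+)\.([0-9]+)"
_VERSION_RE = re.compile(_VERSION)
_RANGE_RE = re.compile(rf">={_VERSION} <{_VERSION}")


def parse_version(text: str) -> Version:
    """Parse X.Y.Z into a tuple; tuples compare major, then minor, then patch."""
    m = _VERSION_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid version: {text!r}")
    return (int(m.group(1)), int(m.group(2)), int(m.group(3)))


def parse_range(text: str) -> Tuple[Version, Version]:
    """Parse '>=X.Y.Z <X.Y.Z' and check that the lower bound is smaller."""
    m = _RANGE_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid range: {text!r}")
    nums = [int(g) for g in m.groups()]
    lower = (nums[0], nums[1], nums[2])
    upper = (nums[3], nums[4], nums[5])
    if not lower < upper:
        raise ValueError(f"lower bound must be less than upper bound: {text!r}")
    return lower, upper


def satisfies(version: str, range_text: str) -> bool:
    """Return True if `version` is inside the half-open range."""
    lower, upper = parse_range(range_text)
    return lower <= parse_version(version) < upper
```

Representing a version as a tuple of integers gives the ordering described above for free, since Python compares tuples place by place.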
The Registry should support various automated (i.e. no/little human intervention required) operations:
As package authors the only thing that we need to do in order to have the Registry upload our package is to tell it where to get it.
We can do that by opening an issue containing JSON that conforms to the schema of an Addition.
Note: this operation should be entirely automated by the package manager, and transparent to the user. I.e. package authors shouldn't need to be aware of the inner workings of the Registry in order to publish a package, and they should be able to tell the package manager "publish this" and be given back either a confirmation of success or failure, or a place to follow updates about the fate of the publishing process.
Implementation detail: how do we "automatically open a GitHub issue" while at the same time not requiring a GitHub authentication token from the users? The idea is that if a package manager wants to avoid doing that, then it can generate a URL that the user can navigate to, so that they can preview the issue content before opening it. This is an example of such a link.
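As a rough sketch of how a package manager could build such a link, the Python snippet below relies on GitHub's support for prefilling a new issue through the title and body query parameters. The payload fields and the issue title are placeholders invented for this example; the real shape of the JSON is whatever the Addition schema requires.

```python
import json
from urllib.parse import urlencode


def new_issue_url(owner: str, repo: str, title: str, payload: dict) -> str:
    """Build a URL that opens GitHub's "new issue" form with a prefilled body."""
    body = json.dumps(payload, indent=2)
    query = urlencode({"title": title, "body": body})
    return f"https://github.com/{owner}/{repo}/issues/new?{query}"


# Hypothetical payload: these field names are assumptions, not the actual schema.
example_payload = {
    "packageName": "my-package",
    "newRef": "v1.0.1",
}

url = new_issue_url("purescript", "registry", "Publish my-package", example_payload)
```

The user still has to press the submit button themselves, which is what lets them review the issue content without handing a token to the package manager.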
Once the issue is open, the CI in this repo will:
Addition, and continue running if so
Repo refers to, checking out the ref specified in the Addition, and considering the package directory to be subdir if specified, or the root of the repo if not
Metadata file:
The CI will post updates in the issue as it runs the various operations, and close the issue once all the above operations have completed correctly.
Once the issue has been closed the package can be considered published.
It is largely the same process as above, with the main difference being that the body of the created issue will conform to the schema of an Update.
Unpublishing a version for a package can be done by creating an issue containing JSON conforming to the schema of an Unpublish.
CI will verify that all the following conditions hold:
If these conditions hold, then CI will:
move the package version from published to unpublished in the package Metadata
Unpublishing is allowed for security reasons (e.g. if some package was taken over maliciously), but it's allowed only for a set period of time because of the leftpad problem (i.e. breaking everyone's builds).
Exceptions to this rule are legal concerns (e.g. DMCA takedown requests) for which Trustees might have to remove packages at any time.
Every package will have its own file in the packages folder of this repo.
You can see the schema of this file here, and the main reasons for this file to exist are to track:
As noted in the beginning, Package Sets are a first class citizen of this design.
This repo will be the single source of truth for the package-sets - you can find an example here - from which we'll generate various metadata files to be used by the package manager. Further details are yet to be defined.
While the upstream package sets will only contain packages from the Registry, it is common to need a custom package set that contains packages that are not present in the Registry.
In this case the format of the extra-Registry packages will depend on what the client accepts.
One such client will be Spago, where we'll define an extra-Registry package as:
let Registry = https://raw.githubusercontent.com/purescript/registry/master/v1/Registry.dhall
let SpagoPkg =
      < Repo : { repo : Registry.Location, ref : Text }
      | Local : Registry.Prelude.Location.Type
      >
..that is, an extra-Registry package in Spago could either point to a local path, or a remote repository.
Here's an example of a package set that is exactly like the upstream, except for the effect package, which instead points to some repo from GitHub:
-- We parametrize the upstream package set and the Address type by the package type that our client accepts:
let upstream = https://raw.githubusercontent.com/purescript/registry/master/v1/sets/20200418.dhall SpagoPkg
let Address = Registry.Address SpagoPkg
let overrides =
    { effect = Address.External (SpagoPkg.Repo
        { ref = "v0.0.1"
        , repo = Registry.Repo.GitHub
            { subdir = None Text
            , githubOwner = "someauthor"
            , githubRepo = "somerepo"
            }
        })
    }
in { compiler = upstream.compiler, packages = upstream.packages // overrides }
The "Registry Trustees" mentioned all across this document are a group of trusted janitors that have write access to this repo.
Their main task will be to publish, under very specific conditions, new versions/revisions of packages that need adjustments.
The reason why this is necessary (vs. only letting the authors publish new versions) is that for version-solving to work in package managers the Registry will need maintenance. This maintenance will ideally be done by package authors, but for a set of reasons authors sometimes become unresponsive.
Such maintenance needs to happen because otherwise older versions of packages with bad bounds will still break things, even if newer versions have good bounds. Registries which don't support revisions will instead support another kind of "mutation" called "yanking", which allows a maintainer to tell a solver not to consider a particular version any more when constructing build plans. You can find a great comparison between the two here, illustrating why we support revisions in this registry.
Trustees will have to work under a set of constraints so that their activity will not cause disruption, unreproducible builds, etc. They will also strive to involve maintainers in this process as much as possible while being friendly, helpful, and respectful. In general, trustees aim to empower and educate maintainers about the tools at their disposal to better manage their own packages. Being a part of this curation process is entirely optional and can be opted-out from.
Trustees will try to contact the maintainer of a package for 4 weeks before publishing a new revision, except if the author has opted out from this process, in which case they won't do anything.
Trustees will not change the source of a package, but only its metadata in the Manifest file.
Trustees are allowed to publish new revisions (i.e. versions that bump the pre-release segment from SemVer), to:
Note: there is no API defined yet for this operation.
If you'd like to reuse a package name that has already been taken, you can open an issue in this repo, tagging the current owner (whose username you can find in the package's metadata file).
If no agreement with the current owner has been found after 4 weeks, then Registry Trustees will address it.
For more details see the policy that NPM has for this, which we will follow when not otherwise specified.
I.e. the answer to the question:
How do I know which dependencies package X at version Y has?
Without an index of all the package manifests you'd have to fetch the right tarball and look at its purs.json.
That might be a lot of work to do at scale, and there are use cases - e.g. for package-sets - where we need to look up lots of manifests to build the dependency graph. So we'll store all the package manifests in a separate location yet to be defined (it's really an implementation detail and will most likely be just another repository, inspired by the same infrastructure for Rust).
As noted above, this repository will hold all the metadata for packages, but the actual data - i.e. package tarballs - will be stored somewhere else, and we call each of these locations a "storage backend".
Clients will need to be pointed at a place they can get package tarballs from, so here we'll store a mapping from "name of the storage backend" to a function that, given (1) a package name and (2) a package version, returns the URL where the tarball for that package version can be fetched.
We maintain the list of all the Storage Backends and the aforementioned mappings here.
We also provide a small utility to demonstrate how to use the mappings.
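To illustrate the idea (this is not the utility mentioned above), a mapping from backend names to URL-building functions could look like the following Python sketch. The backend names and URL layouts are invented for the example and do not describe any real storage backend.

```python
from typing import Callable, Dict

# A URL builder takes (package name, package version) and returns a tarball URL.
UrlBuilder = Callable[[str, str], str]

# Hypothetical backends: both names and URL layouts are assumptions.
STORAGE_BACKENDS: Dict[str, UrlBuilder] = {
    "example-primary": lambda name, version: (
        f"https://packages.example.org/{name}/{version}.tar.gz"
    ),
    "example-mirror": lambda name, version: (
        f"https://mirror.example.net/tarballs/{name}-{version}.tar.gz"
    ),
}


def tarball_urls(name: str, version: str) -> Dict[str, str]:
    """Return the tarball URL for a package version on every known backend."""
    return {
        backend: build_url(name, version)
        for backend, build_url in STORAGE_BACKENDS.items()
    }
```

Adding a backend in this model is just a matter of adding one more entry to the mapping, which matches the intent that backends stay easy to add.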
There can be more than one storage backend at any given time, and it's always possible to add more - in fact this can easily be done by:
A package manager should download a specific version of a package in the following way:
Note: we are ensuring that the package we download is the same file for all backends because we are storing the SHA256 for every tarball in a separate location from the storage backends (this repo).
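The integrity check described in this note can be sketched in Python as below. The sketch assumes the expected hash is available as a hexadecimal SHA256 digest and that the caller supplies a download function; both are assumptions made for the example, and the function names are hypothetical.

```python
import hashlib
from typing import Callable, Iterable


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(
    urls: Iterable[str],
    expected_sha256: str,
    download: Callable[[str], bytes],
) -> bytes:
    """Try each backend URL in turn and return the first tarball whose
    hash matches the one recorded outside of the storage backends."""
    for url in urls:
        try:
            data = download(url)
        except OSError:
            continue  # backend unreachable: try the next one
        if sha256_hex(data) == expected_sha256:
            return data
        # Hash mismatch: do not trust this copy, move on to the next backend.
    raise RuntimeError("no storage backend returned a tarball with the expected hash")
```

Because the expected hash comes from this repo rather than from the backend serving the tarball, a compromised or stale backend cannot silently substitute a different file. In real use, download could wrap something like urllib.request.urlopen, whose network errors are subclasses of OSError.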
It is paramount that we provide the smoothest migration path that we can achieve with the resources we have. This is because we feel the ecosystem is already close to maturity (at this point breaking changes happen very rarely in practice), and we don't want to unnecessarily disrupt everyone's workflow, especially if it's possible to avoid that with some planning.
So a big chunk of our work is going towards ensuring that Bower packages are gracefully grandfathered into the new system. This basically means that for each of them we will:
What has happened already:
What is happening right now:
Manifest, which is the big blocker for proceeding further, since it will be baked into all the tarballs uploaded to the storage.
What will happen after this:
All the Registry CI is implemented in PureScript and runs on GitHub Actions.
Source can be found in the ci folder, while the workflows folder contains the various CI flows.
Yet to be defined: see this issue
As noted above, "The Registry" is really just:
Mirroring all of this to an alternative location would consist of:
Additionally we could keep some kind of "RSS feed" in this repo with all the notifications from package uploads, so other tools will be able to listen to these events and act on that information.
We have of course investigated other registries before rolling our own.
Our main requirement is to have "dependency flattening": there should be only one version of every package installed for every build.
All the general-purpose registries (i.e. not very tied to a specific language) that we looked at do not seem to support this.
E.g. it would be possible for us to upload packages to NPM, but installing the packages from there would not work, because NPM might pull multiple versions of every package from there according to the needs of every package.
These are the main reasons why we prefer to handle this with git+CI, rather than deploying a separate service:
Install dhall, then:
$ cat "your-file.json" | json-to-dhall --records-loose --unions-strict "./YourDhallType.dhall"
This design is authored by @f-f, with suggestions and ideas from:
Create a .env file based off .env.example and fill in the values for the environment variables:
cp .env.example .env
If you are running scripts in the repository, such as the legacy registry import script, then you may wish to use the provided Nix shell to make sure you have all necessary dependencies available.
$ nix develop