Deploy registry server

thomashoneyman commented 1 year ago

The registry needs a server for a few reasons:

Some registry operations require more computational power / minutes than we have available in GitHub Actions (e.g. the build matrix)
Some registry operations should hold a lock so that multiple instances of the same operation don't run at the same time (e.g. package set updates)
It is not possible to publish releases of your package in CI if the only way to publish packages is via creating GitHub issues
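As a rough illustration of the locking point above, here is a minimal in-process sketch in Python (the registry itself is not written in Python). The names `exclusive` and `OperationBusy` are hypothetical, and a real server might queue the second request or use a lock that survives restarts rather than failing fast.

```python
import threading
from contextlib import contextmanager

# Hypothetical helper: one lock per operation name, so two package set
# updates cannot run at once while unrelated operations still can.
_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()


class OperationBusy(Exception):
    """Raised when the same operation is already running."""


@contextmanager
def exclusive(operation: str):
    # Look up (or create) the lock for this operation under a guard lock.
    with _locks_guard:
        lock = _locks.setdefault(operation, threading.Lock())
    # Fail fast instead of queueing; this is an assumption of the sketch.
    if not lock.acquire(blocking=False):
        raise OperationBusy(operation)
    try:
        yield
    finally:
        lock.release()
```

A handler would wrap its work in `with exclusive("package-set-update"): ...` and report a "busy" response when `OperationBusy` is raised.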

We've agreed to create one, and #576 is a first step towards that. This issue tracks actually deploying a NixOS server to Digital Ocean which:

Supplies an HTTP API for operations that do not require GitHub authentication (/publish, /unpublish, /transfer)
Supplies a minimal frontend for registry.purescript.org with at least a home page with some documentation and a DMCA takedown request page for legal compliance
(Optional) Supplies an endpoint for polling the status of an operation (i.e. when you publish you get an id, and you can poll /status/<id> to see "pending" or the error / success result)
(Optional) Supplies a way to view the logs for an operation (e.g. registry.purescript.org/logs/<id>).
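To make the status-polling idea concrete, here is a minimal Python sketch of the bookkeeping behind a /status/<id> endpoint. `JobStore`, its method names, and the response fields are assumptions for illustration only; the actual server may store jobs differently (for example in a database).

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    status: str = "pending"  # "pending", "success", or "error"
    detail: str = ""         # error message or success summary


class JobStore:
    """Hypothetical in-memory store backing /status/<id>."""

    def __init__(self) -> None:
        self._jobs: dict[str, Job] = {}

    def submit(self) -> str:
        # Called when an operation such as /publish is accepted.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = Job()
        return job_id

    def finish(self, job_id: str, ok: bool, detail: str = "") -> None:
        # Called by the worker once the operation completes.
        job = self._jobs[job_id]
        job.status = "success" if ok else "error"
        job.detail = detail

    def status(self, job_id: str) -> Optional[dict]:
        # Returns None for unknown ids so the HTTP layer can answer 404.
        job = self._jobs.get(job_id)
        if job is None:
            return None
        return {"id": job_id, "status": job.status, "detail": job.detail}
```

A client would call the publish endpoint, keep the returned id, and poll until `status` is no longer "pending".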

We will also want to use some sort of health-check service so that the packaging team can be notified if something goes wrong with the server.

JordanMartinez commented 1 year ago

I'm curious. Any interest in using an HTTP2 server here? (i.e. https://github.com/purescript-node/purescript-node-http/pull/45)

thomashoneyman commented 1 year ago

> I'm curious. Any interest in using an HTTP2 server here? (i.e. purescript-node/purescript-node-http#45)

I think the http2 bindings are too low-level for our needs; I'd expect to be able to use it via a higher-level library like payload, httpure, or something along those lines. For now I suggest we use httpure as seen in #576, as it's simple, maintained, and used in production by at least myself and CitizenNet, but I'm open to other options so long as we can stay reasonably high-level.

f-f commented 1 year ago

I'd vouch for httpurple, it has a few more features than httpure, e.g. it supports routing-duplex and Node middlewares.

f-f commented 1 year ago

For healthchecks we could use https://healthchecks.io, it's free and dead easy to use.
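For reference, a check on healthchecks.io is pinged with a plain HTTP request to `https://hc-ping.com/<uuid>`, and appending `/fail` reports a failure. A minimal Python sketch follows; the function names are hypothetical, and the `urlopen` parameter exists only so the call can be stubbed out.

```python
import urllib.request

PING_BASE = "https://hc-ping.com"


def ping_url(check_uuid: str, ok: bool = True) -> str:
    # Success pings go to /<uuid>, failure pings to /<uuid>/fail.
    return f"{PING_BASE}/{check_uuid}" + ("" if ok else "/fail")


def ping(check_uuid: str, ok: bool = True, urlopen=urllib.request.urlopen) -> bool:
    # A short timeout keeps a slow monitoring service from blocking the server.
    with urlopen(ping_url(check_uuid, ok), timeout=10) as response:
        return response.status == 200
```

The server would call `ping(...)` on a timer, and the service alerts the team when pings stop arriving.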

f-f commented 1 year ago

Re the optional points 3 and 4 above, I'd say they are not really optional? Having a server makes sense if we can feature-match the current pipeline, so having visibility on the logs is fairly necessary before we can start using this.

For Spago usage it would be of great help if the logs/<id> endpoint had a since=<timestamp> parameter, so we could poll it to get the logs in real-time-ish. (this would mean of course returning a timestamp together with every log line)
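A minimal sketch of what the since parameter could mean on the server side, in Python. `LogLine` and `logs_since` are hypothetical names, and the strict "greater than" comparison is an assumption: it would drop lines that share a timestamp with the last one received, so a real implementation might use a sequence number instead.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class LogLine:
    timestamp: datetime  # returned with every line so clients can resume
    message: str


def logs_since(lines: Iterable[LogLine], since: Optional[datetime] = None) -> list[LogLine]:
    # Without `since`, return everything; otherwise only strictly newer lines.
    if since is None:
        return list(lines)
    return [line for line in lines if line.timestamp > since]
```

A polling client would remember the timestamp of the last line it received and pass it as since on the next request.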

thomashoneyman commented 1 year ago

> For Spago usage it would be of great help if the logs/<id> endpoint had a since=<timestamp> parameter, so we could poll it to get the logs in real-time-ish. (this would mean of course returning a timestamp together with every log line)

I was thinking that we'd use something like webhooks to push notify events and error logs to Spago. Would you prefer to poll for logs instead? If so, do you want a verbosity=<log-verbosity> parameter as well? The 'debug' logs can be quite large.

In addition to or instead of a public 'logs' endpoint I was thinking we'd have a /logs/ page where the logs can be viewed. For example, someone having an issue with their upload could tell the chat the id of their job and someone else can go look at it to help diagnose the issue. We'd hide these from crawlers. But of course the endpoint and page can both exist.

thomashoneyman commented 1 year ago

> Supplies an HTTP API for operations that do not require GitHub authentication (/publish, /unpublish, /transfer)

I spent some more time thinking about this. I think it would be useful for the HTTP API to be usable for authenticated actions such as adjusting the package sets, or acting as a registry trustee to transfer or unpublish a package or bump bounds by adding a new trustee revision.

We wanted to use GitHub for authentication so we don't have to roll anything ourselves, and we wanted to have trustees take actions via GitHub issues so that the information is public — everyone can see what they have done. But if we only allow the computation to stay on GitHub then a) we can't use the much beefier registry server and b) the logs are on GitHub only and are deleted after ~30 days.

Instead, I'd like to propose that we continue using GitHub issues as the interface to the server for Trustee actions, but that the server processes the actual events.

To authenticate, we can rely on the fact that the server and GitHub CI both need access to the pacchettibotti token in order to commit as pacchettibotti. Our CI can therefore send an API request with an authorization header using this token, and the server only processes the request if the token matches. CI will only send the API request if the issue opener is a member of the packaging team, the same as is done today.
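The token-matching check described here could look roughly like the Python sketch below. The `Bearer` header scheme and the function name are assumptions, not something decided in this thread; the one deliberate choice is `hmac.compare_digest`, which compares in constant time so response timing does not leak how much of the token matched.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    # Assumes a header of the form "Bearer <token>"; anything else is rejected.
    scheme, _, presented = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # Constant-time comparison; never log either value.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```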

Alternately, we may wish to have two separate tokens once CI no longer needs to commit as pacchettibotti (it will only need read scopes). In that case, we can still send this token to the server, and the server can use the GitHub API to look up what user account owns the token, and then verify that user account is pacchettibotti (or is a member of the packaging team). This lets us expire / cycle the token on GitHub or the server without having to keep them in sync.
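For the owner-lookup variant, GitHub's REST API returns the authenticated user for a token at `GET https://api.github.com/user`, including a `login` field. A minimal Python sketch, where the function names are hypothetical and the `urlopen` parameter exists only so the call can be stubbed; an invalid or expired token makes `urlopen` raise `urllib.error.HTTPError`, which the caller would treat as "not authorized".

```python
import json
import urllib.request
from typing import Iterable

GITHUB_USER_URL = "https://api.github.com/user"


def fetch_token_owner(token: str, urlopen=urllib.request.urlopen) -> str:
    # Ask GitHub which account the token belongs to.
    request = urllib.request.Request(
        GITHUB_USER_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urlopen(request) as response:
        return json.load(response)["login"]


def token_belongs_to(token: str, allowed_logins: Iterable[str],
                     urlopen=urllib.request.urlopen) -> bool:
    # GitHub logins are case-insensitive, so compare in lowercase.
    allowed = {login.lower() for login in allowed_logins}
    return fetch_token_owner(token, urlopen).lower() in allowed
```

Checking membership of the packaging team would need a further API call, which this sketch leaves out.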

This is a very restricted use of a single GitHub token that we already must have present, so I don't think this is expanding our risk in any significant manner so long as we are careful not to log the token. Arbitrary users sending authenticated actions still must go through the SSH signing process.

f-f commented 1 year ago

> I was thinking that we'd use something like webhooks to push notify events and error logs to Spago. Would you prefer to poll for logs instead? If so, do you want a verbosity=<log-verbosity> parameter as well? The 'debug' logs can be quite large.

You probably meant websockets rather than webhooks (which would require every Spago user to run a web server, which is definitely inconvenient). As we mentioned in the call today, the implementation for "polling" vs "websockets" is the same and they differ only in the transport mode; a since parameter is useful when the connection drops, since it makes it possible to query logs only from a certain point on.

An even better solution than polling and websockets is HTTP/2 Server Push.

f-f commented 1 year ago

> In addition to or instead of a public 'logs' endpoint I was thinking we'd have a /logs/ page where the logs can be viewed.

It would be good to have a nice interface to the Registry, but I think it's fairly orthogonal to just getting an API up. We can expand the scope later once the API is up and stable.

f-f commented 1 year ago

Re moving authed operations to the server: I'm a bit concerned about the double layer of auth as that's a lot of security surface exposed, but I don't have strong opinions otherwise; I just think that it's nice to have things visible here on GitHub.

purescript / registry-dev

Deploy registry server #578