Closed tsibley closed 1 year ago
With this endpoint in place, we can ~easily extend it to also support pr/X "versions", which would look up the latest successful CI run for the given PR and download that.
I've been thinking about adding this functionality for a while, but https://github.com/nextstrain/cli/pull/248 today motivated me to do it.
I'm going to merge and deploy this before review, as it seems low stakes and is primarily an internal endpoint to aid our development, so the audience is small.
Deployed to canary. Tested it with:
curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux \
| DESTINATION=/tmp/cli \
NEXTSTRAIN_DOT_ORG=https://next.nextstrain.org \
bash -s ci-build/3859193828
and got a 500 when downloading the tarball via next.nextstrain.org because the artifact download got a 403 (even though we're providing authorization):
2023-01-09T19:16:12.473547+00:00 app[web.1]: [verbose] [fetch] GET https://api.github.com/repos/nextstrain/cli/actions/runs/3859193828/artifacts (cache: undefined)
2023-01-09T19:16:12.575884+00:00 app[web.1]: [verbose] [fetch] 200 OK https://api.github.com/repos/nextstrain/cli/actions/runs/3859193828/artifacts (cache miss, timestamp 2023-01-09T19:16:12.575Z)
2023-01-09T19:16:12.583163+00:00 app[web.1]: [verbose] [fetch] GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip (cache: undefined)
2023-01-09T19:16:12.639463+00:00 app[web.1]: [verbose] [fetch] 403 Forbidden https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip (cache skip, timestamp null)
2023-01-09T19:16:12.640138+00:00 app[web.1]: [verbose] Sending InternalServerError: upstream said: 403 Forbidden error as JSON
2023-01-09T19:16:12.641472+00:00 heroku[router]: at=info method=GET path="/cli/download/ci-build/3859193828/standalone-x86_64-unknown-linux-gnu.tar.gz" host=next.nextstrain.org request_id=… fwd="…" dyno=web.1 connect=0ms service=173ms status=500 bytes=297 protocol=https
This could be a scope issue with the token we're using for next.nextstrain.org?
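For reference, the two upstream requests in the log above amount to listing a run's artifacts and then following the matching artifact's archive_download_url. Below is a minimal Python sketch of the selection step, assuming the listing response has the shape {"total_count": ..., "artifacts": [...]}. The function name find_artifact_download_url is hypothetical, the example listing is trimmed to a single artifact for illustration, and this is not the nextstrain.org implementation.

```python
from typing import Any


def find_artifact_download_url(listing: dict[str, Any], name: str) -> str:
    """Return the archive_download_url of the unexpired artifact called `name`."""
    for artifact in listing.get("artifacts", []):
        if artifact.get("name") == name and not artifact.get("expired", False):
            return artifact["archive_download_url"]
    raise LookupError(f"no unexpired artifact named {name!r}")


# Illustrative listing, trimmed to one artifact and to the fields used here.
listing = {
    "total_count": 1,
    "artifacts": [
        {
            "id": 501513693,
            "name": "standalone-x86_64-unknown-linux-gnu",
            "archive_download_url": "https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip",
            "expired": False,
        }
    ],
}

url = find_artifact_download_url(listing, "standalone-x86_64-unknown-linux-gnu")
print(url)
```

It is the second request, the GET of that URL, that returns the 403 here.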
Ok, I think I've come to an understanding here.
During development, I tested locally with my standard "various and sundry" personal access token (classic) that's granted limited scope: just public_repo. Downloading artifacts from the public nextstrain/cli repo worked fine.
The token we use for nextstrain.org has no scopes (because even public_repo includes write access). This means it has a read-only view of only public resources. I thought this would be sufficient to download artifacts from a public repo, but it turns out not to be. This isn't documented anywhere as far as I can tell.
Both tokens are "classic" personal access tokens.
I tested using a new "fine-grained" token without any permissions granted, which I believe is supposed to be roughly equivalent to a "classic" token without any scopes granted. But there are clearly some differences, because this fine-grained token works for artifact downloading when the classic token doesn't.
So I think we want to replace the classic token with a fine-grained token (which is what GitHub generally recommends now anyhow). This would let us still use a single GITHUB_TOKEN for nextstrain.org, while not granting it permissions/scopes we don't want for security reasons.
I thought this would be sufficient to download artifacts from a public repo, but it turns out not to be. This isn't documented anywhere as far as I can tell.
Note that the classic token we use can view information about an artifact:
GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693 HTTP/1.1
HTTP/1.1 200
content-type: application/json; charset=utf-8
content-length: 695
x-oauth-scopes:
x-accepted-oauth-scopes:
{
"id": 501513693,
"node_id": "MDg6QXJ0aWZhY3Q1MDE1MTM2OTM=",
"name": "standalone-x86_64-unknown-linux-gnu",
"size_in_bytes": 51091874,
"url": "https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693",
"archive_download_url": "https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip",
"expired": false,
"created_at": "2023-01-06T23:45:43Z",
"updated_at": "2023-01-06T23:45:45Z",
"expires_at": "2023-04-06T23:17:36Z",
"workflow_run": {
"id": 3859193828,
"repository_id": 139047738,
"head_repository_id": 139047738,
"head_branch": "trs/singularity-runtime",
"head_sha": "d435db68160b6a45277b1ee72006a5e16090259c"
}
}
just not download it:
GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip HTTP/1.1
HTTP/1.1 403
content-type: application/json; charset=utf-8
content-length: 168
x-oauth-scopes:
x-accepted-oauth-scopes:
{
"message": "You must have the actions scope to download artifacts.",
"documentation_url": "https://docs.github.com/rest/reference/actions#download-an-artifact"
}
Despite the error response saying the actions scope is required, that is not a documented scope for personal access tokens (which are OAuth tokens).
The download endpoint documentation says:
Anyone with read access to the repository can use this endpoint. If the repository is private you must use an access token with the repo scope. GitHub Apps must have the actions:read permission to use this endpoint.
Our classic token has read access to the repository, so should have access per this doc. The repo is not private. The classic token is a personal access token, not a GitHub Apps token, so should not require the actions:read permission.
I replaced the GITHUB_TOKEN used by next.nextstrain.org with a new fine-grained token as described above, and all seems to be working there. I'll make the same change to nextstrain.org soon, and eventually revoke the classic token.
One thing to note is that fine-grained tokens must have expiration dates ≤1y in the future, so this token expires 9 Jan 2024, and we'll have to manually rotate it before then. Not sure the best way to track this task…
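One low-tech option would be a scheduled check that warns when the expiry date is near. Below is a minimal Python sketch, assuming the expiry date is recorded by hand alongside the token. The function names and the 30-day warning threshold are hypothetical.

```python
from datetime import date

# Expiry of the fine-grained token, recorded by hand (assumption: not queried from GitHub).
TOKEN_EXPIRES = date(2024, 1, 9)


def days_until_expiry(today: date, expires: date = TOKEN_EXPIRES) -> int:
    """Whole days from `today` until the token expires (negative once expired)."""
    return (expires - today).days


def needs_rotation(today: date, warn_days: int = 30) -> bool:
    """True once the token is within `warn_days` of expiring, or already expired."""
    return days_until_expiry(today) <= warn_days


if needs_rotation(date.today()):
    print(f"GITHUB_TOKEN expires on {TOKEN_EXPIRES}; rotate it")
```

Run from any scheduled job, this would start printing the reminder 30 days before expiry.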
With this endpoint in place, we can ~easily extend it to also support pr/X "versions", which would look up the latest successful CI run for the given PR and download that.
Implemented as https://github.com/nextstrain/nextstrain.org/pull/645.
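The selection logic behind pr/X could be sketched as below. This Python illustration assumes the server already has the PR's workflow runs as a list of run objects with the status, conclusion, and created_at fields the GitHub API returns. The name latest_successful_run and the sample data are hypothetical, and the implementation in the PR linked above may differ.

```python
from typing import Any, Optional


def latest_successful_run(runs: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Pick the most recently created run that completed successfully, if any."""
    successful = [
        run for run in runs
        if run.get("status") == "completed" and run.get("conclusion") == "success"
    ]
    # created_at is an ISO 8601 UTC timestamp (e.g. "2023-01-06T23:45:43Z"),
    # so lexicographic order matches chronological order.
    return max(successful, key=lambda run: run["created_at"], default=None)


# Made-up runs for illustration only; ids and timestamps are not real.
runs = [
    {"id": 101, "status": "completed", "conclusion": "failure", "created_at": "2023-01-07T10:00:00Z"},
    {"id": 102, "status": "completed", "conclusion": "success", "created_at": "2023-01-05T10:00:00Z"},
    {"id": 103, "status": "completed", "conclusion": "success", "created_at": "2023-01-06T10:00:00Z"},
    {"id": 104, "status": "in_progress", "conclusion": None, "created_at": "2023-01-08T10:00:00Z"},
]

run = latest_successful_run(runs)
print(run["id"] if run else "no successful run")
```

The resulting run id could then be handed to the existing ci-build/<run id> download path.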
This new endpoint allows the standalone installer to install not just released versions but also the builds produced by arbitrary CI runs. That's very helpful for development and testing of PRs. With this new endpoint, for example, we can run:
to install /tmp/cli/nextstrain from:
Artifacts from GitHub Actions workflow runs require a bit more ceremony than release assets, as all artifacts come wrapped in a ZIP file, which we need to unwrap server-side for our installer. Doing this server-side also resolves the issue of artifacts requiring authentication to download (even though our artifacts are publicly visible). Keeping the additional complexity of API requests, authentication, and additional compression out of the installer itself keeps the installer simpler and thus more robust for end users.
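The unwrapping step could be sketched as below. This is a Python illustration under the assumption that each artifact ZIP wraps exactly one file (the tarball). The name unwrap_single_file_zip is hypothetical, and this is not the code in this PR.

```python
import io
import zipfile


def unwrap_single_file_zip(zip_bytes: bytes) -> tuple[str, bytes]:
    """Return (name, contents) of the only file inside a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        members = [info for info in archive.infolist() if not info.is_dir()]
        if len(members) != 1:
            raise ValueError(f"expected exactly one file in ZIP, found {len(members)}")
        return members[0].filename, archive.read(members[0])


# Build a small ZIP in memory to stand in for a downloaded artifact.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("standalone-x86_64-unknown-linux-gnu.tar.gz", b"placeholder tarball bytes")

name, contents = unwrap_single_file_zip(buffer.getvalue())
print(name, len(contents))
```

This sketch buffers the whole archive in memory for simplicity; a server handling ~50 MB artifacts would more likely stream the ZIP entry through to the response.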
Testing
curl and the standalone installer pointed at my local server