Switch to GitHub Actions

mig5 commented 5 months ago

Follow-up from https://github.com/freedomofpress/securedrop-workstation-ci/pull/40

We currently use the GitHub Webhook Python library to consume the webhook event and payload from GitHub.

We also use requests, with a PAT, to post commit statuses back to GitHub (the final status includes a link to the log file) and to post Slack notifications.
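For reference, a minimal sketch of what that status call amounts to, using GitHub's "create a commit status" REST endpoint. This is not the code in this repo: it uses urllib instead of requests so it stays standard-library only, and the function names, the repo slug in the usage note and the `ws-ci/runner` context label are all assumptions for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"
VALID_STATES = {"error", "failure", "pending", "success"}


def build_status_request(repo, sha, state, token, log_url=None,
                         context="ws-ci/runner"):
    """Build (but do not send) a 'create commit status' request.

    `repo` is "owner/name"; `context` is a hypothetical label for this CI.
    """
    if state not in VALID_STATES:
        raise ValueError(f"invalid state: {state!r}")
    payload = {"state": state, "context": context}
    if log_url:
        # The final status links to the log file.
        payload["target_url"] = log_url
    return urllib.request.Request(
        f"{API_ROOT}/repos/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # the PAT
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


def post_status(repo, sha, state, token, log_url=None):
    """Send the request and return the HTTP status code."""
    req = build_status_request(repo, sha, state, token, log_url)
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Something like `post_status("owner/name", sha, "success", token, log_url)` would be the last call of a run. With Actions, this whole piece would likely become unnecessary.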

We might want to switch to GitHub Actions to invoke the runners, both to replace the 'webhook' event that fires right now and to run the nightly scheduled jobs.

Doing so would provide these benefits:

Convenient 'retry' option
Logs in GitHub rather than (just?) at https://ws-ci-runner.securedrop.org/
Integrated notifications to Slack
Fewer moving parts (we could remove the GitHub webhook Python Flask component and probably the separate 'nightlies.py' script)
Commit statuses would (probably?) be built-in as part of the action

It would require either:

Letting GitHub SSH into the bastion runner, or
Running a self-hosted GitHub Actions runner
In either case, it would also require 'cat'ing the log file that dom0 generates, because in VMware, even with vmtoolsd, you can't capture the stdout of a running process; we can only monitor its return code or whether it is still running. But we fetch the file from dom0 at the end anyway, so we can easily just cat it to stdout and then the GitHub Actions service would 'see' the log.
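A rough sketch of that last step, assuming the log has already been fetched from dom0 to a local path and the job's return code is known. The function name and arguments are hypothetical and not part of run.py.

```python
import shutil
import sys


def replay_log(log_path, return_code, out=None):
    """Copy the fetched dom0 log to `out` (stdout by default).

    Returns `return_code` unchanged so the caller can exit with it, which
    lets the Actions job show the log and fail when the run failed.
    """
    out = out if out is not None else sys.stdout
    # errors="replace" so a stray non-UTF-8 byte can't break log output.
    with open(log_path, "r", encoding="utf-8", errors="replace") as fh:
        shutil.copyfileobj(fh, out)
    out.flush()
    return return_code
```

The caller would finish with something like `sys.exit(replay_log(path, rc))`. The log only appears once the run has finished, so it would not stream live in the Actions UI.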

There may be other security considerations here, or other things that would need to change as part of this. Note the implications of using a self-hosted runner with a public repository, which is not recommended.

legoktm commented 4 months ago

> There may be other security considerations here, or other things that would need to change as part of this. Note the implications of using a self-hosted runner with a public repository, which is not recommended.

There's a good summary of the security considerations/threat model at https://github.com/actions/runner/issues/494#issuecomment-1101387027. In short, when someone sends a PR, they can also edit .github/workflows and/or the code it invokes to execute arbitrary stuff. (Because jobs on github.com's own runners execute in isolated, one-time-use VMs, this isn't an issue for normal GHA workflows.)

The easiest way to avoid this is to limit who can run CI:

[Screenshot: legoktm_test, 2024-04-19]

If someone outside FPF submits a PR, a maintainer would have to approve it before CI runs. (It's not clear to me what happens if they update their PR; ideally it would re-require approval.)

This has pros and cons compared to the status quo. Currently CI only triggers on pushes inside the repository, so for an outside contribution, a maintainer has to pull down the code and re-push it to a branch in the repository. The approval setting is more convenient, since a maintainer just has to click a button for CI to run. The downside is that "safe" jobs would also be gated on this approval, which they aren't currently.

legoktm commented 4 months ago

> It would require either:
>
> Letting GitHub SSH into the bastion runner, or

To expand on this, we would give GitHub Actions an SSH private key that can log into the bastion, and then it would be able to invoke run.py just as the webhook and nightlies.py currently do.
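As a sketch of what the workflow step could run, assuming the key has been written to a file on the GitHub-hosted runner and the bastion's host key is already in known_hosts. The host name and the arguments passed to run.py are placeholders, not its real interface.

```python
import shlex
import subprocess


def build_ssh_command(host, key_path, remote_args, user="ws-ci-runner"):
    """Return the argv for running `remote_args` on the bastion over SSH."""
    # Quote each remote argument: ssh joins them into a single string that
    # the remote shell parses, so unquoted input could inject commands.
    remote = " ".join(shlex.quote(arg) for arg in remote_args)
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",              # never prompt; fail instead
        "-o", "StrictHostKeyChecking=yes",  # require a known host key
        f"{user}@{host}",
        remote,
    ]


def run_on_bastion(host, key_path, remote_args):
    """Run the command and return its exit code to the workflow step."""
    cmd = build_ssh_command(host, key_path, remote_args)
    return subprocess.run(cmd, check=False).returncode
```

For example, `run_on_bastion("bastion.example", "/tmp/ci_key", ["./run.py", "--commit", sha])`, where `--commit` is a made-up flag. On the bastion side, the key could be restricted further with a forced command in authorized_keys.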

Note: AIUI, we don't have to worry about outside collaborators doing malicious things, because PRs from forks cannot access secrets (https://securitylab.github.com/research/github-actions-preventing-pwn-requests/). So it would be the same as the status quo: only branches inside the repo can trigger this job.
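If we wanted belt and braces on top of that, the job could also check the event payload itself before touching the SSH key. A minimal sketch, assuming the standard `pull_request` webhook payload that Actions exposes at the path in `GITHUB_EVENT_PATH`; the function names are illustrative.

```python
import json
import os


def load_event(path=None):
    """Load the event payload that GitHub Actions writes for the job."""
    path = path or os.environ["GITHUB_EVENT_PATH"]
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def is_fork_pull_request(event):
    """Return True if the event is a PR whose head repo differs from the base."""
    pr = event.get("pull_request")
    if pr is None:
        return False  # push, schedule, etc.: not a pull request at all
    head_repo = (pr.get("head") or {}).get("repo") or {}
    base_repo = (pr.get("base") or {}).get("repo") or {}
    # A deleted head repo shows up as null; treat that as a fork to be safe.
    return head_repo.get("full_name") != base_repo.get("full_name")
```

This is only an extra guard on our side. The actual protection still comes from GitHub not exposing secrets to fork PRs.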

From a security perspective, GitHub.com can already trigger/execute malicious code on the VMware/Qubes part of CI, because it controls the git repo. In this implementation, GitHub.com would additionally have arbitrary code execution as the ws-ci-runner user on the bastion. (There are ways we can harden/lock down what this user can do, but the point is, it would now have local code execution on the bastion.)

This should get us all of the benefits in the ticket description while retaining most of the existing architecture.

I will also suggest an alternative version of this for completeness, which is:

Let GitHub Actions log in to/access VMware directly

This is a further step from the above, in which the code currently executed on the bastion would instead run inside the GitHub Actions job. As noted above, GitHub.com already has code execution on the VMware/Qubes part. The main difference is that VMware would now have to be exposed to GitHub.com directly, instead of through the bastion.

These two options are not mutually exclusive: we could start with SSH into the bastion, then gradually do more and more in the GHA workflow and give it access to VMware to reach the second stage.

> Running a self-hosted GitHub Actions runner

I haven't really tried this out yet. The main advantage is that we wouldn't have to run all the commands over SSH etc.; it would be a bit more native. I think there are a lot more downsides from a maintenance perspective, because we (infra and/or SD) would now have to maintain the runner and its software (it does auto-update, fwiw). And because the secrets situation is different, we would need to enable the setting to manually approve outside collaborators, which affects all jobs.

lsd-cat commented 4 months ago

Referencing https://github.com/freedomofpress/infrastructure/issues/4788#issuecomment-2076632183 here, I am against exposing the full ESX web interface on the internet, because exploits for the web interface have not been rare in the past. In addition, when it was exposed during the setup phase, we got constant brute-force attempts that triggered the account auto-lock, and we needed to reset the brute-force counter via SSH every time before logging in again.

However, if that is the main concern, we can find something simpler than SSH or vouch: either proxying just the API with additional auth, or a WireGuard tunnel.

mig5 commented 4 months ago

Another option is that we could allow-list all the GitHub Actions IP ranges (though there are a helluva lot of them): https://api.github.com/meta
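A quick sketch of how that list could be pulled and shrunk a little, assuming the `actions` key of the meta response (a list of IPv4 and IPv6 CIDR strings). The function names are illustrative, and turning the result into firewall rules is left out.

```python
import ipaddress
import json
import urllib.request

META_URL = "https://api.github.com/meta"


def fetch_meta(url=META_URL):
    """Fetch the meta document (unauthenticated requests are rate-limited)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def actions_networks(meta):
    """Return the Actions CIDRs with adjacent/overlapping ranges merged."""
    nets = [ipaddress.ip_network(cidr, strict=False)
            for cidr in meta.get("actions", [])]
    # collapse_addresses() rejects mixed IP versions, so split them first.
    v4 = ipaddress.collapse_addresses(n for n in nets if n.version == 4)
    v6 = ipaddress.collapse_addresses(n for n in nets if n.version == 6)
    return list(v4) + list(v6)
```

`actions_networks(fetch_meta())` would give the list to feed into the firewall. The ranges change over time, so this would have to run periodically, and they are shared by every GitHub Actions user, so it narrows exposure rather than authenticating us.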

freedomofpress / securedrop-workstation-ci

Switch to GitHub Actions #41