Verify downloaded payloads

jinyus commented 10 months ago

Description

The tool grabs the tarball from the project's domain, which leaves an opening for supply chain attacks should a bad actor gain control of the domain; this is not impossible.

https://github.com/inko-lang/ivm/blob/7cf2e1c73e4419ecf71d86588f6f515acdf0e02b/src/command/install.rs#L73

The Arch Linux PKGBUILD uses sha256sums and GitHub as the source, which is harder to take over (as it can't expire) and is therefore considerably safer than using ivm.

source=("https://github.com/inko-lang/ivm/archive/refs/tags/v${pkgver}.tar.gz")
sha256sums=('98d325646dd0e2a4425a32e8666edd82e9da0b7f1375401676c3946a2bf5ea10')

The SHA-256 hash could be added to releases as a text file to make this easier.
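As a rough illustration of the check being proposed, here is a minimal Python sketch (ivm itself is written in Rust, so this is not its code). It assumes the checksum text file uses the `sha256sum` line format; the function names `sha256_of_file`, `parse_checksums` and `verify` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text: str) -> dict[str, str]:
    """Parse `sha256sum`-style lines ("<hex digest>  <file name>")."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum prefixes the name with "*" in binary mode.
            checksums[name.strip().lstrip("*")] = digest.lower()
    return checksums


def verify(path: Path, checksums: dict[str, str]) -> bool:
    """True only if the file is listed and its digest matches."""
    expected = checksums.get(path.name)
    if expected is None:
        # Assumption: an unlisted file is treated as a failure.
        return False
    return hmac.compare_digest(sha256_of_file(path), expected)
```

An installer following this sketch would download the tarball, call `verify` on it, and refuse to unpack the archive when the result is `False`.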

Related work

No response

yorickpeterse commented 10 months ago

@jinyus You're right, validation is something we really need to add. This does get tricky though: if we simply dump a text file on releases.inko-lang.org containing checksums, then the file is basically useless in case somebody hijacks the domain, as at that point they can supply a different file.

We can put the file some place else (e.g. in a GitHub repository), but that doesn't strictly solve the problem, it just moves it from one domain name to another. We can't encode it directly in ivm packages either, because then past versions of ivm can't install newer versions of Inko, until a new version of ivm is released containing the new checksum.
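To make the two limitations above concrete, here is a small Python sketch using simulated data rather than real downloads. The release contents, file names and function names are all hypothetical. It shows that a checksum served from the same origin still matches after an attacker replaces both files, and that a checksum table compiled into the client cannot vouch for a version it has never seen.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_same_origin(origin: dict[str, bytes], name: str) -> bool:
    """Check a payload against a checksum served by the same origin."""
    payload = origin[name]
    published = origin[name + ".sha256"].decode()
    return digest(payload) == published


def verify_pinned(pinned: dict[str, str], version: str, payload: bytes) -> bool:
    """Check a payload against a checksum compiled into the client."""
    if version not in pinned:
        raise LookupError(f"no pinned checksum for version {version}")
    return digest(payload) == pinned[version]


# Hypothetical release data, for illustration only.
genuine = b"genuine 0.1.0 tarball"
origin = {
    "0.1.0.tar.gz": genuine,
    "0.1.0.tar.gz.sha256": digest(genuine).encode(),
}

# An attacker who controls the origin replaces both files at once.
malicious = b"malicious tarball"
hijacked = {
    "0.1.0.tar.gz": malicious,
    "0.1.0.tar.gz.sha256": digest(malicious).encode(),
}

# Checksums shipped inside the client at build time.
pinned = {"0.1.0": digest(genuine)}
```

Here `verify_same_origin(hijacked, "0.1.0.tar.gz")` still succeeds, since the attacker's checksum matches the attacker's payload. `verify_pinned` rejects the malicious payload for `0.1.0`, but raises `LookupError` for any version released after the client was built.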

There are techniques like TUF, but I never quite figured out how you'd actually use it and what the implications are. I certainly don't want to be maintaining a bunch of signing keys and what not.

jinyus commented 10 months ago

> @jinyus You're right, validation is something we really need to add. This does get tricky though: if we simply dump a text file on releases.inko-lang.org containing checksums, then the file is basically useless in case somebody hijacks the domain, as at that point they can supply a different file.

I meant the releases page in the main repo: https://github.com/inko-lang/inko/releases. This is what most bigger projects do.

Idea: a GitHub workflow that publishes a new release with the tarball and a text file of checksums whenever a new tag is added. The workflow could also compile the project and add the Linux binaries too. Should be simple enough... I could work on it if you don't have any objections.

Edit: looking at the existing workflows, it could be added here: https://github.com/inko-lang/inko/blob/main/.github/workflows/release.yml

yorickpeterse commented 10 months ago

@jinyus I don't want to host releases on GitHub, as I'd like our tooling to remain independent of GitHub (or any forge for that matter). This is also why e.g. our package manager supports any Git repository rather than just GitHub repositories.

Ignoring that, hosting the data elsewhere doesn't solve the core problem, it just shifts the responsibility from one party (e.g. me) to another (GitHub in this case). The fact that even Google forgets to renew their domain name suggests we can't blindly trust an organization just because it's large.

The way I see it, there are a few potential problems:

Domain name hijacking, followed by the domain redirecting to a malicious copy of e.g. releases.inko-lang.org.
An MITM attack that changes the package contents, without this being noticed by ivm.
Somebody gaining direct access to the S3 bucket and fiddling with its contents.
The data is simply corrupted as it's downloaded. A checksum helps with this, but I think it's also quite rare to begin with.

Domain hijacking isn't something you can truly prevent, other than by setting domain names to auto-renew and making sure the renewal actually happens (e.g. by having enough money). In this case the domains are set up to auto-renew, payments happen through credit card, and Amazon is quite noisy about reminders. As such, I think this case is the least important to worry about, given it's a very rare and unlikely occurrence.

An MITM attack is less likely, as the bucket uses TLS and we don't ignore invalid TLS certificates. Thus, one would have to somehow inject their own certificate into your local certificate chain, at which point you have a lot more to worry about (e.g. your bank). As such, I'm not sure what more can be done here.
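For readers unfamiliar with what "not ignoring invalid TLS certificates" means in practice, here is a hedged Python sketch. It is not ivm's Rust code; the `download` and `strict_tls_context` helpers are hypothetical. It relies on the fact that `ssl.create_default_context()` requires a valid certificate chain and a matching hostname by default.

```python
import ssl
import urllib.request


def strict_tls_context() -> ssl.SSLContext:
    """Default context: validates the certificate chain and the hostname."""
    return ssl.create_default_context()


def download(url: str, timeout: float = 30.0) -> bytes:
    """Fetch a URL over HTTPS; an invalid certificate raises an error."""
    if not url.startswith("https://"):
        # Assumption: plain HTTP is never acceptable for release payloads.
        raise ValueError("refusing to download over a non-TLS URL")
    context = strict_tls_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()
```

With this setup an attacker on the network path needs a certificate that the local trust store accepts, which matches the point above: at that stage the machine is already compromised in more serious ways.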

An attacker gaining direct access to the S3 bucket is more serious/likely, but I'm also not sure how to protect against that once it happens. For example, if we store signatures in the bucket, an attacker can just change them accordingly. We can ship the signatures out of band, but that only works until that source is also breached (i.e. a different bucket wouldn't really be better).

What is possible is that somebody gains access to the GitHub repository, which contains an AWS key pair used to upload data to the releases bucket. Said person could then trigger a CI job and use that to push malicious data. This is probably the most likely attack vector, and it's also why I'm very hesitant to give anybody else push access to the repository. GitHub could improve upon this by scoping secrets to specific branches, which you can then build upon by not allowing anybody but maintainers to push to said branch. Unfortunately, this isn't supported.

The only workaround that I can think of is to have a separate repository for just pushing data to the releases bucket, then trigger a pipeline in it from the main repository when relevant. This however is also not quite an option, because linking the two together (so you can see progress) isn't possible unless you use a callable workflow, at which point the secrets need to be passed in from the calling repository. It's also just really annoying to work with.

In short, I think overall we're doing pretty well so far. GitHub expanding Actions a bit instead of focusing on the latest ML fad would help, but I suspect this won't happen any time soon. Until then, all I can think of are equally annoying not-quite-solutions :disappointed:

jinyus commented 10 months ago

> I don't want to host releases on GitHub, as I'd like our tooling to remain independent of GitHub (or any forge for that matter). This is also why e.g. our package manager supports any Git repository

Fair enough.

inko-lang / ivm

Verify downloaded payloads #1
