kr / hk

Fast Heroku client
https://hk.heroku.com/
77 stars 6 forks source link

hkdist should have no S3 creds #19

Closed kr closed 11 years ago

kr commented 11 years ago

From @tmaher:

Per hallway conversation, https://hk.heroku.com can dispense just regular SHA-256 hashes of the binaries. SSL cert validation of the hostname, on client side, is what gives you MitM protection.

Upon reflection, I would do one extra thing, which you might be doing already. In the event someone's able to pw0n the hk.heroku.com site, the attacker should not be able to coerce clients into downloading https://herbal-viagra.com/hk.exe . For something like local client code, it's serious enough that we should require two things to fail for this to go bad. The first approach that pops to mind is this:

Binaries should actually live in S3, and the client should talk to S3 directly. There should be some well-defined naming scheme for the binaries, e.g., https://hk-updates.s3.aws.com/hk-2013-01-09-rev1.exe , and hk.heroku.com should be constrained to only notifying the client that there's a new version (date: 2013-01-09, rev: 1, sha256: 0xbadc0ffee0ddfood...) Client should validate format of the data fields it gets back from hk. IAM keys that can modify the bucket are not accessible to hk.heroku.com.

If you've got alternatives that satisfy the "pw0ning hk.heroku.com isn't enough to pw0n clients" constraint, I am all ears.

kr commented 11 years ago

That makes sense.

Currently, hk.heroku.com issues http redirects to the s3 bucket, and the file name is its sha1 hash. I'll switch to sha-256 and make all the other changes you suggest here.

kr commented 11 years ago

Here's my plan:

Pwning either hkdist or s3 alone will be insufficient to inject arbitrary code into end users. But pwning either the release cutting box (e.g. my laptop or that random ec2 windows box) or hkgen alone would be sufficient to pwn users.

Why does hkgen exist? To generate diffs between releases. hk itself doesn't download the entire next release, just a diff between the current one and the next one. This gives typically a 90% file size savings. For hkgen to do its job it has to be able to create blobs and put them in s3 as well as update the database of valid sha256 hashes used by hkdist. Now that I think about it, I believe this spot for vulnerability can be mitigated if we have hkdist mediate all access to its db and, when the client asks it for a manifest, have it report sha256 hashes of the actual release binary rather than hashes of the diffs. The diffs are entirely derived information. If that's not good enough, we could have a separate db just for storing pointers to diffs, and hkgen would never have to talk to hkdist or the sensitive db at all. The diff db would contain optimization hints and never be trusted. Would that work? Is it even necessary?

That also leaves the build tool as a potential weak spot. Especially because eventually I'd like to automate that as much as possible so that cutting a release would involve basically just pushing a gpg-signed tag to github. But that automation will have to happen later. For now, it's still a command for me (or whoever) to run manually. I don't know if that makes it more or less secure than the automated version. I don't trust myself that much!

kr commented 11 years ago

Please disregard all that gnashing of teeth over hkgen. I was making this entirely too difficult. Hkgen needs no write access to any db, only to s3. We can even easily make it write to a completely untrusted s3 bucket.