Add `clean` function to `Repository`

heartsucker commented 7 years ago

Should ideally take in a reference to a Tuf struct and use it to identify which metadata and targets are trusted. Anything that isn't trusted gets purged. Perhaps with a signature like this:

fn clean(&mut self, tuf: &Tuf) -> Result<()>
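A minimal sketch of what such a `clean` could look like. The `Tuf` and `Repository` types here are simplified stand-ins invented for illustration (a set of trusted paths and an in-memory file map); they are not the real rust-tuf types, and `String` stands in for the crate's error type.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for the trust state (hypothetical, NOT rust-tuf's `Tuf`):
/// just the set of paths currently considered trusted.
pub struct Tuf {
    pub trusted: HashSet<String>,
}

/// Stand-in for a local repository (hypothetical): path -> file contents.
pub struct Repository {
    pub files: HashMap<String, Vec<u8>>,
}

impl Repository {
    /// Purge every stored file that the trust state does not vouch for.
    pub fn clean(&mut self, tuf: &Tuf) -> Result<(), String> {
        self.files.retain(|path, _| tuf.trusted.contains(path));
        Ok(())
    }
}
```

In a real implementation the hard part is computing the trusted set, which is what the rest of this thread is about.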

heartsucker commented 7 years ago

Thought: this would be difficult / annoying with the consistent snapshot feature as all possible valid combinations would have to be tried.

trishankkarthik commented 7 years ago

A simple way to garbage collect unreferenced metadata with consistent snapshots is to maintain a separate key-value store of (metadata files) to (the number of times they are referenced in any consistent snapshot). Whenever you remove a consistent snapshot, you decrement the count for all[1] metadata files in that snapshot. If the count is 0, you may safely delete the file.

A similar scheme can be used to keep track of targets / packages / images themselves, except now you will probably have to walk over all targets metadata files in a snapshot to find all the targets in it.

One important exception is that you never delete root metadata files, because old clients may want to update from an old root of trust to the newest one, and they would need the intermediate root metadata files for that.

Having said this, you may want to keep a permanent archive elsewhere of all metadata and targets / packages / images ever produced, for auditing purposes.

Cc: @JustinCappos @vladimir-v-diaz

[1] Easy to find the metadata files in that snapshot: besides the timestamp and snapshot metadata files, all targets metadata files are listed in the snapshot metadata.
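The reference-counting scheme above can be sketched roughly as follows. This is only an illustration under stated assumptions: `RefCounts` and `is_root` are hypothetical names, the store lives in memory rather than in a persistent key-value store, and root metadata is recognized purely by a file name ending in `root.json`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical store: metadata file name -> number of consistent
/// snapshots that reference it. Not part of rust-tuf.
#[derive(Default)]
pub struct RefCounts {
    counts: HashMap<String, usize>,
}

impl RefCounts {
    /// Record that a consistent snapshot references each file in `files`.
    pub fn add_snapshot(&mut self, files: &HashSet<String>) {
        for f in files {
            *self.counts.entry(f.clone()).or_insert(0) += 1;
        }
    }

    /// Remove a consistent snapshot. Returns the (sorted) files whose
    /// count dropped to zero and may be deleted. Root metadata is never
    /// returned, so intermediate roots stay available to old clients.
    pub fn remove_snapshot(&mut self, files: &HashSet<String>) -> Vec<String> {
        let mut deletable = Vec::new();
        for f in files {
            let now_zero = match self.counts.get_mut(f) {
                Some(n) => {
                    *n = n.saturating_sub(1);
                    *n == 0
                }
                None => false,
            };
            if now_zero {
                self.counts.remove(f);
                if !is_root(f) {
                    deletable.push(f.clone());
                }
            }
        }
        deletable.sort();
        deletable
    }
}

/// Assumed naming convention: root metadata file names end in `root.json`.
fn is_root(name: &str) -> bool {
    name.ends_with("root.json")
}
```

The same structure would work for targets, with the file set built by walking every targets metadata file in the snapshot.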

heartsucker commented 7 years ago

@trishankkarthik Yes, I knew to keep the roots around, but with the usage of Tuf::from_root_pinned, even deleting all the roots is "safe" because the original trusted keys are hard coded into the binary. This means that a fully correct chain could be built from zero local metadata.

As for doing ref-counting/GC on the rest of it, that's a pretty slick approach. I'll keep that in mind whenever I get around to this.

I'm not planning on keeping the old metadata around because I see it as clutter and I don't have a reasonable approach to auditing it yet. Should someone want that, I'd ask them to open a ticket.

trishankkarthik commented 7 years ago

> with the usage of Tuf::from_root_pinned, even deleting all the roots is "safe" because the original trusted keys are hard coded into the binary. This means that a fully correct chain could be built from zero local metadata.

Sorry, I don't understand this. Does this mean that the root keys will never be changed? What happens if a car has v1.root.json, goes offline for 10 years, and now there's v10.root.json? Can the car go from v1 to v10 without the intermediate root metadata files on the repository?

> I'm not planning on keeping the old metadata around because I see it as clutter and I don't have a reasonable approach to auditing it yet. Should someone want that, I'd ask them to open a ticket.

Yes, I should have clarified that this is a TUF deployment consideration, rather than implementation specification. This is definitely beyond the scope of rust-tuf!

heartsucker commented 7 years ago

> Does this mean that the root keys will never be changed?

I mean that, assuming all x.root.json are available on some mirror somewhere, it would in theory be safe to delete old metadata locally. Root keys are pinned, so fetching 1.root.json from untrusted sources is safe because it can still be verified. I am planning on keeping the old roots around for convenience, and to avoid bricking a client if no copy of some x.root.json in the chain exists anywhere anymore.
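To illustrate why every intermediate x.root.json has to exist somewhere, here is a rough sketch of the chain walk. It is a simplification, not rust-tuf code: `Root`, `signed_by`, and `latest_reachable` are hypothetical names, signature checking is reduced to a precomputed set of key IDs whose signatures are assumed to have verified, and the starting root is assumed to have already been checked against the pinned keys.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for one version of root metadata (hypothetical).
pub struct Root {
    /// Key IDs this root authorizes for the root role.
    pub key_ids: HashSet<String>,
    /// Number of those keys that must sign.
    pub threshold: usize,
    /// Key IDs whose signatures on this file verified (assumed precomputed).
    pub valid_sigs: HashSet<String>,
}

/// True if `candidate` carries enough valid signatures from the keys
/// that `authority` authorizes.
fn signed_by(candidate: &Root, authority: &Root) -> bool {
    candidate.valid_sigs.intersection(&authority.key_ids).count() >= authority.threshold
}

/// Hop from `start` to the newest root reachable one version at a time.
/// Each new root must be signed by the previous root's keys and its own.
pub fn latest_reachable(start: u32, available: &HashMap<u32, Root>) -> Result<u32, String> {
    let mut version = start;
    let mut current = available
        .get(&start)
        .ok_or_else(|| format!("{}.root.json is missing", start))?;
    while let Some(next) = available.get(&(version + 1)) {
        if !signed_by(next, current) || !signed_by(next, next) {
            return Err(format!("{}.root.json failed verification", version + 1));
        }
        version += 1;
        current = next;
    }
    Ok(version)
}
```

If any x.root.json is missing, the walk stops at the version just before the gap, even when newer roots exist.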

trishankkarthik commented 7 years ago

Got it, thanks for the clarification. Could you elaborate what you mean by "root keys are pinned"? Is it using the same mechanism from TAP 5?

We haven't discussed this use case in TAP 5 yet, but we have discussed it in a paper we wrote called Trident. It explains how, for example, ATS can use a specially-crafted root metadata file for an ECU that prevents the remote repository from replacing the root keys automatically. We'd be happy to share the paper with you.

Is this what you have in mind? If so, you should consider what happens when it is necessary to replace the root keys. It won't be automatic, and will necessitate a manual recall.

Cc: @JustinCappos

heartsucker commented 7 years ago

> Could you elaborate what you mean by "root keys are pinned"?

I don't think so. Let me give a fuller example.

Acme Co. wants to use TUF, so they generate 3 keys using this lib. These keys have key IDs calculated using the function calc_key_id. The IDs are 12ab, 34cd, and 56ef. These key IDs are hard coded into the binary as a static value. At runtime, the client fetches (locally or remotely, it doesn't matter) 1.root.json. The client verifies that this is trusted using only keys whose IDs, as calculated by calc_key_id, match those in the static value.

Thus, the trusted keys are "pinned" and a client could receive a spoofed 1.root.json, but without access to the private keys, the client would not trust it.

This pinning is only used for 1.root.json, and the usual chaining described in the spec is used to hop from x.root.json to (x+1).root.json.
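Roughly, the pinning check on 1.root.json could look like the sketch below. Everything except the three example key IDs is an assumption made for illustration: `Signature`, `trust_initial_root`, and the `threshold` parameter are hypothetical, and both the key-ID function and the signature verifier are passed in as closures because no real crypto is implemented here.

```rust
use std::collections::HashSet;

/// Key IDs compiled into the binary (the values from the example above).
pub const PINNED_KEY_IDS: [&str; 3] = ["12ab", "34cd", "56ef"];

/// A signature on 1.root.json plus the public key claimed to have made it
/// (hypothetical type, not rust-tuf's).
pub struct Signature {
    pub public_key: Vec<u8>,
    pub value: Vec<u8>,
}

/// Trust 1.root.json only if at least `threshold` distinct pinned keys
/// produced a valid signature. The key ID is always recomputed from the
/// public key bytes, never taken from the (untrusted) metadata.
pub fn trust_initial_root(
    sigs: &[Signature],
    threshold: usize,
    calc_key_id: impl Fn(&[u8]) -> String,
    verify: impl Fn(&Signature) -> bool,
) -> bool {
    let mut seen = HashSet::new();
    for sig in sigs {
        let id = calc_key_id(&sig.public_key);
        let pinned = PINNED_KEY_IDS.iter().any(|p| *p == id.as_str());
        if pinned && verify(sig) {
            seen.insert(id);
        }
    }
    seen.len() >= threshold
}
```

A spoofed 1.root.json signed with an attacker's key fails the check because the attacker's key ID is not in the static list, no matter what the file itself claims.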

And yeah, send the paper to one of my email addresses. I'll take a peek.

trishankkarthik commented 7 years ago

I see, thanks! This is an interesting way to bootstrap the root metadata file. I am assuming this is because it's, for some reason, hard to include a good copy of 1.root.json on the ECU in the first place?

heartsucker commented 7 years ago

This lib is general purpose and doesn't make assumptions about how someone deploys the software. It might be that someone wants to ship the smallest possible binary, in which case they would want to pin only the root keys.

This also defends against an attack where the attacker is able to modify the writable part of the image / partition but not the binary. This means an attacker could replace any/all of the metadata, but the client would reject it on every start because it wouldn't match the pinned keys or wouldn't be able to be verified by reconstructing the chain.

Here's an example of how to bootstrap a TUF client using this lib: https://github.com/heartsucker/rust-tuf/blob/ddf4d5dbd560f75727de6baf02fefa8cafa9e44a/src/lib.rs#L12-L61

theupdateframework / rust-tuf

Add `clean` function to `Repository` #85