notaryproject / notary

“Detached signature”-like export and validation #422

Open mtrmac opened 8 years ago

mtrmac commented 8 years ago

In the TUF model, and with the way Notary keeps only the latest versions of metadata within ~/.notary, if the Notary server goes away (due to a hardware failure or intentionally), it is impossible to later directly prove that an image has been signed. There may be an indirect proof (by knowing or assuming that the client was configured to only use properly signed images), but keeping the original signed files around is still useful for forensic purposes (e.g. ruling out a client configuration change that disabled signature verification, or figuring out which specific key was used to sign the image, if the role allows using more than one).

In particular, if Docker is verifying signatures when pulling images, it seems prudent for Docker to archive the signature metadata along with the imported image.

For Notary, this would mean adding a facility to export exactly the files used for verification, probably as a single-file archive of {root,targets,snapshot,timestamp}.json, and a way to validate such an archive or verify a file against such an archive. (Such a complete and self-contained archive can be verified with only knowledge of the trusted root key, i.e. just like GPG detached signatures can be verified.)
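
As a concrete (and purely illustrative) sketch, assuming the four metadata files are already in a local cache directory, the export could be little more than bundling them into a tar file; the cache path and layout below are hypothetical, not an existing Notary interface:

```go
package main

import (
	"archive/tar"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// exportProof bundles the exact metadata files used for a successful
// verification into a single tar archive. The cacheDir layout and the
// file list are illustrative; Notary's real cache may differ.
func exportProof(cacheDir, outPath string) error {
	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	tw := tar.NewWriter(out)
	defer tw.Close()

	for _, name := range []string{"root.json", "targets.json", "snapshot.json", "timestamp.json"} {
		f, err := os.Open(filepath.Join(cacheDir, name))
		if err != nil {
			return err
		}
		info, err := f.Stat()
		if err != nil {
			f.Close()
			return err
		}
		hdr := &tar.Header{Name: name, Mode: 0644, Size: info.Size()}
		if err := tw.WriteHeader(hdr); err != nil {
			f.Close()
			return err
		}
		if _, err := io.Copy(tw, f); err != nil {
			f.Close()
			return err
		}
		f.Close()
	}
	return nil
}

func main() {
	// Hypothetical cache location, for illustration only.
	if err := exportProof(os.ExpandEnv("$HOME/.notary/tuf/example/metadata"), "proof.tar"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```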

mtrmac commented 8 years ago

https://github.com/docker/docker/issues/19130 is the corresponding Docker issue.

endophage commented 8 years ago

The use case doesn't seem specific to Docker, and it makes sense to have a general way to archive data downloaded by notary. I think it makes sense to tackle this solely here, in the notary repository.

mtrmac commented 8 years ago

The notary client library does not know how the validated data is used within the calling application (where the application stores the validated data, or how it does logging, if the application does anything like that at all; consider the notary verify stdin→stdout filter), so it cannot archive the signature in a way which makes sense for the application.

endophage commented 8 years ago

I believe your thinking is backwards. Notary has no idea how the calling application uses that validated data and that's irrelevant. We're not trying to prove anything about its use. We're trying to prove that at some point in the past it was signed in to the TUF repository as valid. The calling application has no idea how notary does the validation, so exporting the repository files at a point in time is not a useful feature.

Instead I'm suggesting that notary is responsible for archiving old versions of the repository that it downloads (probably with a configuration option to disable archiving for those that don't need it). To then enable a client to determine whether a known piece of data was valid according to some past version of the TUF repository, two mechanisms have so far been discussed (a sketch of the second follows the list):

  1. On retrieving a target, the version of the timestamp.json is also provided. This is sufficient information to retrieve the correct repository and prove presence of a known checksum.
  2. Providing the checksum and notary searches all archived copies of the repository, returning the versions (maybe just a version) in which the checksum appeared.
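
For illustration, a rough Go sketch of the second mechanism; the archive layout (one subdirectory per archived timestamp version, each containing a targets.json) and the base64 encoding of the checksum are assumptions made for the sketch, not Notary's actual storage format:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// targetsFile mirrors just enough of a TUF targets.json to look up hashes.
type targetsFile struct {
	Signed struct {
		Targets map[string]struct {
			Hashes map[string]string `json:"hashes"`
		} `json:"targets"`
	} `json:"signed"`
}

// findChecksum walks archived repository versions and reports which ones
// contain a target with the given sha256 hash. The hash must use the
// same encoding as the stored metadata (assumed base64 here).
func findChecksum(archiveDir, sha256 string) ([]string, error) {
	var versions []string
	entries, err := os.ReadDir(archiveDir)
	if err != nil {
		return nil, err
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		raw, err := os.ReadFile(filepath.Join(archiveDir, e.Name(), "targets.json"))
		if err != nil {
			continue // version without targets metadata; skip
		}
		var tf targetsFile
		if err := json.Unmarshal(raw, &tf); err != nil {
			continue
		}
		for _, tgt := range tf.Signed.Targets {
			if tgt.Hashes["sha256"] == sha256 {
				versions = append(versions, e.Name())
				break
			}
		}
	}
	return versions, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: findchecksum <base64-sha256>")
		os.Exit(1)
	}
	versions, err := findChecksum("archive", os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("checksum appears in versions:", versions)
}
```
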
mtrmac commented 8 years ago

We're trying to prove that at some point in the past it was signed in to the TUF repository as valid.

Yes (or to help figure out why it was signed when it wasn’t supposed to be, etc.)

The calling application has no idea how notary does the validation, so exporting the repository files at a point in time is not a useful feature.

The application does not need to understand the implementation; that’s what Notary’s Go API is for, and the application can happily treat it as a black box, much as most people treat OpenPGP signatures as a black box.

I think exporting specific files as a specific proof of a specific fact is actually quite useful. Most importantly because it is closer to how humans think about proof (an exported signature is “courtroom exhibit A”; a notary archive is “that computer where you must never touch anything and if anything breaks we don’t know what we lost”).

Secondarily,

Instead I'm suggesting that notary is responsible for archiving old versions of the repository that it downloads

if we are storing the data at all (as we must), the argument is basically about the location and structure of that storage, and I think the applications are better equipped to do this in a way which makes sense for the user (see https://github.com/docker/docker/issues/19130#issuecomment-169665998 ).

To then enable a client to determine if a known piece of data was valid…

[the application needs to keep around]

the version of the timestamp.json [or] the checksum

We cannot automatically assume that an application will log or keep this data around. (The above-mentioned notary verify stores neither. At the moment, from memory, I am not certain that Docker will keep the SHA256-based digests around after migrating to some future hash.)

So, as long as the application needs to explicitly store something and care about preserving it unmodified, why not have the application store the complete proof instead of storing just a reference and hoping that an archive somewhere else has been preserved (e.g. that Docker backups include the Notary archive, presumably from a completely different location)?

endophage commented 8 years ago

Let's set aside what the current commands do: we're talking about adding a new feature, which will necessarily involve changes, so current behaviour is moot (within the bounds of avoiding breaking changes).

Maybe we are just haggling over location of the archive. If what you're describing is that the application gets an opaque blob that it can present back to notary at a later point to prove validation of a piece of data, then we can haggle over implementation details of what that blob looks like later (full TUF repo, timestamp version, checksum, something else).
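
To pin down what that could look like, here is a purely hypothetical Go interface; the names (Proof, Verifier, CheckProof) are invented for illustration and are not part of Notary's API:

```go
// Package proof sketches the "opaque blob" idea under discussion.
// All names here are hypothetical, invented for illustration.
package proof

// Proof is an opaque blob the application stores alongside its data.
// Internally it might be a full TUF repo snapshot, a timestamp version,
// a checksum, or something else entirely; the application never looks
// inside.
type Proof []byte

// Verifier is the shape of what notary could expose to applications.
type Verifier interface {
	// Verify checks data against the current repository state and
	// returns a proof the application can store for later.
	Verify(gun, targetName string, data []byte) (Proof, error)

	// CheckProof re-validates data against a previously stored proof,
	// needing only trusted root keys rather than the original server.
	CheckProof(p Proof, data []byte) error
}
```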

Note that if the application cannot regenerate the checksum, this whole discussion is moot, as the checksum is required to prove that the data you currently have was in the repository. Therefore it's a fair assumption that the application has, or can generate, the checksum (really, anything less than generation is no better than the implicit proof).
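
Since the whole scheme hinges on the application being able to regenerate the checksum, that step is trivial with the standard library alone; a minimal sketch (the hex output encoding is a choice made here, not something specified in this thread):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// Regenerate the sha256 checksum of the data the application currently
// holds; this is the value that must match the hash recorded in the
// archived targets metadata.
func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: checksum <file>")
		os.Exit(1)
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(hex.EncodeToString(h.Sum(nil)))
}
```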

I understood your initial suggestion to be that this data would somehow be verifiable without notary (library or CLI). I do not think that is a reasonable case. I think the title here is particularly misleading and it might make sense to update it to something more like "exportable validity proofs and verification". The idea of "detached signatures" is not workable in this context.

mtrmac commented 8 years ago

Maybe we are just haggling over location of the archive.

Mostly, yes (and vaguely whether it is full or partial or centralized or app-collocated).

If what you're describing is that the application gets an opaque blob that it can present back to notary at a later point to prove validation of a piece of data

Yes, that’s what I think this should look like to the application.

To be precise,

Note that if the application cannot regenerate the checksum, this whole discussion is moot, as the checksum is required to prove that the data you currently have was in the repository.

Right, good point.

I understood your initial suggestion to be that this data would somehow be verifiable without notary

Definitely without the original notary server; ideally without any other state (though the set of “definitely correct” root keys will have to come from somewhere outside).

The implementation would usually come from the Notary project; having the signature verifiable with other tools (by reimplementing TUF) should be possible in extreme circumstances, but it is not the usual way to do things.

As for the issue title, feel free to change it as you like; right now it assumes a specific implementation approach (which we are still arguing about) rather than describing the general feature need.

Though

The idea of "detached signatures" is not workable in this context.

I don’t understand this. AFAICS from the verification POV, an archive of TUF metadata which signs a specific target behaves just like a GPG detached signature (which is why I like the analogy: it is already familiar to everyone). The analogy even works fine for the signing side (given a TUF root key it would be perfectly possible to create a one-off set of TUF metadata signing a single target, though that would completely defeat the point of timestamping). But to be clear, creating one-off stand-alone TUF signature archives is an absolute non-goal for me; this is all about archiving forensic data about signatures served by a Notary server.
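
To make the analogy concrete, here is a rough Go sketch of the first step of verifying such an archive offline: checking root.json against a trusted root key, much as one checks a GPG detached signature against a trusted public key. It assumes ed25519 keys and that the metadata files are stored in canonical JSON (so the raw "signed" bytes are exactly what was signed); a real verifier would re-canonicalize and then continue down the chain through targets.json to the target's sha256:

```go
package main

import (
	"crypto/ed25519"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"os"
)

// signedEnvelope is the common shape of TUF metadata files: a "signed"
// payload plus signatures over it. The payload is kept as raw bytes on
// the assumption that the file is stored in canonical JSON, so those
// bytes are exactly what was signed; a real verifier re-canonicalizes.
type signedEnvelope struct {
	Signed     json.RawMessage `json:"signed"`
	Signatures []struct {
		KeyID string `json:"keyid"`
		Sig   []byte `json:"sig"` // base64 in the JSON, decoded by encoding/json
	} `json:"signatures"`
}

// verifyFile checks that the metadata at path carries a valid ed25519
// signature by the given trusted key, and returns the signed payload.
func verifyFile(path string, key ed25519.PublicKey) (json.RawMessage, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var env signedEnvelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, err
	}
	for _, s := range env.Signatures {
		if ed25519.Verify(key, env.Signed, s.Sig) {
			return env.Signed, nil
		}
	}
	return nil, fmt.Errorf("%s: no valid signature from the trusted key", path)
}

func main() {
	// The trusted root key comes from outside the archive (here a
	// hex-encoded ed25519 public key on the command line), just as a
	// GPG public key comes from outside a detached signature.
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: verify <hex-ed25519-public-key>")
		os.Exit(1)
	}
	keyBytes, err := hex.DecodeString(os.Args[1])
	if err != nil || len(keyBytes) != ed25519.PublicKeySize {
		fmt.Fprintln(os.Stderr, "expected a hex-encoded ed25519 public key")
		os.Exit(1)
	}

	// Step 1 of the chain: root.json against the trusted key. A full
	// verifier would continue: pull the targets key out of root.json,
	// verify targets.json with it, then compare the target's sha256.
	if _, err := verifyFile("root.json", ed25519.PublicKey(keyBytes)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("root.json verifies against the trusted key")
}
```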

endophage commented 8 years ago

OK, I think we're agreed so I'll just address the last bit about detached signatures. With your clarification, it makes sense: you're considering the entire TUF metadata collection to be a "detached signature". In the context of a TUF repo that itself is a set of signatures over data, it's a highly confusing term, and will become even more so if we ever add a GPG signer.

At a minimum, please update your initial post to clarify your definition of "detached signature" so anyone else reading it understands it up front.

mtrmac commented 8 years ago

Oops, the issue description indeed did not mention the connection at all. Now that I have tried to spell it out, I feel less confident that using this term makes understanding easier, though it still seems better than nothing.