sigstore / sget-rs

sget is a keyless safe script retrieval and execution tool
Apache License 2.0

Policy validation process for cosign policy | sget #44

Open lukehinds opened 2 years ago

lukehinds commented 2 years ago

[sget change] sget should be able to validate previous policies for consistent proof of previous signers.

[cosign change] Signatures will be moved out of policy.json and placed into a cosign annotation.

[cosign change] We add a previous-root reference (the digest of the prior policy blob) to the new policy.

lukehinds commented 2 years ago

(image attached)

lukehinds commented 2 years ago

@imjasonh @jyotsna-penumaka @asraa

imjasonh commented 2 years ago

So the general idea is that cosign policy init would write only the policy (maintainers, threshold) to an OCI registry, then cosign policy sign would effectively become cosign sign reg.io/policy@digest, which pushes reg.io/policy:sha256-digest.sig as normal. Policy signatures effectively become the same as cosign signatures, which is nice.
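A minimal sketch of the tag-naming convention described above, mapping a policy digest to the tag where its signatures live (the helper name is illustrative, not cosign's API):

```python
# Sketch of the "reg.io/policy@digest -> reg.io/policy:sha256-digest.sig"
# convention described above. signature_tag is a hypothetical helper name.
def signature_tag(repo: str, digest: str) -> str:
    """Map a blob/policy digest to the tag where its signatures are pushed."""
    algo, hex_digest = digest.split(":", 1)
    return f"{repo}:{algo}-{hex_digest}.sig"

print(signature_tag("reg.io/policy", "sha256:abc123"))
# reg.io/policy:sha256-abc123.sig
```

Because the tag embeds the digest, signatures over an old policy version remain addressable even after the policy changes.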

A benefit is that, when a policy changes, it immediately invalidates all the signatures attached to it, because they all point to the previous policy's digest.

When a policy is updated (cosign policy edit or something), the new policy should include a pointer to the previous policy's digest -- "previousRoot": "sha256:digest" -- which verifiers can look up if they want to ensure that the policy wasn't edited in some malicious way.

If the policy doesn't have a previousRoot defined, verifiers should still look up in Rekor to see if there had been any previous policies defined, since a malicious policy pusher might not have included it to cover their tracks.
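The previousRoot chain described above could be verified roughly like this. This is a hypothetical sketch (the dict structure and function name are invented for illustration); a real verifier would additionally consult Rekor, as noted, rather than trusting the chain alone:

```python
# Hypothetical walk of a policy's previousRoot chain. Assumes each fetched
# policy is a dict that may carry a "previousRoot" digest pointing at the
# prior version. A missing link is treated as a verification failure.
def walk_previous_roots(policies: dict, head_digest: str) -> list:
    """Return the chain of digests from the current policy back to the first root."""
    chain = []
    digest = head_digest
    while digest is not None:
        if digest not in policies:
            raise ValueError(f"missing policy for digest {digest}")
        chain.append(digest)
        digest = policies[digest].get("previousRoot")
    return chain

policies = {
    "sha256:v2": {"maintainers": ["X", "Y"], "previousRoot": "sha256:v1"},
    "sha256:v1": {"maintainers": ["X", "Y", "Z"]},
}
print(walk_previous_roots(policies, "sha256:v2"))
# ['sha256:v2', 'sha256:v1']
```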

lukehinds commented 2 years ago

A benefit is that, when a policy changes, it immediately invalidates all the signatures attached to it, because they all point to the previous policy's digest.

Would this mean we lose the previous sigs?

For example: Bob runs policy init and then signs the policy. Bill (a second maintainer) then signs the policy; would that remove Bob's sig?

Or are we thinking only the policy creator signs the policy?

imjasonh commented 2 years ago

Would this mean we lose the previous sigs?

Nope, when a policy changes, the previous sigs would still be present at reg.io/policy:<previous-policy-digest>.sig, and the previous policy content would be available at reg.io/policy@<previous-policy-digest> (unless the registry has GCed it).

When a second person signs the policy they get appended to reg.io/policy:<current-policy-digest>.sig, and don't replace existing signatures, just like with cosign sign today.

luhring commented 2 years ago

When a policy is updated (cosign policy edit or something), the new policy should include a pointer to the previous policy's digest -- "previousRoot": "sha256:digest" -- which verifiers can look up if they want to ensure that the policy wasn't edited in some malicious way.

If the policy doesn't have a previousRoot defined, verifiers should still look up in Rekor to see if there had been any previous policies defined, since a malicious policy pusher might not have included it to cover their tracks.

I think this question is rooted in my need to learn more about how these policies work 🙊 but I'll ask anyway... What is the advantage of having new policy versions point to the prior version, if consumers should check Rekor even when previousRoot is absent? And could a malicious actor go ahead and add the previousRoot field and value to avoid suspicion?

imjasonh commented 2 years ago

A totally reasonable question! (At least, to me, someone who may also need to learn more about how these policies work)

AIUI the previousRoot isn't intended to replace the need to look up the previous policy in Rekor; it's basically just another bit of information that's available at policy edit-time that can provide added assurance that something isn't Afoot. Verifying clients like sget should still keep looking up in Rekor and report/fail if anything is missing.

lukehinds commented 2 years ago

Another question, I can't recall what (or if) we decided on how the signature of the artifact should be handled?

I presume we will run `cosign sign-blob` and, as long as there are maintainer signatures greater than the threshold within the policy, we can then move to checking previousRoot instances?

So effectively there will only be one manifest (the root policy)?

lukehinds commented 2 years ago

I started to sketch out what would be in the cosign issue here [0], please all weigh in with views / amendments:

https://docs.google.com/document/d/1CavafvqJxWrlFf9FidF0LZ-YjKNUIs5LvTuqWQbbZPs/edit#

imjasonh commented 2 years ago

Yeah, I think we should walk through the overlap with cosign upload-blob and cosign sign-blob -- is the blob we're signing the script.sh? Or the policy for script.sh?

When the policy held the signatures, it was easier to tell, but now I think we can actually simplify the UX and get rid of cosign policy sign.

The flow could be:

  1. Maintainer writes script.sh, shares it (cosign upload-blob script.sh reg.io/project) --> emits script blob digest
  2. Maintainer inits a policy for script.sh (cosign policy init --blob script.sh reg.io/project --maintainers=X,Y,Z --threshold=2), which is attached to the script's manifest, like an SBOM*
  3. Maintainer X signs the blob -- not the policy (cosign sign-blob script.sh reg.io/project)
  4. sget reg.io/project script.sh finds the blob and policy, checks whether the script's attached signatures meet the policy; they don't, so sget fails.
  5. Maintainer Y signs the blob (cosign sign-blob script.sh reg.io/project)
  6. sget reg.io/project script.sh finds the blob and policy, now the signatures do satisfy the policy, download and run.

* Does this mean we allow multiple policies attached to a blob? 🤔

NB: In the absence of a policy (omitting Step 2 above) sget could check for any valid signature, and Step 4 would run the script.
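The verification step in 4 and 6 above could be sketched as a simple threshold check. This is illustrative only (signature validation itself is elided, and the names are hypothetical):

```python
# Illustrative threshold check for steps 4 and 6 above: count how many
# distinct policy maintainers have validly signed the blob and compare
# against the policy's threshold. Names here are invented for the sketch.
def meets_policy(signers: set, policy: dict) -> bool:
    """True when enough distinct policy maintainers have signed the blob."""
    valid = signers & set(policy["maintainers"])
    return len(valid) >= policy["threshold"]

policy = {"maintainers": ["X", "Y", "Z"], "threshold": 2}
print(meets_policy({"X"}, policy))       # False -- step 4: sget fails
print(meets_policy({"X", "Y"}, policy))  # True  -- step 6: download and run
```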

This starts to make the script blob flow more similar to the image sign/verify flow -- you sign the blob, not the policy -- and makes policies more easily applicable to images, so other verifying clients (e.g., cosigned) can start to align with sget. I'm not sure if this fundamentally undermines policies though, since they're only consulted during verification, and aren't signed anymore.

asraa commented 2 years ago

AIUI the previousRoot isn't intended to replace the need to look up the previous policy in Rekor; it's basically just another bit of information that's available at policy edit-time that can provide added assurance that something isn't Afoot. Verifying clients like sget should still keep looking up in Rekor and report/fail if anything is missing.

Just on the flip side, as I've thought of this -- if for whatever reason you need to totally re-write a policy (let's say, some tragic event where the maintainers all disappear), then you would need to overhaul the root policy. We probably want the UX to handle this like "Verified up to blah point"

I'm not sure if this fundamentally undermines policies though, since they're only consulted during verification, and aren't signed anymore.

This is the only thing I'm curious about digging in to. On one hand, you have some sort of trust in the people who have push access to the repository. They might have implicit "root". However, maintainers Y and Z never agreed to be part of this policy, i.e. no one else with push access explicitly allowed maintainer X to define the policy. Trust needs to start somewhere... but I don't know the answer beyond everyone signing the policy, and clients having an initial way of "trusting" the starting users.

Just sent a request to access the doc!

asraa commented 2 years ago

One other thing: we should probably make this policy format reusable -- say you generate it offline and want to upload it, or you are given it out of band and want to run cosign verify --policy X $IMG (say you're a customer who wants to run verification with the policy on a particular image).

@jedisct1 defined a rules format for signing wasm modules here, and we can re-use the top section, which is pretty similar to our list of keys and has the added benefit of supporting ands/one-ofs. Also, individual target scripts can have different requirements, rather than forcing all repository contents onto one policy.
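To make the and/one-of idea concrete, here is a hypothetical sketch of evaluating such combinators; the field names (`allOf`, `oneOf`, `key`) are invented for illustration and are not the actual wasm rules format:

```python
# Hypothetical evaluator for a rule format with "all of" / "one of"
# combinators, in the spirit of the rules format mentioned above.
# The structure and field names are invented for this sketch.
def rule_satisfied(rule: dict, signers: set) -> bool:
    """Recursively evaluate a rule tree against the set of valid signers."""
    if "allOf" in rule:
        return all(rule_satisfied(r, signers) for r in rule["allOf"])
    if "oneOf" in rule:
        return any(rule_satisfied(r, signers) for r in rule["oneOf"])
    return rule["key"] in signers  # leaf: a single required key

# "X must sign, plus at least one of Y or Z"
rule = {"allOf": [{"key": "X"}, {"oneOf": [{"key": "Y"}, {"key": "Z"}]}]}
print(rule_satisfied(rule, {"X", "Z"}))  # True
print(rule_satisfied(rule, {"Y", "Z"}))  # False -- X is required
```

This kind of tree is strictly more expressive than a flat maintainers-plus-threshold list, which is the benefit being pointed at.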

lukehinds commented 2 years ago

One thing we need to keep in mind is that a lot of users will already have the install script in their repository. It's going to feel arduous for them to have the script under change control in git and yet need to mess about with uploading blobs, etc.

I expect the dominant UX will be:

I am a user with my install script already inside the repository. I inform my users to run curl https://raw.githubusercontent.com/blah/install.sh | bash. I like this as it's very easy for me to maintain.

I would like to sign this and have my users download it safely and then execute

I might want to allow multiple users to sign my install script as my project has several maintainers (but it could be just one individual).

imjasonh commented 2 years ago

It's hard to argue with the existing UX's ease of use. I also wonder if requiring maintainers to copy it to some registry is just going to be too high a bar. rget (archived) lets users fetch scripts/artifacts from any URL, relying on the tlog.

Something like that would prevent a moving reference from working (e.g., https://raw.ghc.com/foo/bar/master/script.sh) but that's a pretty terrible practice anyway, users should fetch from a tag or commit SHA.

Crazy stream-of-consciousness ideas follow, I don't know if any of them are good:

A more rget-like sget could transparently fetch the URL and validate the contents against the tlog, then mirror it in an OCI registry and remember to go to that registry for future fetches. That solves the UX issue a bit, and transparently opts the user into the best practice of keeping a local copy in case GitHub is down (unless your registry is also down).

To define a maintainer-signoff policy in that mode, we could have some convention around a policy.json file alongside the artifact in the repository, which we'd also verify against the tlog.

We couldn't store signatures in the repository alongside the artifact+policy, or else the commit SHA would diverge from the tag, or you'd have to move the tag. This also prevents policy.json from being changed, but maybe we just say that if you want to change the policy you have to change the tag and make people fetch that instead? 🤔

(All these cases assume the GitHub raw content server behavior; these limitations wouldn't apply if you're just serving script+policy+signatures from https://myserver.com)