sigstore / root-signing-staging

Staging TUF repository for Sigstore trust root
https://tuf-repo-cdn.sigstage.dev/
Apache License 2.0

Deployment gating #54

Open jku opened 4 months ago

jku commented 4 months ago

Currently there is no (human) deployment gating: if the tests pass, a new repository is published whenever there are changes, regardless of whether those changes are only timestamp updates or something more meaningful.

Some questions to answer (for both staging and production):

haydentherapper commented 4 months ago

Assuming our tests are sufficient, and we test at least one newer Sigstore client and Cosign, no gating long-term seems reasonable.

In the short term, while we're gaining confidence in the process, I would gate for root updates only. Timestamp updates seem safe enough to deploy automatically.

Separate question, do we expect target file updates outside of root updates? Does tuf-on-ci allow for that? Previously, these would always happen at the same time, given that we had to gather root key holders. If we do expect that, maybe we also gate on target file updates, so that we can confirm that the target file works in whatever context it's used in.
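The gating policy suggested above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name `needs_manual_gate` and the `gate_targets` flag are hypothetical and are not part of tuf-on-ci or the Sigstore workflows; only the role names come from TUF itself.

```python
# Hypothetical sketch of the proposed policy, not an existing tuf-on-ci API.
def needs_manual_gate(changed_roles, gate_targets=False):
    """Return True if a deployment should wait for human approval.

    changed_roles: iterable of TUF role names whose metadata changed,
                   e.g. {"root", "timestamp"}.
    gate_targets:  assumption for illustration; also gate when the
                   targets role (and so possibly a target file) changed.
    """
    gated = {"root"}
    if gate_targets:
        gated.add("targets")
    # Gate only if at least one gated role is among the changed roles.
    return bool(gated & set(changed_roles))
```

With this sketch, a timestamp-only change deploys automatically, while any change that includes root waits for approval. Long-term, once the client tests are trusted, the predicate could simply always return False.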

jku commented 4 months ago

Separate question, do we expect target file updates outside of root updates? Does tuf-on-ci allow for that?

Yes, this is allowed. Metadata for multiple roles can be changed in one signing event, but that is not guaranteed in any way.

(A likely future feature request is an easy way to force role re-signing: currently you get separate signing events when a role is about to expire. It would make sense if maintainers could easily decide "let's re-sign targets now as well, since we're all signing root already".) https://github.com/theupdateframework/tuf-on-ci/issues/198

jku commented 4 months ago

I've got a POC running in https://github.com/jku/tuf-on-ci-sigstore-test:

jku commented 4 months ago

I will have to refactor the deployment to two separate pieces:

This did not work either: environments cannot be used with a "uses:" job.

There are now two working approaches:

jku commented 3 months ago

Documenting the state of things:

jku commented 2 months ago

Copying some content from the draft PR before I close it:

The flow currently looks like this:

graph TD;
    merge{"signing event<br/>(merges to 'main')"}-->online-sign;
    online-sign-period[online role in signing period]-->online-sign["online signing<br/>(merges to 'main' and 'publish')"];
    online-sign-->publish-pages[publish to Pages];
    publish-pages-->test-pages[test Pages with clients];
    test-pages--this is the critical point where we can add manual review-->publish-gcs[publish to GCS];
    publish-gcs-->test-gcs[test GCS with clients];

The takeaways on branches and deployments are

So the question is this: if we give maintainers the chance, will they actually do additional manual testing against the Pages-published repository before publishing to GCS?

Potential human review flow

graph TD;
    merge{"signing event<br/>(merges to 'main')"}-->online-sign;
    online-sign-period[online role in signing period]-->online-sign["online signing<br/>(merges to 'main' and 'publish')"];
    online-sign-->publish-pages[publish to Pages];
    publish-pages-->test-pages[test Pages with clients];
    test-pages--if only timestamp changes-->publish-gcs-light[publish timestamp to GCS];
    test-pages--if additional metadata or targets changes-->deployment{Deployment review<br/>or delay};
    deployment-->publish-gcs-full[publish full repository to GCS];
    publish-gcs-light-->test-gcs[test GCS with clients];
    publish-gcs-full-->test-gcs[test GCS with clients];
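
The branch after "test Pages with clients" in the flow above could be decided by a check along these lines. This is a rough sketch, not what the workflow actually does: the function name `choose_publish_step` and the example paths such as `metadata/timestamp.json` are assumptions made for illustration, and the return values simply reuse the node names from the diagram.

```python
from pathlib import PurePosixPath

# Hypothetical helper: decide which diagram node follows "test-pages".
def choose_publish_step(changed_paths):
    """Map the files that changed since the last publish to a flow node.

    changed_paths: repository-relative paths of changed files
                   (the path layout is an assumption for this example).
    Returns "publish-gcs-light" when only the timestamp changed,
    otherwise "deployment" (the review-or-delay node).
    """
    names = {PurePosixPath(p).name for p in changed_paths}
    if not names:
        raise ValueError("no changes to publish")
    # Only timestamp.json changed: safe to publish automatically.
    if names == {"timestamp.json"}:
        return "publish-gcs-light"
    # Any other metadata or target file changed: require review or delay.
    return "deployment"
```

For example, `choose_publish_step(["metadata/timestamp.json"])` selects the light path, while a change set that also contains a snapshot, targets, or root file selects the review path.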