sigstore / scaffolding

Stuff to make standing up sigstore (esp. for testing) easier for e2e/integration testing.
Apache License 2.0

Proposal: Make the TUF server more useful for production deployment #1182

Closed bkabrda closed 4 days ago

bkabrda commented 1 month ago

Description

Hi folks, as a follow-up to the discussion in https://github.com/sigstore/scaffolding/pull/1159, I'd like to propose some improvements to the TUF server. I'm curious to hear your thoughts; I'd definitely love to work on this myself to push things forward.

Desired state

I would like to make the TUF server usable in production environments. My aim is to support both k8s and non-k8s environments (as noted in https://github.com/sigstore/scaffolding/pull/1159). To achieve that, there are several things that I think should happen when the TUF server starts:

This would mean we:

Implementation details

The implementation would be very similar for both k8s and non-k8s environments. Roughly speaking:

CC @haydentherapper @jku

jku commented 1 month ago

I don't see anything unreasonable here. I will note that this sentence:

Have a way for the user to grab the generated TUF repository and continue operating it (add more keys, etc).

... may involve much more work than expected (or alternatively may provide a bad user experience).

For background on this, I'll list the high-level options I see, since sometimes they seem to be forgotten and only details are discussed (not saying this is the case here at all, just that this tends to happen):

  1. We can improve the "out-of-band trust root" use case (i.e. no TUF in play at all):
    • if clients have an easy way for e.g. system admins to install trust roots onto devices via device management, then their problem is solved
    • even with non-managed devices, if users are fine handling trust root installation manually (every once in a while), then they can use the same mechanism
  2. We can easily make a set of files that looks like a TUF repository (and operates like one from the client's perspective) but never expires and is really a "write-once system":
    • the disadvantage is that this only looks like a real TUF repository: any trust root change would require the user to re-initialize the client with a completely new TUF root
    • the advantage is that clients will function 100% as they would with a real repository
  3. Roughly the suggestion here: make a set of files that operates like a TUF repository with very long expiry times:
    • the difference from option 2 is that there is an "upgrade mechanism" to maintain that repository properly -- this would likely involve instructions to use a generic TUF metadata editing tool (like awslabs/tough tuftool)
    • the expiry times are expected to be very long even when maintained
  4. Make the "official" tuf-on-ci experience smoother so it's a viable option for more people (I say official, but it is only used in root-signing-staging so far)
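To make the expiry trade-off between these options concrete, here is a minimal Python sketch. It assumes only that TUF metadata carries an `expires` date-time string of the form `YYYY-MM-DDTHH:MM:SSZ`; the helper names and the lifetimes (7 days versus roughly 10 years) are hypothetical values picked for illustration, not anything scaffolding or tuf-on-ci actually uses.

```python
from datetime import datetime, timedelta, timezone

# Date-time layout used for the `expires` field in TUF metadata.
TUF_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def tuf_expires(now: datetime, lifetime: timedelta) -> str:
    """Return an `expires` string that lies `lifetime` after `now` (UTC)."""
    return (now + lifetime).astimezone(timezone.utc).strftime(TUF_TIME_FORMAT)


def is_expired(expires: str, now: datetime) -> bool:
    """Return True once `now` has reached the `expires` timestamp."""
    deadline = datetime.strptime(expires, TUF_TIME_FORMAT).replace(tzinfo=timezone.utc)
    return now >= deadline


# Hypothetical lifetimes: a regularly re-signed repository versus a
# "generate once, rarely touch" repository as in options 2 and 3.
created = datetime(2024, 1, 1, tzinfo=timezone.utc)
short_lived = tuf_expires(created, timedelta(days=7))
long_lived = tuf_expires(created, timedelta(days=3650))

# A month later, only the short-lived metadata needs re-signing.
one_month_later = created + timedelta(days=30)
print(short_lived, is_expired(short_lived, one_month_later))
print(long_lived, is_expired(long_lived, one_month_later))
```

The long lifetime removes the need for frequent re-signing, which is what makes options 2 and 3 low-maintenance. The cost is that clients will keep accepting that metadata for just as long.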

The reason I have not been advocating for solution 3 is that I have doubts about its reliability in practice:

I'm sure it's possible to use a tool like that safely (I assume AWS uses tuftool to maintain the bottlerocket TUF repo), but it may require more discipline and knowledge than we can assume.


I have one question to make the line between options 3 and 4 clearer, if you don't mind: I think the defining features of tuf-on-ci are

Could you describe your use case more in these terms? Are all of the choices above issues for you or just some of them?

bkabrda commented 1 month ago

Thanks a lot for providing the context!

In terms of the options you outlined, I think you're absolutely right that most people aren't TUF experts, and we definitely should aim to create an initial TUF repository that expires in the very distant future. However, I want to have the option for those people who do want to get familiar with TUF and who do want to manage the TUF repo themselves. I totally agree that many won't, but some may. So I want to go for something like option 3, but with a huge disclaimer that people have to understand what they're doing, properly back up their TUF repository before making any changes, etc.

In terms of tooling, you're absolutely right that tools like tough/tuftool allow you to break the TUF repository; for example, tuftool will allow you to overwrite an old role file. I plan to work with the maintainers of tuftool to set up guardrails and provide an experience that won't allow users to break their TUF repository (without using some sort of --force flag). I'm even thinking of doing a downstream redistribution of tuftool that would have Sigstore-specific built-in features to e.g. "add a new fulcio cert" with one command and do all the magic in the background.
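As a rough illustration of the kind of guardrail meant here, the Python sketch below refuses to overwrite an existing versioned role file unless the caller explicitly opts in. This is not tuftool's actual behavior or API: `write_role_file`, its `force` parameter, and `demo` are hypothetical names, and the `<version>.<role>.json` naming is assumed from TUF's consistent-snapshot convention.

```python
import tempfile
from pathlib import Path


def write_role_file(repo_dir: Path, role: str, version: int,
                    data: bytes, force: bool = False) -> Path:
    """Write `<version>.<role>.json`, refusing to clobber an existing file.

    Hypothetical helper: overwriting is only allowed when `force` is True.
    """
    target = repo_dir / f"{version}.{role}.json"
    if target.exists() and not force:
        raise FileExistsError(
            f"{target.name} already exists; pass force=True to overwrite"
        )
    target.write_bytes(data)
    return target


def demo() -> tuple[bool, bytes]:
    """Show the guardrail: a plain rewrite is refused, a forced one succeeds."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        write_role_file(repo, "root", 1, b'{"original": true}')
        try:
            write_role_file(repo, "root", 1, b'{"original": false}')
            refused = False
        except FileExistsError:
            refused = True
        forced = write_role_file(repo, "root", 1, b'{"forced": true}', force=True)
        return refused, forced.read_bytes()


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the destructive path still exists for people who know what they are doing, but it has to be requested deliberately.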

To describe my use case in more detail: I'm an engineer working at Red Hat on a team called "Trusted Artifact Signer", which is, at its base, a downstream redistribution of Sigstore. We have only recently GAed and we don't really know how our customers use our product, but we have to provide something our users can manage the TUF root with. We're going for something very generic so that they can tie in any use case they might have. I really like tuf-on-ci, but it's currently limited in that it requires a specific kind of workflow that some organizations might not want to adopt. That is not to say that it can't be extended to other use cases, but right now it seems to me that a tool like tough could provide more versatility to tie in to any kind of workflow.

I think in the future, we might recommend tuf-on-ci as well as tuftool, if some of our customers are interested in this kind of workflow, but to be completely honest right now we just don't know.

Long term, I think option 1 that you proposed is very interesting for us because, regardless of tools, TUF is hard and might be overkill for some organizations. Once I have the feature proposed here implemented, I will definitely reach out and discuss other options. But again, I would like to gather our users' input and listen to their use cases before trying to lay out a proposal here.

Anyway, thanks a lot for your feedback, I will now start working on the proposed feature, time permitting :)

haydentherapper commented 1 month ago

I agree that option 1 is what I'd recommend for those who don't need to deal with TUF environments. You'll see examples of providing a trust root in https://github.com/sigstore/sigstore-python?tab=readme-ov-file#configuring-a-custom-root-of-trust-byo-pki.

I do think a TUF environment is valuable for providing a mechanism to manage key material and handle rotation/revocation. It might also be worth exploring https://github.com/repository-service-tuf/repository-service-tuf as another alternative for customers: for those on CI, have them look into tuf-on-ci; for those who want a managed service, RSTUF.

bkabrda commented 4 days ago

All the PRs for implementing this have been merged. I'm closing this issue - thanks for all the reviews!