sigstore / fulcio

Sigstore OIDC PKI
Apache License 2.0

Support a "neutral" IdP #444

Open znewman01 opened 2 years ago

znewman01 commented 2 years ago

Context:

  1. rubygems/rfcs#37: resistance to sigstore adoption based on concerns about privacy and "vendorization"
  2. #371 (non-OIDC email support): using an "email verification flow" to have email logins instead of OIDC logins is dicey -- we'd be performing "OIDC-over-SMTP" (plus no support for 2FA etc.)

Proposal

One potential resolution to these concerns is a "neutral" OIDC IdP, perhaps run by a nonprofit or some other non-corporate entity, with strong support for user privacy and security. This allows tying artifact signatures to a privacy-friendly identity (perhaps a pseudonym). The OpenSSF package manager security not-quite-working group is very invested in this, as it would substantially ease sigstore adoption in this setting.

Scope

This is a huge issue in that actually fixing it might involve standing up a production-quality service with an on-call rotation, etc. Hopefully that wouldn't all be organized here.

Let's scope this issue to discussing:

I'd prefer to discuss the following elsewhere:

Edit: fixed link to #371 (thanks @jchestershopify!)

jchestershopify commented 2 years ago

A nitpick, the issue you probably meant is #371.

haydentherapper commented 2 years ago

Discussing only the Fulcio aspect, I see no reason why we wouldn't be willing to add an IDP into either Dex or directly into Fulcio's IDPs, as long as the IDP abides by the list of requirements discussed in #397. I've already chimed in on my thoughts around another global IDP, namely as long as identities are scoped by domain, we minimize risk.

It's also important to recognize the significant effort behind running an identity provider with accounts - Account management, MFA, etc. It's worth considering the tradeoff between managing that service in comparison to key-based signing, which is another option for non-identity based signing.

Is there a particular topic you're looking to get more clarity on that's not already been discussed in other issues?

znewman01 commented 2 years ago

It's also important to recognize the significant effort behind running an identity provider with accounts

(Possibly off-topic) I have an off-the-wall idea for an "anonymizing OIDC proxy." Basically, it would be *both* an OIDC client and provider. The flow:

  1. You initiate the Fulcio flow and choose "login with OIDCProxy.com"
  2. It opens a window, which prompts you to log in to OIDCProxy.com. You choose "login with GitHub"
  3. You complete that flow; OIDCProxy.com then completes the IdP flow for OIDCProxy.com identity == H("GitHub" || your GitHub OIDC ID || salt)

This gives you consistent, private identifiers (provided the salt is consistent), but it does mean that a leaked salt is a big deal. You could stick it in KMS or something, which helps a little. The OIDC service can then be pretty simple (doesn't have to keep track of accounts, do 2FA, etc.) and can be stateless.
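To make the identity derivation concrete, here is a minimal Python sketch. It is not anything Fulcio or an existing proxy implements. It swaps the plain H(... || salt) for HMAC-SHA256 keyed by the salt, and length-prefixes each field so concatenation is unambiguous; both are assumptions of this sketch. The function name `derive_pseudonym` and the domain `oidcproxy.example` are hypothetical.

```python
import hashlib
import hmac


def derive_pseudonym(upstream_issuer: str, upstream_subject: str,
                     salt: bytes, domain: str = "oidcproxy.example") -> str:
    """Derive a stable pseudonym from an upstream OIDC identity.

    Assumption: HMAC-SHA256 keyed by the secret salt stands in for
    H(issuer || subject || salt) from the proposal.
    """
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    message = b"".join(
        len(part).to_bytes(4, "big") + part
        for part in (upstream_issuer.encode(), upstream_subject.encode())
    )
    digest = hmac.new(salt, message, hashlib.sha256).hexdigest()
    # Truncated to 32 hex characters purely for readability.
    return f"{digest[:32]}@{domain}"
```

The proxy needs no per-user state: the same upstream identity and salt always yield the same pseudonym, which is also why a leaked salt lets anyone test guesses about who is behind a given pseudonym.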

[...]in comparison to key-based signing, which is another option for non-identity based signing.

Agreed; this is also my recommendation for now.

Is there a particular topic you're looking to get more clarity on that's not already been discussed in other issues?

Ha, fair question :) Mostly I'm looking to:

  1. Centralize discussion (corollary: move it out of issues which can stand on their own)
  2. Track progress: we're encountering demand for such an IdP from various package managers and it is slowing adoption somewhat. It'd be nice to at least have a place to point folks to let them know that there's interest and (hopefully soon) effort.

haydentherapper commented 2 years ago

You complete that flow; OIDCProxy.com then completes the IdP flow for OIDCProxy.com identity == H("GitHub" || your GitHub OIDC ID || salt)

How do you write verification policy for such a scheme? This is my main concern with any pseudonym-based identity. As the original signature creator, you can verify your own signatures since you know what was used to construct the hash. As a verifier, you either need:

The next step would be distributing the hash pseudonymous identity publicly so individuals or ecosystems can create a policy. Options could include:

Distributing the pseudonym identities via the ecosystem opens up another can of worms.

  1. The ecosystem must now support adding per-user pseudonym identities.
  2. Does a developer simply provide their hash to the ecosystem? There's no way to verify that link (which is somewhat the point, but it means the ecosystem must assume that the developer is not compromised and they're providing a pseudonym identity that they actually control).
  3. What happens if the ecosystem is compromised and the pseudonym identities are modified along with the signatures? This would not be detectable via a policy. Whereas if an ecosystem creates policy based on identities it controls, or if it's based on publicly-discoverable identities like an email, then this would be detected (because the signature's identity would not be the identity of the maintainer in either case).

I don't think pseudonym identities are the way to go currently for an ecosystem due to the difficulties with creating verification policies. It's fine for a small developer who doesn't need to write policy at scale, but at that point, I'd encourage GitHub or key-based signing.

Agreed; this is also my recommendation for now.

One other good option is signing originating from GitHub Actions. The certificate is tied to the identity of the GitHub Actions workload, with no user identifiers. I've also begun work on using GitLab as an alternative, since they now support OIDC. This should mitigate some concerns of vendor lock-in.

znewman01 commented 2 years ago

Distributing the pseudonym identities via the ecosystem opens up another can of worms.

I think these issues can all be mitigated. I'll describe the language package manager setting for clarity.

The ecosystem must now support adding per-user pseudonym identities.

Correct; I don't see this as a dealbreaker.

Does a developer simply provide their hash to the ecosystem? There's no way to verify that link (which is somewhat the point, but it means the ecosystem must assume that the developer is not compromised and they're providing a pseudonym identity that they actually control).

The idea is that the package repository would support OIDC logins, and you could do your initial login/publication for a package manager as 0xdeadbeef@oidcproxy.com.

What happens if the ecosystem is compromised and the pseudonym identities are modified along with the signatures? This would not be detectable via a policy. Whereas if an ecosystem creates policy based on identities it controls, or if it's based on publicly-discoverable identities like an email, then this would be detected (because the signature's identity would not be the identity of the maintainer in either case).

These pseudonyms would be public (which has its own downsides—metadata can link packages written by the same pseudonymous author), so changes could be detected (possibly using a transparency log or similar). The goal here is consistency: I don't learn that package X was created by developer Y, but I do know that package X was signed by the same developer that signed it yesterday.

I still don't think this is necessarily a good idea, just because "package X written by pseudonymous author Y" is a little bit of a sketchy proposition. Though maybe that's good information for downstream users to have.

One other good option is signing originating from GitHub Actions. The certificate is tied to the identity of the GitHub Actions workload, with no user identifiers. I've also begun work on using GitLab as an alternative, since they now support OIDC. This should mitigate some concerns of vendor lock-in.

👍

haydentherapper commented 2 years ago

These pseudonyms would be public (which has its own downsides—metadata can link packages written by the same pseudonymous author), so changes could be detected (possibly using a transparency log or similar). The goal here is consistency: I don't learn that package X was created by developer Y, but I do know that package X was signed by the same developer that signed it yesterday.

This is an interesting point. This means that package verification becomes trust-on-first-use (TOFU) - A verifier pins an identity, and as long as it doesn't change, it's trusted. I think we should aim to avoid TOFU where possible, because it gives attackers an opportunity for compromise for first-time consumers.

This is also tricky because it makes it hard to change the identity of the signer. Let's say a project transfers maintainers, or maybe simply a set of maintainers rotates who signs the package. There may be a legitimate reason for the identity of the signer to change, but a verification policy can't distinguish between that and a compromise.
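A small Python sketch of what pin-on-first-use looks like from the verifier's side, to illustrate both concerns. The `PinStore` class and the identities used with it are hypothetical; nothing here reflects an existing sigstore client.

```python
from dataclasses import dataclass, field


@dataclass
class PinStore:
    """Verifier-local map from package name to the first signer identity seen."""
    pins: dict[str, str] = field(default_factory=dict)

    def check(self, package: str, signer: str) -> str:
        pinned = self.pins.get(package)
        if pinned is None:
            # First use: nothing to compare against, so whoever signed
            # this download becomes the trusted identity.
            self.pins[package] = signer
            return "pinned"
        # Later uses: any change looks the same, whether it is a
        # legitimate maintainer handoff or a compromise.
        return "ok" if pinned == signer else "mismatch"
```

A first-time consumer who receives an attacker-signed artifact pins the attacker, and a legitimate rotation from one maintainer to another produces the same "mismatch" as a compromise would.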

"package X written by pseudonymous author Y" is a little bit of a sketchy proposition

Yea, it's a far weaker claim than "package X is written by the same author that published the resource", and the artifact's author is auditable outside of the ecosystem.

znewman01 commented 2 years ago

I think TUF techniques address both of those problems:

This means that package verification becomes trust-on-first-use (TOFU) - A verifier pins an identity, and as long as it doesn't change, it's trusted.

If we use TUF to manage the delegation to pseudonyms and distribute the trust root with the download tool, it's not really TOFU. And, as in the usual TUF context, a transparency log can help here. The "consistency" property is over the lifetime of the repo, not the verifier's local state.

There may be a legitimate reason for the identity of the signer to change, but a verification policy can't distinguish between that and a compromise.

I think using TUF to manage delegation to pseudonyms would allow for this; you could imagine "a pseudonym can sign over rights to another identity" or "the targets role may delegate to a new identity."
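As a rough illustration of "a pseudonym can sign over rights to another identity," the Python sketch below replays a list of handoff records and accepts each one only if it was authorized by the signer that was current at the time. This is a simplified model, not the TUF metadata format or any TUF library API: real delegations would be signed metadata, whereas here the `authorized_by` field is simply assumed to have been verified already. The function name `current_signer` is hypothetical.

```python
def current_signer(initial: str, handoffs: list[tuple[str, str]]) -> str:
    """Replay (authorized_by, new_identity) records in order.

    Assumption: each record's authorized_by value has already been
    cryptographically verified elsewhere (e.g. via signed metadata).
    """
    current = initial
    for authorized_by, new_identity in handoffs:
        if authorized_by != current:
            # A handoff not approved by the current signer is rejected,
            # which is what separates a rotation from a takeover.
            raise ValueError(
                f"handoff to {new_identity!r} not authorized by {current!r}"
            )
        current = new_identity
    return current
```

With the full history available (for example from a transparency log), a verifier can follow the chain from the initial identity instead of relying on its own local first-use state.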

Fundamentally, I see a pseudonym using such a proxy as exactly the same as a login from a standard OIDC provider using a pseudonymous account, inheriting the usual suite of problems and workarounds. This fits nicely with my philosophy of "Fulcio should allow many identity providers, but most consumers should only accept a limited subset of 'golden' IdPs" (but less well with a philosophy like "Fulcio should be quite discriminating about accepting IdPs, so every Fulcio signature has at least some baseline amount of legitimacy").

Caveat emptor, of course. I still think that a true privacy-respecting IdP operated by an organization with a lot of trust would be better, or just having the repository manage pubkeys themselves.

The certificate is tied to the identity of the GitHub Actions workload, with no user identifiers.

In some sense, I'd argue this is just using GitHub Actions as the anonymizing proxy.

haydentherapper commented 2 years ago

I think where I'm getting stuck is that initial step of an identity delegating trust to a pseudonym. Without an authentication mechanism (e.g. OIDC), it'll be hard to associate an identity with a pseudonym in such a way that prevents anyone (someone not in control of the identity) from creating that association.

I'm just concerned about the complexity of such an approach in comparison to its benefit.

most consumers should only accept a limited subset of 'golden' IdPs

I definitely agree. Given that verification policies aren't well fleshed out though, I'm very hesitant to add any more IDPs that are globally scoped. Adding domain-scoped IDPs is much safer.

In some sense, I'd argue this is just using GitHub Actions as the anonymizing proxy.

The primary difference is the ability to verify the releaser is clearly linked to the release. We could imagine a pipeline where a release is signed by the GitHub identity and is pushed directly to the package repository through Actions. If the package repository is aware of the source repo, a verification policy can be set up to verify the signed release.
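A sketch of what such a policy check could look like in Python, assuming the package repository records a source repo for each package. The issuer URL and the shape of the certificate subject (a workflow path under the repository URL) are assumptions about how a GitHub Actions workload identity would appear in a certificate. `CertIdentity` and `matches_source_repo` are hypothetical names, and certificate and signature validation are assumed to have happened already.

```python
from dataclasses import dataclass

# Assumed issuer for GitHub Actions OIDC tokens.
GITHUB_ACTIONS_ISSUER = "https://token.actions.githubusercontent.com"


@dataclass(frozen=True)
class CertIdentity:
    issuer: str   # OIDC issuer recorded in the certificate
    subject: str  # workload identity, assumed to be a workflow URI


def matches_source_repo(cert: CertIdentity, source_repo: str) -> bool:
    """Return True if the signing workload belongs to the known source repo."""
    # The trailing slash after the repo name keeps "org/repo" from
    # matching a look-alike such as "org/repo-evil".
    expected_prefix = f"https://github.com/{source_repo}/.github/workflows/"
    return (cert.issuer == GITHUB_ACTIONS_ISSUER
            and cert.subject.startswith(expected_prefix))
```

The policy names only the repository, never a person, so it stays valid when maintainers change as long as releases keep coming from the same source repo.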

znewman01 commented 2 years ago

Without an authentication mechanism (e.g. OIDC), it'll be hard to associate an identity with a pseudonym in such a way that prevents anyone (someone not in control of the identity) from creating that association.

We can use an authentication mechanism: you'd create/log-in to your PyPI[^1] account using OIDC, but instead of using Google as your IdP, you'd use OIDCProxy.com as your IdP (which would use Google as its IdP). The important thing is that we keep track of the identity (pseudonymous or otherwise) used to initially publish a package, and enforce that it remains the same.

[^1]: I see @di is watching this thread, so picking on you :)

I'm just concerned about the complexity of such an approach in comparison to its benefit.

Despite vociferously defending this approach, I totally agree. 😄

I think it would be just a proof-of-concept until a sufficiently compelling use case came along.

The primary difference is the ability to verify the releaser is clearly linked to the release.

Got it. So this would just be GitHub asserting "this build comes from this repo at this SHA," rather than asserting anything about continuity-of-identity for the maintainer. That makes sense.

di commented 2 years ago

Well, you @-mentioned me, so now you get a reply. 🙂

My two cents: This seems like it's just shifting complexity from artifact providers to whoever would run a neutral IdP, and possibly increasing complexity as a result.

Here's the flow I'm imagining if artifact providers are also identity providers:

  1. User provides an OIDC identity token (this might be from GitHub Actions, Google, Microsoft, etc) to the artifact provider.
  2. The artifact provider (e.g. PyPI, npm, RubyGems) can decide to delegate a project identity accordingly. How?
    • For GitHub Actions, it knows which workflow identities are allowed to assume the project identity
    • For email-based IdPs, it knows which emails are associated with a given project
  3. The artifact provider provides an identity for the project (not the user!) that can then be used to sign the artifact
    • This identity won't change as maintainers/environments change
  4. The user can optionally also sign the artifact with any additional identities they have available
    • e.g. the GitHub Actions identity, their personal Google identity, etc
  5. The user can publish the artifact & corresponding signatures
  6. Consumers can easily verify the signing identities:
    • the artifact provider signature should correlate to the project name
    • any build environment signature should correlate to a known build environment (e.g. GitHub Actions)
    • presumably some out-of-band verification of personal identities could happen
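Steps 1 through 3 of this flow can be sketched roughly as follows in Python. Everything here is illustrative: the registry contents, the `packages.example` domain, and the function name `project_identity_for` are hypothetical, and token validation is assumed to have already produced a trusted (issuer, subject) pair.

```python
from typing import Optional

# Hypothetical registry kept by the artifact provider: which upstream
# identities may assume each project's identity.
PROJECT_PUBLISHERS = {
    "example-project": {
        ("https://token.actions.githubusercontent.com",
         "https://github.com/example/example-project/.github/workflows/release.yml@refs/heads/main"),
        ("https://accounts.google.com", "maintainer@example.com"),
    },
}


def project_identity_for(project: str, issuer: str, subject: str) -> Optional[str]:
    """Map a verified upstream identity to the project's own signing identity."""
    allowed = PROJECT_PUBLISHERS.get(project, set())
    if (issuer, subject) in allowed:
        # The identity names the project, not the user, so it does not
        # change when maintainers or build environments change.
        return f"{project}@packages.example"
    return None
```

Consumers then only need a policy of the form "artifact X must be signed by the identity for project X"; the mapping from people and build environments to projects stays with the artifact provider, which already maintains it.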

Compare this with (what I understand to be) the flow for a neutral IdP:

  1. User logs into the neutral IdP
  2. Via the neutral IdP, user logs into another IdP (Google, GitHub, etc)
  3. This provides a non-neutral identity token to the neutral IdP
  4. The neutral IdP provides the user with a 'neutral' identity
  5. The user signs the artifact with the neutral identity (note: up until this point, we haven't actually gotten any information from the artifact provider)
  6. The user publishes the artifact & corresponding signatures
  7. Somehow, the artifact provider knows which 'neutral' identities are allowed to sign a given artifact
  8. Consumers have to assume that the first 'neutral' identity to publish is the trusted identity

Which raises the following questions for me:

I see TUF being proposed to manage the link between artifact -> neutral identity, but what about the reverse? How does the artifact provider know to accept signatures from one neutral identity and reject them from another in the first place, unless this is TOFU?

If the project changes hands, does the 'neutral' identity change or stay the same? If it stays the same, how does the 'neutral' IdP correlate the old and new maintainer identities? If it changes, how does a user distinguish between the project changing hands between trusted maintainers and a compromise where someone who shouldn't be able to sign an artifact has become able?

In the event that we're signing in a non-interactive environment (e.g. GitHub Actions) how does the 'neutral' IdP correlate a build environment identity with a 'neutral' identity, unless the 'neutral' IdP also maintains and stores the same links between a project and its build environment (that the artifact provider would already be maintaining anyway)?