pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0
3.6k stars 965 forks source link

Trusted publishing: Support for GitHub reusable workflows #11096

Open woodruffw opened 2 years ago

woodruffw commented 2 years ago

Our MVP implementation (#10753) assumes that the workflow is in the same repository, which is not necessarily true.

We should support reusable workflows, specifically via the job_workflow_ref claim.

woodruffw commented 2 years ago

Just as a note: this will have to interoperate with #11263.

GergelyKalmar commented 1 year ago

Any chance to expedite this issue? It seems there's many who would be eager to move over to trusted publishing now that it is the recommended approach, but not being able to do the flow from a reusable workflow is a pretty major blocker. This issue also does not seem to be documented anywhere so we only find out about it once we changed everything to use the new approach.

woodruffw commented 1 year ago

Any chance to expedite this issue?

I can't personally promise a timeline here, but it is on my backlog of things to do.

This issue also does not seem to be documented anywhere so we only find out about it once we changed everything to use the new approach.

Thank you for pointing this out; I'll make some changes to the docs to emphasize that reusable workflows are not currently supported.

woodruffw commented 1 year ago

Thank you for pointing this out; I'll make some changes to the docs to emphasize that reusable workflows are not currently supported.

I've opened https://github.com/pypi/warehouse/pull/13592 for this. Thanks again for bringing it to our attention!

webknjaz commented 1 year ago

@woodruffw I just stumbled upon an example of a reusable workflow actually working with OIDC here https://github.com/jorisroovers/gitlint/pull/486 — could you confirm this is because of secrets: inherit at https://github.com/jorisroovers/gitlint/blob/5247839/.github/workflows/ci.yml#L225 ?

GergelyKalmar commented 1 year ago

That workflow is in the same repository though, I think the issue is with reusable workflows that are in a different repository (e.g. in a separate public repo).

woodruffw commented 1 year ago

Yeah, I think that's because it's in the same repo, not because of secrets: inherit.

In other words: reusable workflows do work with the current implementation, just not reusable workflows that are external to the repository that the trusted publisher was configured with.

webknjaz commented 1 year ago

That workflow is in the same repository though, I think the issue is with reusable workflows that are in a different repository (e.g. in a separate public repo).

Yes, you can't use inherit across orgs. The issue was that the OIDC context has a reusable workflow reference and that's currently not checked IIUC. So my concern is that it might be possible to fool the verification logic in PyPI with this...

GergelyKalmar commented 1 year ago

You can get around the inherit issue between organizations via forking, which is what we would usually do. So same org, different repo would be fine. The point of reusable workflows in this case would be that you store them in a central repository and you can re-use them in other repositories.

webknjaz commented 1 year ago

I also wonder if it's possible to do something like

secrets:
  GITHUB_TOKEN: ${{ github.token }}

to make OIDC in the reusable workflow pretend to be the called workflow...

webknjaz commented 1 year ago

You can get around the inherit issue between organizations via forking, which is what we would usually do. So same org, different repo would be fine. The point of reusable workflows in this case would be that you store them in a central repository and you can re-use them in other repositories.

Totally! Though, I've found another use for structuring stuff in-repo: putting different components into separate files while having a centralized workflow that is complex but doesn't have a million lines of YAML in the same file.

webknjaz commented 1 year ago

Just as a note: this will have to interoperate with #11263.

@woodruffw do you have an idea of how this should work in general? Would there be an extra input in the trust form for trusting third-party workflow refs, or how would that need to be implemented?

woodruffw commented 1 year ago

Yeah, I've been thinking about this: the "simple" solution would be an extra input like you say, but that has implications for the way we match trusted provider metadata on the backend: every optional configuration point is another query dimension, meaning a linear best, geometric worst increase in queries needed to find the intended publisher configuration.

Rather than doing that, we could relax current "workflow" filename input to accept a reusable workflow reference -- we'd then detect that case, and specialize on it. I need to think more about whether that actually works, but I think it'll be simpler.

facutuesca commented 3 months ago

I've been looking into how we can implement this. I've written up the relevant facts about how GitHub and PyPI currently handle OIDC claims, and at the end I specify what adding support for reusable workflows would look like in light of those facts.

GitHub OIDC claims with (non-)reusable workflows

  1. Reusable workflows are workflows that can be called from another workflow
  2. There can be up to 4 levels of connected workflows, i.e: top-level-workflow.yml -> reusable-workflow-1.yml -> reusable-workflow-2.yml -> reusable-workflow-3.yml.
  3. These reusable workflows can be located in external repositories.
  4. When GitHub generates an OIDC token, the token will contain claims about the workflow that requested the token.
  5. When it's a reusable workflow, the claims will only include information about top-level-workflow.yml and reusable-workflow-3.yml. That is, only about the top-level and bottom-level workflows.
  6. The claims that contain the ref paths for those two workflows are workflow_ref and job_workflow_ref (top-level and bottom-level respectively). For example:
    {
        "workflow_ref": "org/repo/.github/workflows/top-level-workflow.yml@refs/heads/main",
        "job_workflow_ref": "org2/repo2/.github/workflows/reusable-workflow-3.yml@v1",
    }
  7. When the token is requested by a non-reusable workflow (a single workflow that doesn't call other workflows) those two claims will be equal, containing the path ref of that single workflow.
  8. There are two more claims for the commit hash of the workflows in the previous claims:
    {
        "workflow_sha": "c2698398906ab70572c7878a11debb44ff1d2241",
        "job_workflow_sha": "10040c56a8c0253d69db7c1f26a0d227275512e2",
    }
  9. There is one claim that contains the ref (e.g: refs/heads/main) corresponding to the top-level workflow, but not for the bottom-level workflow:
    {
        "ref": "refs/heads/main",
    }

PyPI verification of OIDC claims for GitHub workflows

  1. The user creates a Trusted Publisher by specifying a repository owner, a repository name, and a workflow filename (and optionally an environment, which we'll ignore for this discussion since it's not relevant).
  2. When PyPI receives an OIDC token, it checks its job_workflow_ref claim against the fields mentioned above. For example:

    {
        "job_workflow_ref": "org/repo/.github/workflows/release.yml@refs/heads/main",
    }

    will pass validation for a Trusted Publisher configured with:

    repository_owner = "org"
    repository_name = "repo"
    workflow_filename = "release.yml"
  3. PyPI will also check that the ref in jobs_workflow_ref (in the example above, refs/heads/main) matches the ref claim, which we saw in (9) corresponds to the top-level workflow.

Note how our checks in (11) and (12) mix claims about the bottom-level workflow (jobs_workflow_ref) with claims about the top-level workflow (ref). This is fine for non-reusable workflows, since the claims refer all to the same, single workflow. However, it's problematic when starting to consider reusable workflows.

Note also that we accidentally support a subset of reusable workflows: when the top-level and bottom-level workflow are located in the same repo, and the bottom-level workflow is called in the same branch/tag/sha as the top-level one, then the OIDC claims check will pass. Concretely, if the claims look like this:

        "workflow_ref": "org/repo/.github/workflows/top-level-workflow.yml@refs/heads/main",
        "job_workflow_ref": "org/repo/.github/workflows/reusable-workflow-3.yml@refs/heads/main",
        "ref": "refs/heads/main",

and the user configured the workflow filename as reusable-workflow-3.yml, then verification will pass.


Adding support for reusable workflows

So, what does all of this mean?

First, we want to change the existing OIDC claims verification so that it checks against the top-level workflow instead of the bottom-level one (from job_workflow_ref to workflow_ref). This should be transparent for most users, but it will break the Trusted Publishers that depend on the (accidental) support for reusable workflows mentioned above. In order to understand how many users this would impact, we will measure how many Trusted Publishers are currently being used with reusable workflows (see PR here: https://github.com/pypi/warehouse/pull/16364).

Second, in order to fully support reusable workflows we need the user to input 3 new pieces of information: the owner and name of the repository where the reusable workflow lives, and the filename of said workflow. Once we have that, we can compare incoming OIDC tokens against the old information (owner, repo and filename of top-level workflow) and against the new information (owner, repo and filename of the bottom-level workflow).

Third, note how we can trivially migrate existing Trusted Publishers without user intervention, by just copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

And fourth, from the web UI perspective, the user should not need to fill the 3 new fields when creating a Trusted Publisher for a non-reusable workflow. Rather, those new fields can be hidden behind a "Reusable workflow" checkbox, and when left empty they will default to the same values as the old 3 fields (for the same reason as the point above).

cc @woodruffw

mhils commented 3 months ago

Thank you for the detailed write-up, @facutuesca! 😃 🍰

Is there a reason why we cannot match on workflow_ref (the top-level workflow) only? What type of attack are we mitigating by checking jobs_workflow_ref as well?

facutuesca commented 3 months ago

Is there a reason why we cannot match on workflow_ref (the top-level workflow) only? What type of attack are we mitigating by checking jobs_workflow_ref as well?

The scenario we're trying to avoid is having overscoped Trusted Publishers (which is why we suggest having a dedicated, publish-only workflow). In the case of only checking for top-level workflows, imagine you have a project with a top-level ci.yml workflow that calls a bunch of other workflows (in the same repo), one of which is publish.yml:

ci.yml -> tests.yml
       -> lint.yml
       -> deploy.yml
       -> build.yml
       -> publish.yml

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

webknjaz commented 3 months ago

@facutuesca and the environment name matching remains the same, right?

jaraco commented 3 months ago

Third, note how existing Trusted Publishers can be trivially migrated by copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

For my use-case, I've been holding off using trusted publishing in environments that have reusable workflows until this issue is solved, so I won't have any opinion on the migration path. If you think it would be valuable to employ trusted publishing on such a repo, I'd be open to exploring that and providing feedback on the migration path, but my preference would be to await a durable solution.

facutuesca commented 3 months ago

@facutuesca and the environment name matching remains the same, right?

@webknjaz Yes, the environment check will still be against the environment claim sent by GitHub


@jaraco

Third, note how existing Trusted Publishers can be trivially migrated by copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

For my use-case, I've been holding off using trusted publishing in environments that have reusable workflows until this issue is solved, so I won't have any opinion on the migration path. If you think it would be valuable to employ trusted publishing on such a repo, I'd be open to exploring that and providing feedback on the migration path, but my preference would be to await a durable solution.

Ah I think I wasn't super clear there. I was referring to the migration from the POV of PyPI: when adding the 3 new columns to the database, we can fill the values with a copy of the existing information (owner, repo, filename). This will be completely transparent to the users, they won't need to migrate or change anything. I'll reword it to make that point explicit, thanks!

mhils commented 3 months ago

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

Thanks, that makes sense! I'm not fully convinced though:

  1. The problem you described applies just as much to non-reusable ci.yml workflows which may be just as overscoped, no? I don't see an inherent reason why reusable workflows would need additional checks.
  2. Adding an extra checkbox and three new fields adds complexity. Putting the PyPI implementation cost aside for a moment, this makes Trusted Publishing harder to set up for users.
  3. If the only goal is to enforce a narrower scope, why not nudge or require the use of environments for new configurations? That helps both non-reusable and reusable workflows. I feel this would be just as effective and would not require users to figure out which workflow path actually goes into which text input.

Is there anything terribly wrong with changing the check PyPI is currently doing from job_workflow_ref to workflow_ref? My understanding is that reusable workflows would then "just work" while being as secure as non-reusable ones[^1].

[^1]: The only thing that breaks would be the "accidental support for a subset of reusable workflows" mentioned in https://github.com/pypi/warehouse/issues/11096#issuecomment-2260686299. Depending on what the metrics show, this is either a non-issue or can be easily worked around with an additional check for this case.

facutuesca commented 3 months ago

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

Thanks, that makes sense! I'm not fully convinced though:

  1. The problem you described applies just as much to non-reusable ci.yml workflows which may be just as overscoped, no? I don't see an inherent reason why reusable workflows would need additional checks.

True, but I would argue that a parent-level workflow will be overscoped almost by definition, whereas a non-reusable workflow can be made minimally scoped. The argument is: if you only check the top-level ci.yml that means that all current and any future workflows called from ci.yml, both internal but also from external repos, will be allowed to upload packages.

I think it makes sense there to give the user the ability to narrow it down to a single reusable workflow.

  1. Adding an extra checkbox and three new fields adds complexity. Putting the PyPI implementation cost aside for a moment, this makes Trusted Publishing harder to set up for users.

True, but only for those who want to set up reusable workflows (the three new fields would be hidden until the user checks the checkbox). So the user experience for the currently supported use case (non-reusable workflows) should be the same, and the added complexity is for the users who want to use the new feature (reusable workflows)

  1. If the only goal is to enforce a narrower scope, why not nudge or require the use of environments for new configurations? That helps both non-reusable and reusable workflows. I feel this would be just as effective and would not require users to figure out which workflow path actually goes into which text input.

This might be an option. IME though, most projects don't use environments, so making it required would (arguably) be harder for users since they would have to set it up. I'm open to it though, since I might be wrong here.

Is there anything terribly wrong with changing the check PyPI is currently doing from job_workflow_ref to workflow_ref? My understanding is that reusable workflows would then "just work" while being as secure as non-reusable ones1. The only thing that breaks would be the "accidental support for a subset of reusable workflows" mentioned in

As you mentioned, one issue is breaking currently working Trusted Publishers, which we'll see if it's actually a problem once we have those metrics. The other issue is the discussion we've been having above: if we only check for workflow_ref, repos that use reusable workflows will have a very overscoped Trusted Publisher (any workflows called by the top-level workflow, internal or external, current and future, will have permissions to upload).

mhils commented 3 months ago

True, but I would argue that a parent-level workflow will be overscoped almost by definition, whereas a non-reusable workflow can be made minimally scoped.

Reusable workflows can be made minimally scoped just as well. You may be right that folks are more likely to overscope here, but I don't have any data on this. 🙂

This might be an option. IME though, most projects don't use environments, so making it required would (arguably) be harder for users since they would have to set it up. I'm open to it though, since I might be wrong here.

FWIW I think security-wise environments would be the superior solution here. For example, we have (relatively speaking) lots of folks with write access to the main branch, but only a few selected individuals can approve deployments (and thus push to PyPI).

The one major downside I can think of is that environments are not available for free accounts in private repositories. So that makes it hard to enforce as a strict requirement. If that's a major concern, job_workflow_ref may be the better solution.

Thank you for pushing this entire thing forward! 🎉

GergelyKalmar commented 3 months ago

I'd definitely prefer the implementation where we can scope on the reusable workflow level, I think that the few optional fields in the setup UI would be totally fine and it's unlikely to confuse anyone.

Environments are great, but there are lots of organizations that already have an established CI/CD setup that does not use environments and have no need for it otherwise, so requiring it just to use this feature would be probably suboptimal.

GergelyKalmar commented 1 week ago

With attestation becoming a more and more important part of the publication process to PyPI I wanted to ask if it would be possible to expedite this work together with https://github.com/pypi/warehouse/issues/13911.

This issue makes it less convenient to use trusted publishing compared to the token-based approach (which works fine with reusable workflows already), while the lack of tag pattern matching described in https://github.com/pypi/warehouse/issues/13911 makes it difficult to achieve the same level of control and security over publishing as what the token-based approach provides.

woodruffw commented 1 week ago

With attestation becoming a more and more important part of the publication process to PyPI I wanted to ask if it would be possible to expedite this work together with #13911.

We're currently working on reusable workflow support. There are no short-term plans (on my part, at least) for #13911, however. That's going to require a larger re-thinking of the data model, since every additional optional field geometrically increases the number of states PyPI needs to check.