pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0
3.58k stars 965 forks

Trusted publishing: support for Cirrus CI #14542

Open abravalheri opened 1 year ago

abravalheri commented 1 year ago

This is in line with requests such as #13575 and #13888.

Cirrus CI supports OpenID Connect tokens (presented to the user via the $CIRRUS_OIDC_TOKEN environment variable), and a user can customise the audience by setting the $CIRRUS_OIDC_TOKEN_AUDIENCE environment variable.
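To see which claims such a token carries, its payload can be decoded locally. The following is a minimal sketch that assumes the token is a standard three-part JWT. The helper name `decode_claims_unverified` is hypothetical, and the function deliberately skips signature verification, so it is only suitable for inspection, never for authentication.

```python
import base64
import json


def decode_claims_unverified(token: str) -> dict:
    """Return the JWT payload as a dict WITHOUT checking the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Inside a Cirrus task this could be called as `decode_claims_unverified(os.environ["CIRRUS_OIDC_TOKEN"])` to print the claims while debugging.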

Hopefully this is enough for PyPI to interact with it? Is there anything else that would be necessary?

di commented 1 year ago

That's a large part of what's necessary!

The next step would be figuring out which of the "Cirrus Added Claims" would be specific to a given PyPI project (e.g. for GitHub, this is a combination of the owner/repository/workflow claims). At a glance, this would probably be owner/owner_id, repository/repository_id, and perhaps task_name/task_id?

After this, the next steps would be adding the integration that allows PyPI to trust the OIDC token and allowing PyPI users to configure Cirrus CI as a Trusted Publisher.

abravalheri commented 1 year ago

At a glance, this would probably be owner/owner_id, repository/repository_id, and perhaps task_name/task_id?

Suppose I define something like the following (untested, just for the sake of brainstorming):

# .cirrus.yml
check_task: ...

build_task: ...

test_task: ...

publish_task:
  name: publish (Linux - 3.10)
  container: {image: "python:3.10-bullseye"}
  depends_on: [check, build, test]
  only_if: $CIRRUS_TAG =~ 'v\d.*' && $CIRRUS_USER_PERMISSION == "admin"
  env:
    CIRRUS_OIDC_TOKEN_AUDIENCE: pypi
    TWINE_REPOSITORY: pypi
    TWINE_USERNAME: __token__
  install_script: pip install tox
  prepare_script: <...>
  get_token_script:
    - resp=$(curl -X POST https://pypi.org/_/oidc/cirrus-ci/mint-token -d "{\"token\": \"${CIRRUS_OIDC_TOKEN}\"}")
    - api_token=$(jq -r '.token' <<< "${resp}")
    - echo "TWINE_PASSWORD=${api_token}" >> $CIRRUS_ENV
  publish_script:
    - ls dist/*
    - python -m twine upload dist/*

I would expect the token's platform, owner and repository claims to be github, pypi and warehouse respectively (using this repo as an example). With that, it should be possible to identify the repository and where it is hosted. (The _id variants are internal to Cirrus and may not map directly to publicly available information.)

The task_name would be something like publish, so it is not unique per-build. If something unique per build is required, build_id can give you that...

di commented 1 year ago

I think we'd want something akin to the GitHub Actions workflow filename that is consistent for every build, which I think task_name probably is (e.g., we would want to ensure that the "publish" tasks are able to authenticate but "lint" tasks are not).

The _id fields would not be provided by the user but would be set at configuration time automatically to protect against resurrection attacks.
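As a rough illustration of that check, here is a minimal sketch in Python. It assumes the claim names discussed in this thread (owner, owner_id, repository, repository_id, task_name); the `CirrusPublisher` class and the `claims_match` function are hypothetical and are not Warehouse's actual implementation. It also assumes the token's signature, issuer, and audience have already been verified elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CirrusPublisher:
    """Hypothetical record of one trusted-publisher configuration."""

    # Entered by the PyPI user when configuring the publisher.
    owner: str
    repository: str
    task_name: str
    # Looked up automatically at configuration time, never typed by the user.
    owner_id: str
    repository_id: str


def claims_match(publisher: CirrusPublisher, claims: dict) -> bool:
    """Return True only if every configured value equals the token's claim."""
    expected = {
        "owner": publisher.owner,
        "owner_id": publisher.owner_id,
        "repository": publisher.repository,
        "repository_id": publisher.repository_id,
        "task_name": publisher.task_name,
    }
    for key, value in expected.items():
        claim = claims.get(key)
        # A missing claim is always a mismatch.
        if claim is None or str(claim) != value:
            return False
    return True
```

Comparing the `_id` values as well as the names is what blocks a resurrection attack in this sketch: if an account is deleted and its name is registered again by someone else, the name still matches but the ID does not.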

I'm curious why you include platform, seems like this would only be "Github" for now. Wouldn't we want a publish workflow to succeed regardless of what the upstream source repository is?

abravalheri commented 1 year ago

I'm curious why you include platform, seems like this would only be "Github" for now. Wouldn't we want a publish workflow to succeed regardless of what the upstream source repository is?

Yes, you are right, I got a bit confused. I thought that platform would tell you where the code is coming from, but if Cirrus only sets that value for GitHub, then it does not make sense to consider it...