simonw / python-lib

Opinionated cookiecutter template for creating a new Python library
Apache License 2.0
174 stars, 13 forks

Fix security issues with the Trusted Publishing example #8

Open webknjaz opened 5 months ago

webknjaz commented 5 months ago

https://github.com/simonw/python-lib/blob/4b825ed/%7B%7Bcookiecutter.hyphenated%7D%7D/.github/workflows/publish.yml#L44-L49 suggests that building the dists within the same job that publishes them is okay. But it's not. That structure exposes workflow users to privilege escalation through poisoning of the build dependencies, which is why I've always insisted on the separation — the build scripts must never have access to id-token: write.
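As an illustration of the split being described (job and artifact names here are placeholders, not taken from the template), only the publishing job requests the OIDC token:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    # No id-token permission here: build tooling and its
    # dependencies never see publishing credentials.
    steps:
      - uses: actions/checkout@v4
      - run: |
          python -m pip install build
          python -m build
      - uses: actions/upload-artifact@v4
        with:
          name: dists
          path: dist/

  publish:
    runs-on: ubuntu-latest
    needs: [build]
    permissions:
      id-token: write  # scoped to the job that only downloads and uploads
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: dists
          path: dist/
      - uses: pypa/gh-action-pypi-publish@release/v1
```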

Another suggestion is to fix the GitHub Environment name to represent the deployment target as it's meant to. I usually go for pypi and testpypi so it's obvious that uploading to both is separate.

I saw release here https://github.com/simonw/python-lib/blob/4b825ed/%7B%7Bcookiecutter.hyphenated%7D%7D/.github/workflows/publish.yml#L33C5-L33C25, which is not an upload target but a very generic process name.

The declaration syntax can also be extended to include a URL:

-     environment: release
+     environment:
+       name: pypi
+       url: https://pypi.org/project/{% endraw %}{{ cookiecutter.hyphenated }}{% raw %}/${{ github.ref_name }}
simonw commented 1 month ago

I'm having trouble figuring out the recommended pattern here. Is this the best way to do it? https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-pypi

jobs:
  release-build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.x"

      - name: build release distributions
        run: |
          # NOTE: put your own distribution build steps here.
          python -m pip install build
          python -m build

      - name: upload dists
        uses: actions/upload-artifact@v4
        with:
          name: release-dists
          path: dist/

  pypi-publish:
    runs-on: ubuntu-latest
    needs:
      - release-build
    permissions:
      id-token: write

    steps:
      - name: Retrieve release distributions
        uses: actions/download-artifact@v4
        with:
          name: release-dists
          path: dist/

      - name: Publish release distributions to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

In that example the build step happens in an entirely different job, which uploads the built asset as an artifact. Then a separate job with id-token: write downloads that artifact and posts it to PyPI.

Is there a simple pattern for this? I'm happy to go with that if it's the right thing to do.

simonw commented 1 month ago

Looks like that's the approach described here too: https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/#checking-out-the-project-and-building-distributions

webknjaz commented 1 month ago

Yes, the PyPUG guide is mine. So it already shows what I believe is best: separate jobs, environments, etc. The GH doc was authored by @woodruffw and we're trying to get more updates into it, but their processes are very slow, so I can only hope it'll get merged this year. GitHub's starter workflows also have a lot of outdated stuff (like pinning my action to a 4-5-year-old version, which leaves people with an unsupported release that doesn't have trusted publishing implemented). Fixing that is just as challenging.

So this issue I filed is part of my effort to present people who land here from Google with better defaults.

simonw commented 1 month ago

Here's the new workflow:

name: Publish Python Package

on:
  release:
    types: [created]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
    steps:
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
        cache: pip
        cache-dependency-path: pyproject.toml
    - name: Install dependencies
      run: |
        pip install '.[test]'
    - name: Run tests
      run: |
        python -m pytest
  build:
    runs-on: ubuntu-latest
    needs: [test]
    steps:
    - uses: actions/checkout@v4
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: "3.12"
        cache: pip
        cache-dependency-path: pyproject.toml
    - name: Install dependencies
      run: |
        pip install setuptools wheel build
    - name: Build
      run: |
        python -m build
    - name: Store the distribution packages
      uses: actions/upload-artifact@v4
      with:
        name: python-packages
        path: dist/
  publish:
    name: Publish to PyPI
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/')
    needs: [build]
    environment: release
    permissions:
      id-token: write
    steps:
    - name: Download distribution packages
      uses: actions/download-artifact@v4
      with:
        name: python-packages
        path: dist/
    - name: Publish to PyPI
      uses: pypa/gh-action-pypi-publish@release/v1

I tested it by creating a new package called python-lib-issue-8 in a private repo and publishing it to PyPI: https://pypi.org/project/python-lib-issue-8/

I'm going to delete that package from PyPI now though since it's effectively useless.

simonw commented 1 month ago

Before I delete the PyPI package, here's confirmation that it works:

$ python -m pip install python-lib-issue-8 
Collecting python-lib-issue-8
  Downloading python_lib_issue_8-0.1a0-py3-none-any.whl (6.1 kB)
Installing collected packages: python-lib-issue-8
Successfully installed python-lib-issue-8-0.1a0
$ python
Python 3.10.10 (main, Mar 21 2023, 13:41:05) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import python_lib_issue_8
>>> python_lib_issue_8.example_function()
2
simonw commented 1 month ago

Project deleted:

(Screenshot, CleanShot 2024-08-20: PyPI confirmation that the project was deleted.)

simonw commented 1 month ago

@webknjaz does https://github.com/simonw/python-lib/issues/8#issuecomment-2299420075 look good to you? If so I'll duplicate it in my other cookiecutter templates - click-app and datasette-plugin and sqlite-utils-plugin and llm-plugin.

simonw commented 1 month ago

Is the if: startsWith(github.ref, 'refs/tags/') line there necessary? My publish.yml workflow already only triggers on this:

on:
  release:
    types: [created]
simonw commented 1 month ago

The declaration syntax can also be extended to include a URL:

-     environment: release
+     environment:
+       name: pypi
+       url: https://pypi.org/project/{% endraw %}{{ cookiecutter.hyphenated }}{% raw %}/${{ github.ref_name }}

What does including the url part in there actually do?

I have to admit I am very confused by GitHub Actions "environments" - I don't understand what they do, or what they are for.

webknjaz commented 1 month ago

@simonw so I recommend using specifically pypi in the name and rendering the url as shown. The URL is displayed in a few places on GH UI. For example, on the workflow run graph page. See the box with the publishing jobs @ https://github.com/cherrypy/cheroot/actions/runs/8786461123. Another place is the in-repo deployment pages, for example: https://github.com/cherrypy/cheroot/deployments/pypi (here you can also see that they are linked to commits and logs/CI runs).

GitHub has this thing called the Deployments API. It's meant to represent project deployments into different environments. It's integrated with webhooks, and you can add deployment information via the API. GitHub Environments are one of the UI presentations of that — quite poorly named, if you ask me.

Environments are visible in the repository settings, where you can also configure additional protection rules. You can also have environment-scoped secrets, which are better security-wise than the org- or repo-global ones. I used them all the time in the pre-trusted-publishing world. Nowadays, I use the required reviewers feature, which allows having a pause in the release pipeline.

If you apply an environment to a job in GHA CI/CD, that job will not start until the environment's conditions are met. So with required reviewers, you don't have to babysit the pipeline. You trigger it, and it safely goes through the unconditional jobs. Once it reaches the publishing job, it pauses and sends notifications to all required reviewers. Those notifications lead back to the workflow run page, which presents a button they need to click to approve the run. This is the last gate, and a chance to do some final manual checks or have some extra wait if needed. If you choose to opt out of the protection rules, the job just starts right away.
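Sketching how that gating attaches to a job (the pypi name and the project URL are illustrative; the required-reviewers rule itself is configured in the repository settings, not in the YAML):

```yaml
jobs:
  publish:
    runs-on: ubuntu-latest
    # The job waits here until the environment's protection rules
    # (e.g. required reviewers) approve it.
    environment:
      name: pypi
      url: https://pypi.org/project/my-package/  # placeholder project
    permissions:
      id-token: write
    steps:
      - uses: pypa/gh-action-pypi-publish@release/v1
```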

When configuring the trust on the PyPI side, you're asked to type in the environment name. If entered, PyPI will match it on upload and reject any attempts not originating from jobs with that environment selected. So this is another security aspect related to the feature.

To summarize, the GitHub Environments represent deployment targets — places where you put your dists. So it's only logical to name them for what they are. This is why I'm pushing for pypi and testpypi as they represent the publishing targets. Environments don't represent processes (like release). That's a misconception I'm seeing in many places.
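Following that naming, each upload target gets its own job and environment. A sketch for TestPyPI, using the action's repository-url input (the artifact name is assumed from the earlier example):

```yaml
jobs:
  publish-to-testpypi:
    runs-on: ubuntu-latest
    environment:
      name: testpypi  # must match the trusted publisher config on test.pypi.org
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: python-packages
          path: dist/
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          repository-url: https://test.pypi.org/legacy/
```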

If you were to use GitHub Pages with the modern GHA-based publishing method, an environment called github-pages is auto-created. Heroku also creates them. I've found a few in your repositories, too.

N.B. Technically, you don't have to create these environments via the GH UI in the repo settings; they are auto-created on first use in GHA. However, applying the protection settings I mentioned above would require access to the settings of the repositories where you do this. Having the environments configured in the repository template will not complicate things for the users: if they skip configuring the environments, it'll still work.

webknjaz commented 1 month ago

does #8 (comment) look good to you?

In addition to what I explained above, I also noticed something else that isn't strictly required but is usually nice to have: in your example, building the dists happens after testing. This means your tests run against something other than the artifacts you upload. I like building before testing, so that the tests can pick up the artifacts prepared for uploading and run against those instead (I even have a thing to help with that: https://github.com/re-actors/checkout-python-sdist). Though this is a separate effort, there is no reason to make building depend on testing. Even if you don't integrate the dists into the testing jobs, you can still smoke-test the builds in parallel with testing and have the publishing job depend on both build and tests.
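A sketch of that job graph: build first, test the built wheel, and have publishing depend on both (artifact names and step details are illustrative):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          python -m pip install build
          python -m build
      - uses: actions/upload-artifact@v4
        with:
          name: python-packages
          path: dist/

  test:
    runs-on: ubuntu-latest
    needs: [build]
    steps:
      - uses: actions/checkout@v4  # for the test suite itself
      - uses: actions/download-artifact@v4
        with:
          name: python-packages
          path: dist/
      - run: |
          # Install the exact wheel that would be published, then test it.
          python -m pip install dist/*.whl pytest
          python -m pytest

  publish:
    runs-on: ubuntu-latest
    needs: [build, test]
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: python-packages
          path: dist/
      - uses: pypa/gh-action-pypi-publish@release/v1
```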

Another thing to run continuously is twine check --strict. My action runs it under the hood, but it's nicer to get alerted about problems in the dist metadata early rather than only when publishing fails.
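As a sketch, such a check could be a standalone step right after building (the step name and placement are illustrative; twine check --strict itself is a real twine command):

```yaml
      - name: Check dist metadata
        run: |
          python -m pip install twine
          python -m twine check --strict dist/*
```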