actions / setup-python

Set up your GitHub Actions workflow with a specific version of Python

Caching doesn't seem to save that much time #881

Open hamirmahal opened 4 weeks ago

hamirmahal commented 4 weeks ago

Description: Caching seems to skip only the download step. It would be nice if it also skipped the "Collecting" and "Using" steps altogether, since that might save much more time.

Action version: v5

Platform:

Runner type:

Tools version:

Run actions/setup-python@v5
  with:
    cache: pip
    python-version: 3.11
    check-latest: false
    token: ***
    update-environment: true
    allow-prereleases: false
Installed versions
  Successfully set up CPython (3.11.9)

Repro steps:
https://github.com/hamirmahal/cache-pip-install/actions/runs/8963287619/job/24613359851#step:4:1

Expected behavior: Caching saves a lot more time.

Actual behavior: Caching only saves about 6s or so on an install step that otherwise takes a minute.

hamirmahal commented 4 weeks ago

https://github.com/hamirmahal/cache-pip-install/commits/continue-caching-with-setup-python/

hamirmahal commented 4 weeks ago

Initial Run

Run / python-program (push) Successful in 1m

Cached Run

Run / python-program (push) Successful in 54s

HarithaVattikuti commented 3 weeks ago

Hello @hamirmahal, thank you for creating this issue. We will investigate it and get back to you as soon as we have some feedback.

hamirmahal commented 3 weeks ago

You're welcome @HarithaVattikuti.

kurtmckee commented 3 weeks ago

@hamirmahal I'm not a developer but want to respond to your ticket.

Diagnosis

The pip install -r requirements.txt step in your workflow benefits from caching. However, the setup-python action doesn't control pip's behavior, and cannot reduce the number of "Collecting" and "Using" lines that you're seeing.

That said, printing the "Collecting" and "Using" lines isn't what consumes the time -- the installation itself takes the vast majority of it. You can verify this by reviewing the raw logs, which contain a timestamp for every output line.

Looking at those raw logs, pip spends ~6 seconds (from 2024-05-06T02:41:05.2881468Z to 2024-05-06T02:41:11.4118495Z) printing "Collecting" and "Using" lines. It then spends ~26 seconds (from 2024-05-06T02:41:11.4118495Z to 2024-05-06T02:41:37.9340734Z) actually installing your dependencies.
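
If you want to check that arithmetic yourself, here is a small Python sketch that subtracts the raw-log timestamps quoted above. The helper names (`parse_log_timestamp`, `elapsed_seconds`) are made up for this illustration; the only assumption is that the log timestamps are UTC with seven fractional digits, as in the values quoted.

```python
from datetime import datetime, timezone


def parse_log_timestamp(stamp: str) -> datetime:
    """Parse a raw-log timestamp such as 2024-05-06T02:41:05.2881468Z."""
    # The logs carry 7 fractional digits, but datetime stores microseconds
    # (6 digits), so the fraction is truncated before parsing.
    head, _, fraction = stamp.rstrip("Z").partition(".")
    fraction = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{head}.{fraction}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two raw-log timestamps."""
    return (parse_log_timestamp(end) - parse_log_timestamp(start)).total_seconds()


# Timestamps copied from the raw logs discussed above.
resolving = elapsed_seconds("2024-05-06T02:41:05.2881468Z", "2024-05-06T02:41:11.4118495Z")
installing = elapsed_seconds("2024-05-06T02:41:11.4118495Z", "2024-05-06T02:41:37.9340734Z")

print(f"Collecting/Using: {resolving:.1f}s, installing: {installing:.1f}s")
# prints "Collecting/Using: 6.1s, installing: 26.5s"
```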

Best practice

My recommendation is to disable setup-python's pip caching entirely and focus exclusively on caching an entire virtual environment.

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        id: setup-python
        with:
          python-version: "3.11"

      # Write the exact Python version to a file for cache-busting.
      - run: |
          echo "${{ steps.setup-python.outputs.python-version }}" > ".installed-python"

      # THIS! This is where you'll save all your time!
      # Never cache pip dependencies! Cache virtual environments!
      - uses: "actions/cache@v4"
        id: "restore-cache"
        with:
          key: "venv-${{ hashFiles('.installed-python', 'requirements.txt') }}"
          path: |
            .venv/

      # If Python 3.11.x upgrades to 3.11.y, or if requirements.txt gets updated,
      # the cache lookup above will miss, and the venv needs to be recreated.
      - name: "Create a virtual environment"
        if: "steps.restore-cache.outputs.cache-hit != 'true'"
        run: |
          python -m venv .venv
          .venv/bin/python -m pip install --upgrade pip setuptools wheel
          .venv/bin/python -m pip install -r requirements.txt

      - run: .venv/bin/python src/main.py

Caching an entire virtual environment will save you far more time than caching pip's downloads, because a cache hit skips the installation step entirely. The only thing you need to watch out for is cache-busting. The example above busts the cache based on the exact Python version (like "3.11.7") and on the contents of your requirements.txt file. If you're running this workflow on multiple platforms, you'll need to include the platform in the cache key, too.
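
To illustrate why each of those inputs belongs in the key, here is a rough Python sketch of the idea. It is only an analogue of a hash-based cache key, not GitHub's actual `hashFiles` algorithm, and the function name `venv_cache_key` and its parameters are hypothetical. In a real workflow the platform part would typically come from the `runner.os` context.

```python
import hashlib


def venv_cache_key(python_version: str, requirements_text: str, os_name: str) -> str:
    """Build an illustrative cache key from everything that affects the venv.

    Hypothetical helper: it mimics the idea of hashing the inputs, not the
    exact digest GitHub Actions computes.
    """
    digest = hashlib.sha256()
    for part in (python_version, requirements_text):
        # Hash each input separately so that moving text between inputs
        # cannot produce the same combined digest.
        digest.update(hashlib.sha256(part.encode("utf-8")).digest())
    return f"venv-{os_name}-{digest.hexdigest()}"


requirements = "requests==2.31.0\n"  # made-up requirements.txt contents

baseline = venv_cache_key("3.11.7", requirements, "Linux")
python_upgraded = venv_cache_key("3.11.9", requirements, "Linux")
deps_changed = venv_cache_key("3.11.7", requirements + "rich==13.7.1\n", "Linux")
other_platform = venv_cache_key("3.11.7", requirements, "Windows")

# Any change to the Python version, the requirements, or the platform yields
# a different key, so a stale venv is never restored.
print(len({baseline, python_upgraded, deps_changed, other_platform}))  # prints 4
```

The same inputs always produce the same key, which is what lets an unchanged workflow hit the cache on the next run.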

@HarithaVattikuti I think that this ticket can be closed; the report is referring to pip behavior, not setup-python behavior.