Kinda self-explanatory from the title: right now each Python version for a given target builds llama.cpp independently. This artificially limits how many platforms we can support by blowing up CI build times.
Since we aren't actually linking against the Python API at all, each Python version on any given platform is essentially building the same llama.cpp shared library. If we can cache or re-use a single pre-built library, this should speed up CI build times significantly.
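To illustrate the idea, here is a minimal Python sketch of the caching logic, assuming the cache is keyed on the llama.cpp source commit plus the target platform (so it is deliberately independent of the Python version). The function names, paths, and `libllama.so` filename are hypothetical, not the actual build-system implementation:

```python
import hashlib
from pathlib import Path
from typing import Callable

def cache_key(source_commit: str, platform: str) -> str:
    """Cache key that ignores the Python version: the same llama.cpp
    commit on the same platform always maps to the same prebuilt library."""
    return hashlib.sha256(f"{source_commit}-{platform}".encode()).hexdigest()[:16]

def get_or_build(cache_dir: Path, key: str, build: Callable[[Path], None]) -> Path:
    """Return the cached shared library for `key`, running the expensive
    compile step only on a cache miss (at most once per key)."""
    lib = cache_dir / key / "libllama.so"  # hypothetical cache layout
    if not lib.exists():
        lib.parent.mkdir(parents=True, exist_ok=True)
        build(lib)  # e.g. invoke cmake and copy the resulting shared library
    return lib
```

Every per-Python-version wheel build would then call `get_or_build` with the same key and copy the prebuilt library into the wheel, instead of compiling llama.cpp again.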