actions / cache

Cache dependencies and build outputs in GitHub Actions
MIT License

How to do fine-grained caching: bulk APIs? #1280

Open huonw opened 1 year ago

huonw commented 1 year ago

We're trying to turbo-charge our builds via fine-grained caching with the Pants build system. Pants recently gained experimental support for using the GitHub Actions Cache as a fine-grained "remote cache". The aim is to get the benefits discussed in https://dev.to/benjyw/better-cicd-caching-with-new-gen-build-systems-3aem: reusing test runs and build artefacts from previous runs, while downloading only what's required.

However, we find it doesn't work well in practice for us, even on a moderately sized repository, because fine-grained caching quickly hits rate limits: it has to upload and/or download thousands of small "files" via individual requests. See https://github.com/pantsbuild/pants/issues/20133

Are there any bulk APIs or other recommendations for how to best do the following:

  1. Check whether several cache entries exist
  2. Upload several new cache entries
  3. Download several cache entries

Alternatively, is there some other way to use the cache for many small requests?
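
To illustrate what "some other way" could mean, one possible client-side workaround is to bundle many small entries into a single archive, so that one cache request covers a whole group of entries. The following is only a sketch under assumptions: `pack_entries` and `unpack_entries` are hypothetical helper names, not part of Pants or of any GitHub API, and the actual upload and download of the resulting bytes is left out.

```python
import io
import tarfile


def pack_entries(entries: dict[str, bytes]) -> bytes:
    """Bundle many small blobs (keyed by a digest-like name) into one gzipped tar.

    Hypothetical helper: the returned bytes would be stored as ONE cache entry
    instead of len(entries) separate ones.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for key in sorted(entries):  # sorted for a stable member order
            data = entries[key]
            info = tarfile.TarInfo(name=key)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_entries(bundle: bytes, wanted: set[str]) -> dict[str, bytes]:
    """Extract only the requested keys from a bundle made by pack_entries."""
    found: dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as tar:
        for member in tar:
            if member.name in wanted:
                fileobj = tar.extractfile(member)
                if fileobj is not None:
                    found[member.name] = fileobj.read()
    return found


if __name__ == "__main__":
    blobs = {"digest-a": b"test result", "digest-b": b"build output"}
    bundle = pack_entries(blobs)
    print(sorted(unpack_entries(bundle, {"digest-a"})))
```

The trade-off is that bundling gives up some of the "download only what's required" property: fetching one entry means fetching the whole bundle it lives in, which is why a server-side bulk API would still be preferable.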

This might benefit more than just Pants: for example, https://github.com/mozilla/sccache also has a GHA cache backend, and hits errors like this (https://github.com/mozilla/sccache/issues/1485).

(I asked this question of support (#2409822), and they told me to ask here instead, even though it's not directly related to the code in this repo.)

Thanks!

github-actions[bot] commented 5 months ago

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

huonw commented 5 months ago

Pants still sees users affected by this, e.g.: https://chat.pantsbuild.org/t/18821099/for-the-experimental-gha-remote-caching-new-in-2-20-https-ww#97b93a14-6ea3-4234-9ecb-35a68e1a70f2

achimnol commented 4 months ago

Looking forward to seeing improvements here, as I'm also planning to adopt the fine-grained GHA cache for pantsbuild. We are continuously hitting the cache size limit (10 GB per repo), and LFS caching gets degraded too often and too quickly, resulting in additional costs for data packs.

thesayyn commented 3 months ago

We are also interested in this for Bazel implementations.