nix-community / cache-nix-action

Cache Nix store in GitHub Actions to speed up workflows [maintainer=@deemp]
MIT License

Cache Nix action

A GitHub Action to restore and save Nix store paths (and other paths) using the GitHub Actions cache.

This action is based on actions/cache.

What it can do

A typical job

  1. The nix-quick-install-action installs Nix in single-user mode.

  2. Restore phase:

    1. The cache-nix-action tries to restore a cache whose key is the same as the primary key.

    2. When it can't restore, the cache-nix-action tries to restore a cache whose key matches a prefix in a given list of key prefixes.

    3. The cache-nix-action restores all caches whose keys match some of the prefixes in another given list of key prefixes.

  3. Other job steps run.

  4. Post Restore phase:

    1. The cache-nix-action purges caches whose keys are the same as the primary key and that were created more than a given time ago.

    2. When there's no cache whose key is the same as the primary key, the cache-nix-action collects garbage in the Nix store and saves a new cache.

    3. The cache-nix-action purges caches whose keys match some of the given prefixes in a given list of key prefixes and that were created more than a given time ago relative to the start of the Post Restore phase.

Limitations

Comparison with alternative approaches

See Caching Approaches.

Additional actions

Example steps

- uses: nixbuild/nix-quick-install-action@v27

- name: Restore and cache Nix store
  uses: nix-community/cache-nix-action@v5
  with:
    # restore and save a cache using this key
    primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
    # if there's no cache hit, restore a cache by this prefix
    restore-prefixes-first-match: nix-${{ runner.os }}-
    # collect garbage until Nix store size (in bytes) is at most this number
    # before trying to save a new cache
    gc-max-store-size-linux: 1073741824
    # do purge caches
    purge: true
    # purge all versions of the cache
    purge-prefixes: nix-${{ runner.os }}-
    # created more than this number of seconds ago relative to the start of the `Post Restore` phase
    purge-created: 0
    # except the version with the `primary-key`, if it exists
    purge-primary-key: never

Example workflow

See ci.yaml and its runs.

Configuration

See action.yml.

Inputs

name description required default
primary-key
  • When a non-empty string, the action uses this key for restoring and saving a cache.
  • Otherwise, the action fails.
true ""
restore-prefixes-first-match
  • When a newline-separated non-empty list of non-empty key prefixes, when there's a miss on the primary-key, the action searches in this list for the first prefix for which there exists a cache with a matching key and the action tries to restore that cache.
  • Otherwise, this input has no effect.
false ""
restore-prefixes-all-matches
  • When a newline-separated non-empty list of non-empty key prefixes, the action tries to restore all caches whose keys match these prefixes.
  • Tries caches across all refs to make use of caches created on the current, base, and default branches (see docs).
  • Otherwise, this input has no effect.
false ""
skip-restore-on-hit-primary-key
  • Can have an effect only when restore-prefixes-first-match has no effect.
  • When true, when there's a hit on the primary-key, the action doesn't restore caches.
  • Otherwise, the action restores caches.
false false
fail-on
  • Input form: <key type>.<result>.
  • <key type> options: primary-key, first-match.
  • <result> options: miss, not-restored.
  • When the input satisfies the input form, when the event described in the input happens, the action fails.
  • Example:
    • Input: primary-key.not-restored.
    • Event: a cache could not be restored via the primary-key.
  • Otherwise, this input has no effect.
false ""
nix
  • Can have an effect only when the action runs on a Linux or a macOS runner.
  • When true, the action can do Nix-specific things.
  • Otherwise, the action doesn't do them.
false true
save
  • When true, the action can save a cache with the primary-key.
  • Otherwise, the action can't save a cache.
false true
paths
  • When nix: true, the action uses ["/nix", "~/.cache/nix", "~root/.cache/nix"] as default paths, as suggested here.
  • Otherwise, the action uses an empty list as default paths.
  • When a newline-separated non-empty list of non-empty path patterns (see @actions/glob for supported patterns), the action appends it to default paths and uses the resulting list for restoring and saving caches.
  • Otherwise, the action uses default paths for restoring and saving caches.
false ""
paths-macos
  • Overrides paths.
  • Can have an effect only when the action runs on a macOS runner.
false ""
paths-linux
  • Overrides paths.
  • Can have an effect only when the action runs on a Linux runner.
false ""
backend

Choose an implementation of the cache package.

false actions
gc-max-store-size
  • Can have an effect only when nix: true, save: true.
  • When a number, the action collects garbage until Nix store size (in bytes) is at most this number just before the action tries to save a new cache.
  • Otherwise, this input has no effect.
false ""
gc-max-store-size-macos
  • Overrides gc-max-store-size.
  • Can have an effect only when the action runs on a macOS runner.
false ""
gc-max-store-size-linux
  • Overrides gc-max-store-size.
  • Can have an effect only when the action runs on a Linux runner.
false ""
purge
  • When true, the action purges (possibly zero) caches.
  • Otherwise, this input has no effect.
false false
purge-primary-key
  • Can have an effect only when purge: true.
  • When always, the action always purges the cache with the primary-key.
  • When never, the action never purges the cache with the primary-key.
  • Otherwise, this input has no effect.
false ""
purge-prefixes
  • Can have an effect only when purge: true.
  • When a newline-separated non-empty list of non-empty cache key prefixes, the action selects for purging all caches whose keys match some of these prefixes and that are scoped to the current GITHUB_REF.
  • Otherwise, this input has no effect.
false ""
purge-last-accessed
  • Can have an effect only when purge: true.
  • When a non-negative number, the action purges selected caches that were last accessed more than this number of seconds ago relative to the start of the Post Restore phase.
  • Otherwise, this input has no effect.
false ""
purge-created
  • Can have an effect only when purge: true.
  • When a non-negative number, the action purges selected caches that were created more than this number of seconds ago relative to the start of the Post Restore phase.
  • Otherwise, this input has no effect.
false ""
upload-chunk-size
  • When a non-negative number, the action uses it as the chunk size (in bytes) to split up large files during upload.
  • Otherwise, the action uses the default value 33554432 (32MB).
false ""
save-always

Run the post step to save the cache even if another step before fails.

false false
token

The action uses it to communicate with GitHub API.

false ${{ github.token }}
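
As a rough illustration of how several of these inputs combine, the step below restores a Nix store without ever saving one, which may suit jobs that should only read caches. The key scheme (including the use of `github.ref_name`) is an assumption made up for this sketch; only the input names come from the table above.

```yaml
- name: Restore Nix store without saving (hypothetical key scheme)
  uses: nix-community/cache-nix-action@v5
  with:
    # exact key tried first
    primary-key: nix-${{ runner.os }}-${{ github.ref_name }}-${{ hashFiles('**/*.nix') }}
    # on a primary-key miss, prefixes are searched in order;
    # the first prefix that has a matching cache is used
    restore-prefixes-first-match: |
      nix-${{ runner.os }}-${{ github.ref_name }}-
      nix-${{ runner.os }}-
    # this job never saves a cache
    save: false
```

Listing the more specific prefix first makes the action prefer a cache from the same ref before falling back to any cache for the same OS.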

Outputs

name description
primary-key
  • A string.
  • The primary-key.
hit
  • A boolean value.
  • true when hit-primary-key is true or hit-first-match is true.
  • false otherwise.
hit-primary-key
  • A boolean value.
  • true when there was a hit on the primary-key.
  • false otherwise.
hit-first-match
  • A boolean value.
  • true when there was a hit on a key matching restore-prefixes-first-match.
  • false otherwise.
restored-key
  • A string.
  • The key of a cache restored via the primary-key or via the restore-prefixes-first-match.
  • An empty string otherwise.
restored-keys
  • A possibly empty array of strings (JSON).
  • Keys of restored caches.
  • Example: ["key1", "key2"].
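
The outputs above can be read from later steps through the `steps` context. The following is a minimal sketch; the step id `nix-cache` and the environment variable names are arbitrary choices for illustration.

```yaml
- name: Restore and cache Nix store
  id: nix-cache # hypothetical step id, referenced below
  uses: nix-community/cache-nix-action@v5
  with:
    primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
    restore-prefixes-first-match: nix-${{ runner.os }}-

- name: Report cache results
  env:
    HIT: ${{ steps.nix-cache.outputs.hit }}
    HIT_PRIMARY_KEY: ${{ steps.nix-cache.outputs.hit-primary-key }}
    HIT_FIRST_MATCH: ${{ steps.nix-cache.outputs.hit-first-match }}
    RESTORED_KEY: ${{ steps.nix-cache.outputs.restored-key }}
    RESTORED_KEYS: ${{ steps.nix-cache.outputs.restored-keys }}
  run: |
    echo "hit: $HIT"
    echo "hit on primary key: $HIT_PRIMARY_KEY"
    echo "hit on first match: $HIT_FIRST_MATCH"
    echo "restored key: $RESTORED_KEY"
    echo "all restored keys (JSON): $RESTORED_KEYS"

- name: Run only when nothing was restored via a key or prefix
  # step outputs are strings, so compare against the string 'true'
  if: steps.nix-cache.outputs.hit != 'true'
  run: echo "No cache hit"
```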

Troubleshooting

Garbage collection parameters

On Linux runners, when gc-max-store-size-linux is set to a number, the cache-nix-action will run nix store gc --max R before saving a cache. Here, R is max(0, S - gc-max-store-size-linux), where S is the current store size.

Respective conditions hold for macOS runners.
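
A worked example of the formula, as a sketch: the store size of 3 GiB is an assumed figure, and the limits are arbitrary.

```yaml
- uses: nix-community/cache-nix-action@v5
  with:
    primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
    # Limit the store to 1 GiB (1073741824 bytes) before saving on Linux.
    # If the store size is S = 3221225472 bytes (3 GiB) at that point, then
    # R = max(0, 3221225472 - 1073741824) = 2147483648,
    # so the action would run `nix store gc --max 2147483648`.
    # If the store is already at or below the limit, R = 0.
    gc-max-store-size-linux: 1073741824
    # The same rule applies on macOS runners, here with a 2 GiB limit.
    gc-max-store-size-macos: 2147483648
```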

There are alternative approaches to garbage collection (see Garbage collection).

Purge old caches

The cache-nix-action lets you delete old caches after saving a new cache (see purge-* inputs in Inputs and the compare-run-times job in the Example workflow).

The purge-cache action lets you remove caches based on their last accessed or created time without branch limitations.

Alternatively, you can use the GitHub Actions Cache API.
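
A minimal sketch of purging with the cache-nix-action itself, assuming keys that start with `nix-<OS>-` and a retention period of seven days (both chosen only for illustration):

```yaml
- uses: nix-community/cache-nix-action@v5
  with:
    primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
    # enable purging
    purge: true
    # select caches whose keys start with this prefix
    purge-prefixes: nix-${{ runner.os }}-
    # purge selected caches last accessed more than 7 days ago
    # (7 * 24 * 60 * 60 = 604800 seconds)
    purge-last-accessed: 604800
    # keep the cache with the primary-key
    purge-primary-key: never
```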

Merge caches

GitHub evicts least recently used caches when their total size exceeds 10GB (see Limitations).

If you have multiple similar caches produced on runners with the same OS (Linux or macOS), you can merge them into a single cache and store just it to save space.

In short:

  1. Matrix jobs produce similar individual caches.
  2. The next job restores all of these individual caches, saves a common cache, and purges individual caches.
  3. On subsequent runs, matrix jobs use the common cache.

See the make-similar-caches and merge-similar-caches jobs in the example workflow.

Pros: if N individual caches are very similar, a common cache will take approximately N times less space.

Cons: if caches aren't very similar, run time may increase due to a bigger common cache.
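
The three steps can be sketched roughly as follows. This is not the example workflow itself: the key names (`nix-individual-...`, `nix-common-...`) and the matrix values are hypothetical, and only the job names follow the ones mentioned above.

```yaml
jobs:
  make-similar-caches:
    strategy:
      matrix:
        id: [1, 2]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nixbuild/nix-quick-install-action@v27
      - uses: nix-community/cache-nix-action@v5
        with:
          # one individual cache per matrix entry
          primary-key: nix-individual-${{ runner.os }}-${{ matrix.id }}-${{ hashFiles('**/*.nix') }}
          # on later runs, start from the common cache if it exists
          restore-prefixes-first-match: nix-common-${{ runner.os }}-
      # build steps go here

  merge-similar-caches:
    needs: make-similar-caches
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nixbuild/nix-quick-install-action@v27
      - uses: nix-community/cache-nix-action@v5
        with:
          # the common cache
          primary-key: nix-common-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
          # restore every individual cache into the same store
          restore-prefixes-all-matches: nix-individual-${{ runner.os }}-
          # then purge the individual caches, whatever their age
          purge: true
          purge-prefixes: nix-individual-${{ runner.os }}-
          purge-created: 0
          purge-primary-key: never
```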

Caching approaches

Discussed in more detail here and here.

Caching approaches work at different "distances" from the /nix/store of a GitHub Actions runner. These distances affect the restore and save speed.

GitHub Actions

cache-nix-action

Pros:

Cons: see Limitations

magic-nix-cache-action

Pros (link):

Cons:

actions/cache

If used with nix-quick-install-action, it's similar to the cache-nix-action.

If used with install-nix-action and a chroot local store:

Pros:

Cons:

If used with install-nix-action and this trick, it's similar to the cache-nix-action, but slower (link).

Hosted binary caches

See binary cache, HTTP Binary Cache Store.

Pros:

Cons:

Garbage collection

When a Nix store is restored from a cache, it may contain old, unnecessary paths. These paths should be removed from time to time to limit the cache size and keep the restore and save steps fast.

Garbage collection approach 1

Produce a cache once, use it multiple times. Don't collect garbage.

Advantages:

Disadvantages:

Garbage collection approach 2

Collect garbage before saving a cache.

Advantages:

Disadvantages:

Save a path from garbage collection

Garbage collection approaches

Contribute

Clone the repository.

git clone --recurse-submodules https://github.com/nix-community/cache-nix-action

Cache action

Tests

Documentation

See "Caching dependencies to speed up workflows".

What's New

v4

v3

See the v2 README.md for older updates.

Usage

Pre-requisites

Create a workflow .yml file in your repository's .github/workflows directory. An example workflow is available below. For more information, see the GitHub Help Documentation for Creating a workflow file.

If you are using this inside a container, a POSIX-compliant tar needs to be included and accessible from the execution path.

If you are using a self-hosted Windows runner, GNU tar and zstd are required for Cross-OS caching to work. They are also recommended to be installed in general so the performance is on par with hosted Windows runners.

Environment Variables

Cache scopes

The cache is scoped to the key, version, and branch. The default branch cache is available to other branches.

See Matching a cache key for more info.

Example cache workflow

Restoring and saving cache using a single action

name: Caching Primes

on: push

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Cache Primes
        id: cache-primes
        uses: nix-community/cache-nix-action@v5
        with:
          primary-key: ${{ runner.os }}-primes
          paths: prime-numbers

      - name: Generate Prime Numbers
        if: steps.cache-primes.outputs.hit-primary-key != 'true'
        run: ./generate-primes.sh -d prime-numbers

      - name: Use Prime Numbers
        run: ./primes.sh -d prime-numbers

The action provides a hit-primary-key output, which is true when a cache is restored using the primary key and false when a cache is restored using restore-prefixes-first-match or no cache is restored.

Using a combination of restore and save actions

name: Caching Primes

on: push

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Restore cached Primes
        id: cache-primes-restore
        uses: nix-community/cache-nix-action/restore@v5
        with:
          primary-key: ${{ runner.os }}-primes
          paths: |
            path/to/dependencies
            some/other/dependencies

      # other steps

      - name: Save Primes
        id: cache-primes-save
        uses: nix-community/cache-nix-action/save@v5
        with:
          primary-key: ${{ steps.cache-primes-restore.outputs.primary-key }}
          paths: |
            path/to/dependencies
            some/other/dependencies

Note: You must use the cache or restore action in your workflow before any step that needs the files that might be restored from the cache. If the provided key matches an existing cache, no new cache is created. If the provided key doesn't match an existing cache, a new cache is created automatically, provided the job completes successfully.

Caching Strategies

With the introduction of the restore and save actions, a lot of caching use cases can now be achieved. Please see the caching strategies document for understanding how you can use the actions strategically to achieve the desired goal.

Implementation Examples

Every programming language and framework has its own way of caching.

See Examples for a list of actions/cache implementations for use with:

Creating a cache key

A cache key can include any of the contexts, functions, literals, and operators supported by GitHub Actions.

For example, using the hashFiles function allows you to create a new cache when dependencies change.

- uses: nix-community/cache-nix-action@v5
  with:
    primary-key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
    paths: |
      path/to/dependencies
      some/other/dependencies

Additionally, you can use arbitrary command output in a cache key, such as a date or software version:

# http://man7.org/linux/man-pages/man1/date.1.html
- name: Get Date
  id: get-date
  run: echo "date=$(/bin/date -u "+%Y%m%d")" >> $GITHUB_OUTPUT
  shell: bash

- uses: nix-community/cache-nix-action@v5
  with:
    primary-key: ${{ runner.os }}-${{ steps.get-date.outputs.date }}-${{ hashFiles('**/lockfiles') }}
    paths: path/to/dependencies

See Using contexts to create cache keys.

Cache Limits

A repository can have up to 10GB of caches. Once the 10GB limit is reached, older caches will be evicted based on when the cache was last accessed. Caches that are not accessed within the last week will also be evicted.

Skipping steps based on cache hit

Using the hit-primary-key output, subsequent steps (such as install or build) can be skipped when a cache hit occurs on the primary key. It is recommended to install missing/updated dependencies in case of a partial key match when the key is dependent on the hash of the package file.

Example:

steps:
  - uses: actions/checkout@v4

  - uses: nix-community/cache-nix-action@v5
    id: cache
    with:
      primary-key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
      paths: path/to/dependencies

  - name: Install Dependencies
    if: steps.cache.outputs.hit-primary-key != 'true'
    run: ./install.sh

Note: The id defined in nix-community/cache-nix-action must match the [id] in the if statement (i.e. steps.[id].outputs.hit-primary-key).

Cache Version

The cache version is a hash generated for a combination of the compression tool used (Gzip, Zstd, etc., based on the runner OS) and the paths of the directories being cached. If two caches have different versions, they are identified as unique caches while matching. This means, for example, that a cache created on a windows-latest runner can't be restored on ubuntu-latest because the cache versions are different.

Pro tip: The list caches API can be used to get the version of a cache. This can be helpful to troubleshoot cache miss due to version.
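
One possible way to inspect versions from inside a workflow is to call the list caches endpoint with the `gh` CLI. This is a sketch under a few assumptions: `gh` is available on the runner (it is preinstalled on GitHub-hosted runners), and the job token is allowed to read Actions caches.

```yaml
- name: List cache keys and versions
  env:
    # the gh CLI reads its token from GH_TOKEN
    GH_TOKEN: ${{ github.token }}
  run: |
    # GET /repos/{owner}/{repo}/actions/caches returns an `actions_caches` array
    gh api "repos/${{ github.repository }}/actions/caches" \
      --jq '.actions_caches[] | {key, version, ref}'
```

Two entries with the same `key` but different `version` values are distinct caches.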

Example

The workflow below creates 3 unique caches with the same key. `Linux` and `Windows` runners use different compression techniques and hence create two different caches. In addition, `build-linux` creates two different caches because the `paths` are different.

```yaml
jobs:
  build-linux:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Cache Primes
        id: cache-primes
        uses: nix-community/cache-nix-action@v5
        with:
          primary-key: primes
          paths: prime-numbers

      - name: Generate Prime Numbers
        if: steps.cache-primes.outputs.hit-primary-key != 'true'
        run: ./generate-primes.sh -d prime-numbers

      - name: Cache Numbers
        id: cache-numbers
        uses: nix-community/cache-nix-action@v5
        with:
          primary-key: primes
          paths: numbers

      - name: Generate Numbers
        if: steps.cache-numbers.outputs.hit-primary-key != 'true'
        run: ./generate-primes.sh -d numbers

  build-windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4

      - name: Cache Primes
        id: cache-primes
        uses: nix-community/cache-nix-action@v5
        with:
          primary-key: primes
          paths: prime-numbers

      - name: Generate Prime Numbers
        if: steps.cache-primes.outputs.hit-primary-key != 'true'
        run: ./generate-primes -d prime-numbers
```

Known practices and workarounds

There are a number of community practices/workarounds to fulfill specific requirements. You may choose to use them if they suit your use case. Note these are not necessarily the only solution or even a recommended solution.

Cache segment restore timeout

A cache gets downloaded in multiple segments of fixed sizes (1GB for a 32-bit runner and 2GB for a 64-bit runner). Sometimes, a segment download gets stuck which causes the workflow job to be stuck forever and fail. Version v3.0.8 of actions/cache introduces a segment download timeout. The segment download timeout will allow the segment download to get aborted and hence allow the job to proceed with a cache miss.

Default value of this timeout is 10 minutes and can be customized by specifying an environment variable named SEGMENT_DOWNLOAD_TIMEOUT_MINS with timeout value in minutes.
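
A minimal sketch of setting the variable on the cache step, assuming the cache-nix-action honors it the same way actions/cache does; the 5-minute value is arbitrary.

```yaml
- uses: nix-community/cache-nix-action@v5
  env:
    # abort a stuck segment download after 5 minutes instead of the default 10
    SEGMENT_DOWNLOAD_TIMEOUT_MINS: 5
  with:
    primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
```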

Update a cache

A cache today is immutable and cannot be updated. But some use cases require the cache to be saved even though there was a hit during the Restore phase. To do so, always purge old versions of that cache:

- name: update cache on every commit
  uses: nix-community/cache-nix-action@v5
  with:
    primary-key: primes-${{ runner.os }}
    paths: prime-numbers
    purge: true
    purge-primary-key: always

Please note that this will create a new cache on every run and hence will consume the cache quota.

Use cache across feature branches

Reusing a cache across feature branches is not allowed today, in order to provide cache isolation. However, if both feature branches are based on the default branch, a good way to achieve this is to ensure that the default branch has a cache. This cache will then be consumable by both feature branches.
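
One way to make sure the default branch has a cache is to run the caching job on pushes to it as well as on pull requests. The sketch below assumes the default branch is called `main` and uses an illustrative key scheme.

```yaml
on:
  push:
    branches:
      - main # assumption: the default branch is called main
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nixbuild/nix-quick-install-action@v27
      - uses: nix-community/cache-nix-action@v5
        with:
          primary-key: nix-${{ runner.os }}-${{ hashFiles('**/*.nix') }}
          # feature branches can fall back to a cache saved on the default branch
          restore-prefixes-first-match: nix-${{ runner.os }}-
      # build steps go here
```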

Force deletion of caches overriding default cache eviction policy

Caches are restricted by branch scope. If caches for a specific branch use a lot of storage quota, more frequently used caches from the default branch may get thrashed. For example, if many pull requests on a repo create caches, these caches cannot be used in the default branch scope but still occupy a lot of space until the eviction policy cleans them up. Sometimes we want to clean them up on a faster cadence to ensure the default branch is not thrashing. To achieve this, the gh-actions-cache CLI can be used to delete caches for specific branches.

This workflow uses gh-actions-cache to delete all the caches created by a branch.

Example

```yaml
name: cleanup caches by a branch
on:
  pull_request:
    types:
      - closed
  workflow_dispatch:

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      # `actions:write` permission is required to delete caches
      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
      actions: write
      contents: read
    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Cleanup
        run: |
          gh extension install actions/gh-actions-cache

          REPO=${{ github.repository }}
          BRANCH=refs/pull/${{ github.event.pull_request.number }}/merge

          echo "Fetching list of cache key"
          cacheKeysForPR=$(gh actions-cache list -R $REPO -B $BRANCH | cut -f 1 )

          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "Deleting caches..."
          for cacheKey in $cacheKeysForPR
          do
              gh actions-cache delete $cacheKey -R $REPO -B $BRANCH --confirm
          done
          echo "Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

License

The scripts and documentation in this project are released under the MIT License.