microsoft / azure-pipelines-tasks

Tasks for Azure Pipelines
https://aka.ms/tfbuild
MIT License
3.5k stars 2.61k forks

[enhancement]: Enable Cache Overwrite #20206

Open markbrockhoff opened 3 months ago

markbrockhoff commented 3 months ago

Task name

Cache

Describe your feature request here

Hi, I'm opening this issue to continue the already closed issue #18708. (If you'd prefer to re-open the old issue and discuss there, feel free to do so and close this one)

Right now it seems that the Cache@2 task has no option to overwrite a previously created cache for the same cache key. This makes it nearly useless with monorepo build and caching tools like Turborepo or nx (without using their remote caching features). In these scenarios it isn't really possible to define a good cache key, because the result of the last build should always be cached for subsequent builds.

I propose adding an "overwrite" option to the Cache@2 task. Setting it to true should overwrite the cache for the current key and scope even if one already exists. It should also create the cache for the current scope if it doesn't exist yet. This would come in handy when there might already be a cache for the main branch but not for the feature branch. The first run for the feature branch could then use the main branch's cache initially and create its own at the end of the run.

Example:

- task: Cache@2
  displayName: "♻️ Cache Turborepo"
  inputs:
    key: 'turbo | "$(Agent.OS)"'
    path: $(TURBO_CACHE_DIR)
    overwrite: true

DamienCassou commented 2 months ago

A workaround is to pass an always-changing value as the last key element (forcing a write) and use restoreKeys to fall back to an earlier cache (allowing reads in other jobs). The following was adapted from a comment by @hlavacek:

- task: Cache@2
  inputs:
    key: '"turbo" | "$(Agent.OS)" | "$(Build.SourceBranch)" | "$(Build.SourceVersion)"'
    # the keys are set up to first try the current branch or try to fetch main otherwise, to use main branch cache for feature branches
    restoreKeys: |
      turbo | "$(Agent.OS)" | "$(Build.SourceBranch)"
      turbo | "$(Agent.OS)" | refs/heads/main
      turbo | "$(Agent.OS)"
      turbo
    path: .turbo
  displayName: 'Turbo cache'

The problem with this approach is that Azure uses cache scopes, which prevent a cache from being reused across branches and pipelines.
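
To see which key actually matched in a given run, the task's cacheHitVar input can help. The following is only a minimal sketch of the same workaround with that input added. The variable name TURBO_CACHE_RESTORED is a made-up placeholder, and the reporting step is purely illustrative. As far as the task documentation describes it, the variable is set to true on an exact key hit, inexact when one of the restoreKeys matched, and false when nothing was restored.

```yaml
steps:
  - task: Cache@2
    displayName: 'Turbo cache'
    inputs:
      key: '"turbo" | "$(Agent.OS)" | "$(Build.SourceBranch)" | "$(Build.SourceVersion)"'
      restoreKeys: |
        turbo | "$(Agent.OS)" | "$(Build.SourceBranch)"
        turbo | "$(Agent.OS)" | refs/heads/main
        turbo | "$(Agent.OS)"
        turbo
      path: .turbo
      # Hypothetical variable name; any valid pipeline variable name should work.
      cacheHitVar: TURBO_CACHE_RESTORED

  # Illustrative step: print the restore result so scope-related misses are visible in the log.
  - script: 'echo "Cache restore result: $(TURBO_CACHE_RESTORED)"'
    displayName: 'Report cache restore result'
    # Runs unless the exact key was hit (which, with the commit in the key, should only happen on re-runs).
    condition: ne(variables.TURBO_CACHE_RESTORED, 'true')
```

If the report shows false on the first run of a new feature branch even though main already has a cache, that would point at the scope isolation described above rather than at the key layout.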