actions / setup-go

Set up your GitHub Actions workflow with a specific version of Go
MIT License
1.4k stars 516 forks source link

feat: add cache prefix #358

Open ernado opened 1 year ago

ernado commented 1 year ago

Description: Add cache key prefix, like that:

- name: Install Go
  uses: actions/setup-go@v4
  with:
    go-version: '1.20'
    cache: true
    cache-prefix: 'test-${{ matrix.arch }}-${{ matrix.flags }}-'

Justification: Cache from multiple workflows can be mixed. For example, there can be a workflow that only downloads dependencies. This will produce cache without .cache/go-build, which will be used by another workflows, effectively disabling build cache.

This was the cause in #357

Are you willing to submit a PR?

I don't know typescript well, but I'm learning.

Workaround

Use actions/cache@v3 manually, something like that:

name: manual

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      cache_name: manual

    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-go@v4
        with:
          go-version: "1.20.x"
          cache: false

      - name: Get Go environment
        run: |
          echo "cache=$(go env GOCACHE)" >> $GITHUB_ENV
          echo "modcache=$(go env GOMODCACHE)" >> $GITHUB_ENV

      - name: Set up cache
        uses: actions/cache@v3
        with:
          path: |
            ${{ env.cache }}
            ${{ env.modcache }}
          key: ${{ env.cache_name }}-${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}-${{ hashFiles('**/go.mod') }}
          restore-keys: |
            ${{ env.cache_name }}-${{ runner.os }}-go-

      - run: go env GOCACHE
      - run: go build -v

Cons Cache configuration options can possibly lead to feature bloat, so feel free to decline this feature proposal.

marko-zivic-93 commented 1 year ago

Hello @ernado Thank you for sharing this feature request with us. We will take a look at it, and come back to you as soon as we have some info 🙂

erezrokah commented 1 year ago

Hi 👋 Are you open for a contribution for this issue? We're experiencing it as well and using the suggested workaround (using actions/cache).

erezrokah commented 1 year ago

Opened https://github.com/actions/setup-go/pull/366, feedback appreciated

lzap commented 1 year ago

This could help to workaround the problem we have on our project.

andig commented 1 year ago

I've recently noticed a fair bit of slow-down in our parallel builds from 2:30 to 3:30, see https://github.com/evcc-io/evcc/actions/runs/5124962244 for an example. It shows that both build and test steps- despite stable dependencies- will download a fair bit of packages that should already be there:

fatal: No names found, cannot describe anything.
Version: 8acee30 8acee30 2023-05-30_18:24:01
CGO_ENABLED=0 go build -v -tags=release -ldflags='-X github.com/evcc-io/evcc/server.Version=8acee30 -X github.com/evcc-io/evcc/server.Commit=8acee30 -s -w'
go: downloading github.com/Masterminds/sprig/v3 v3.2.3
go: downloading github.com/andig/gosunspec v0.0.0-20211108155140-af2e73b86e71
go: downloading github.com/bogosj/tesla v1.1.1-0.20230430222423-0d6e63ee90c9
...

It seems likely that this might be a conflict with the parallel clean job that just installs the tools and does some basic tests:

Run make install
fatal: No names found, cannot describe anything.
go install $(go list -f '{{join .Imports " "}}' tools.go)
go: downloading github.com/golang/mock v1.6.0
go: downloading github.com/dmarkham/enumer v1.5.8
go: downloading github.com/pascaldekloe/name v1.0.1
go: downloading golang.org/x/tools v0.9.1
go: downloading golang.org/x/mod v0.10.0
go: downloading golang.org/x/sys v0.8.0

It seems that the situation described in this issue might be quite common.

andig commented 1 year ago

And here is the result applying this issue's PR: https://github.com/evcc-io/evcc/actions/runs/5126503582. Build time reduced from 3min to 1min at the cost of 3 separate caches. Worth the efforts.

Any chance to get this reviewed/merged?

dmitry-shibanov commented 1 year ago

Hello everyone. We would not like to implement such kind of logic, because the initial caching logic for setup-go should cover the most often use cases. If you need more flexibility with cache, we would like to recommend actions/cache. It's relevant to setup-go ADR.

For now I'm going to close the issue.

erezrokah commented 1 year ago

Hi @dmitry-shibanov and thanks for the comment. Are you open to adding docs for the cases the caching logic in setup-go doesn't cover? It does seem like a gotcha that should be documented (based on the backlinks to this issue) as the caching is now enabled by default.

Something in the lines of:

Please note that if you use setup-go from multiple jobs (e.g. build and test) you'll need to disable caching and use actions/cache, otherwise caching will not work as expected

ernado commented 1 year ago

Agree with @erezrokah, adding this to docs (and probably providing a reference workaround + link this issue) would be very helpful.

Also agree with @dmitry-shibanov that this feature should not be implemented, because it is hard to cover all use-cases, so just let user to handle caches manually (like we did in our projects).

andig commented 1 year ago

Plus this needs to cover which folders to cache for Go best practices.

We would not like to implement such kind of logic, because the initial caching logic for setup-go should cover the most often use cases.

This is really unfortunate. The additional logic is minimal and the use is large. It seems quite wide-spread to have build, test, lint jobs parallellized and in all these instances you'll need to separate cache keys.

because it is hard to cover all use-cases, so just let user to handle caches manually (like we did in our projects).

I disagree. The case covered is in the users hands by chosing the right key.

erezrokah commented 1 year ago

Plus this needs to cover which folders to cache for Go best practices.

And also explain to use the Go version from setup-go outputs in the cache key

andig commented 1 year ago

@erezrokah would you potentially consider rebasing your PR and keeping your fork going forward?

erezrokah commented 1 year ago

@erezrokah would you potentially consider rebasing your PR and keeping your fork going forward?

I'm not sure that would the best solution. If there are docs that users can copy paste and get the same functionality then my PR would be mostly nice to have I think, and won't justify maintaining a fork

andig commented 1 year ago

@dsame thank you for reopening as investigation. Is there anything we could contribute?

lucacome commented 1 year ago

I removed actions/cache from my workflows when the caching was first implemented and started noticing that it didn't seem to use the cache anymore https://github.com/actions/setup-go/issues/130#issuecomment-1207103093

We don't pursue the goal to provide wide customization of caching in scope of actions/setup-go action. The purpose of this integration is covering ~90% of basic use-cases.

I'm not sure how you arrived at that "~90% of basic use-cases" but it seems unlikely to me that 90% of projects (if that's what it means) don't have concurrent jobs running in a pipeline. And since caching now is enabled by default it means it won't work by default. I don't think we're asking for fine-tuning of the caching mechanism here, but just for a way to make the caching work in even more cases.

Whatever that 10% is, still doesn't seem an irrelevant number especially when it can be fixed by a 2-line PR

dsame commented 1 year ago

Hello @andig, @ernado

The issue relates to the feature request for optional build directory caching https://github.com/actions/setup-actions-team/issues/26 and should be resolved with it.

The advantage of (optional) skipping build directory at all seems to be more preferable than having key-prefix because it allows to have multiple build steps in the one job instead of multiple jobs installing the same go version just for sake of getting different caches having the almost same content due to the common dependencies.

Although there might be a case where caching build directory brings significant improving over skipping it. At the moment there's no any discovered, but it would be useful know different opinions to make a final decision

Adding the cache prefix brings significant drawback of using log of storage with multiple caches and hitting the limit very soon, making the caching impossible at all.

Thanks in advance for sharing your thoughts.

andig commented 1 year ago

Seems I can't follow the link. Would like to read up on build directory caching to understand the proposal better.

candiduslynx commented 1 year ago

@dsame

Although there might be a case where caching build directory brings significant improving over skipping it. At the moment there's no any discovered, but it would be useful know different opinions to make a final decision

We want to be able to cache test results as in some cases clean testing would run for ~10 minutes, whilst the cached results will be available in seconds.

andig commented 1 year ago

I'm still failing to understand the proposal of "optional build directory caching". The link in https://github.com/actions/setup-go/issues/358#issuecomment-1674545346 does not work.

We want to be able to cache test results as in some cases clean testing would run for ~10 minutes

We're also caching:

We also have different build steps like generating assets, updating docs and checking repo is in "clean" state that are all parallelised. Some of these steps use Go-ased tools but would not necessarily download modules or fully populate the build cache. Using any of these steps for heavier tasks like build or test has ended up with cache hits on half-populated caches.

erezrokah commented 1 year ago

multiple build steps in the one job instead of multiple jobs

Thanks for sharing @dsame ❤️ Can you maybe share a before/after workflow example with the optional build step so we can understand the proposal better?

Won't that mean that the different steps run in serial as opposed to parallel build jobs? A cache prefix tries to solve using setup-go in parallel build jobs with caching enabled. I'm not sure how putting all steps in a single job can allow running those in parallel.

Please note that at least in our case we also have completely different workflows (not only jobs) that run in parallel, and each workflow needs a distinct cache (for example, one used for testing invoked on pull request triggers, and another used for releasing artifacts invoked on tag push triggers).

Adding the cache prefix brings significant drawback of using log of storage with multiple caches and hitting the limit very soon, making the caching impossible at all.

In our case we treat managing cache limits as a distinct problem to solve (especially as we have a monorepo with many components, each one with a different cache), unrelated to the requirement to have separate caches for jobs.

There are official guides (see https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries) on how to manage cache limits, and it might be better to leave the decision on how to manage cache limits to users.

Regardless of cache limits and this issue, I would prefer to have a cache miss than a cache hit on a cache created from an unrelated job, as at least a cache miss will create consistent results.

dsame commented 1 year ago

@andig the complex cases like the one you have described should use @actions/actions-cache , see the samples as starting point

andig commented 11 months ago

@dsame after looking at the samples I still see this an ill-advised tradeoff. We require people to think about cache keys, restore keys, pathes, hashfiles etc instead of providing them the absolutely simple and straight-forward capability of separating their caches per prefix which is a single-line PR.

More so, with any change in actions/cache or the way how Go dependencies should be cached, we would require everyone to update their manual cache settings once more. The fact that even documenting the advanced cases has been ongoing for six months now indicates that this is not straight-forward to get right.

I can only suggest to rethink this decision. I do not object to adding additional docs for advanced use cases, but it would be super handy if usage for semi-simple cases could be simplified.

Update

Let me add one example. https://github.com/actions/setup-go/commit/a4d10f0ea40982c8b7151dd9d5434b1a090318cc added the OS to the cache key. Everyone using actions/cache would need to follow this best practice manually. It should also make the docs for the advanced use cases. Adding the prefix would implement this for everyone at zero cost.

andig commented 11 months ago

Friendly ping @dsame would appreciate a comment if these concerns are valid?

jsirianni commented 11 months ago

I think it would be great to have a little more control over the setup-go cache options. I was surprised to see that several of my branches share the same cache, for example.

macie commented 10 months ago

I have similar issue as @jsirianni - different branches share the same cache. When my feature branch starts using first external module, go.sum file was created and cache misses starts to appear on other branches (because they don't have external dependencies/go.sum).

From my (user of setup-go) perspective, cache integration covers only one case: trunk-based development with sequential (non-parallel) jobs. But it isn't clear from the beginning.

GitHub encourages working on feature branches. Even in this repo nearly all pull requests come from feature branches. So I share @lucacome opinion, that current defaults may be unsuitable for far more than 10% users of setup-go.

relaxdiego commented 7 months ago

For those that don't want to deal with the unnecessary complexity of actions/cache, you can use this as a workaround:

  job_a:
    steps:
      - uses: actions/setup-go@v5
        with:
          cache-dependency-path: |
            "${{ env.WORKING_DIR }}/go.sum"
            .github/.cache/buster-for-job-a

  job_b:
    steps:
      - uses: actions/setup-go@v5
        with:
          cache-dependency-path: |
            "${{ env.WORKING_DIR }}/go.sum"
            .github/.cache/buster-for-job-b

Notice the additional files in cache-dependency-path named .github/.cache/buster-for-job-a and .github/.cache/buster-for-job-b. Make sure those files have totally different contents. They can contain anything as long as they're different. The effect of this is that each job's cache key will have a unique hash suffix.

Another "feature" that this workaround provides is that you can force a rebuild of the job's cache by modifying its respective "cache buster" file.

andig commented 6 months ago

Worth mentioning that https://github.com/actions/setup-go/pull/416 is still open almost 6 months later. Splitting caches by build remains a complex task.