actions / setup-go

Set up your GitHub Actions workflow with a specific version of Go
MIT License

Cache `go install`-ed binaries #483

Open SirSova opened 1 month ago

SirSova commented 1 month ago

Description: Cache go install-ed binaries (optionally, I suppose) along with go mod dependencies. Store the $GOBIN folder, with all binaries installed during workflow execution, in the post step (cache save).

Justification: In my scenario, I use the tparse tool to prettify test results. I can imagine other cases, such as code generator tools: basically, pre-run/post-run scripts. For now, I turned off the cache option of this action and wrote my own using actions/cache, but keeping it consistent across multiple workflows adds significant complexity.

Are you willing to submit a PR? Sure, as soon as the feature is approved.

HarithaVattikuti commented 1 month ago

Hello @SirSova, we appreciate your suggestion for a new feature! We'll make sure to address it when we have the opportunity.

silverwind commented 4 weeks ago

I don't know what GOBIN is, but GOMODCACHE (go env GOMODCACHE, a.k.a. $GOPATH/pkg/mod) is generally cacheable, and caching it would speed up any go run tool@version or go install tool@version invocations, so it would be welcome to include them in the caching.

SirSova commented 3 weeks ago

GOBIN is the directory where go install places binaries; when it is unset, it defaults to $(go env GOPATH)/bin.

If I run go install tool@version, it doesn't add a new dependency to my go.mod (meaning it won't be cached). But inside my GH workflows I do this:

go install github.com/mfridman/tparse@vX.Y.Z
go test -json  ./... | tparse -all

This downloads and builds the tparse tool on each run, which I want to avoid. And since these installations are managed by Go, I believe it's appropriate to handle them in the setup-go action.
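
As a rough sketch of where those binaries end up, a workflow step could resolve the install directory instead of hard-coding it. The step id `gobin` and the output name `dir` are made up for illustration, and the snippet assumes GOPATH holds a single entry:

```yaml
- id: gobin
  run: |
    # go install writes to GOBIN when it is set, otherwise to $GOPATH/bin
    dir="$(go env GOBIN)"
    if [ -z "$dir" ]; then
      dir="$(go env GOPATH)/bin"
    fi
    echo "dir=$dir" >> "$GITHUB_OUTPUT"
```

A later caching step could then reference `${{ steps.gobin.outputs.dir }}` as its path.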

silverwind commented 3 weeks ago

I have been experimenting with https://github.com/actions/cache and got some good performance results by caching GOCACHE (build cache) and GOMODCACHE (module cache), but I see it as somewhat risky because it relies on Go correctly invalidating its cache, and I don't fully trust it yet.

StefMa commented 3 weeks ago

Even though I would love to have this feature built into this action, I honestly think it's not the responsibility of this action to cache such things.

This action is designed to install go, not more. What you're doing with go is not really part of this action. Is it? 🤔

tak11173132 commented 3 weeks ago

I don't know what GOBIN is, but GOMODCACHE (go env GOMODCACHE, a.k.a. $GOPATH/pkg/mod) is generally cacheable, and caching it would speed up any go run tool@version or go install tool@version invocations, so it would be welcome to include them in the caching.

silverwind commented 3 weeks ago

> Even though I would love to have this feature built into this action, I honestly think it's not the responsibility of this action to cache such things.
>
> This action is designed to install go, not more. What you're doing with go is not really part of this action. Is it? 🤔

I tend to agree that caching should not be in scope of setup-* actions (do one thing), but apparently these caching features have been creeping into them, and setup-go is, as far as I'm aware, the only setup action that enables caching by default.

I think the most important thing is that only safe things should be cached and I don't know how safe it is to cache these go directories. There could always be undiscovered cache invalidation bugs in golang.

nferch commented 2 weeks ago

I too would appreciate this feature. Even if it just cached the dependencies for something that was go install'd, that'd speed up my builds quite a bit.

I'm a bit confused/ignorant as to why this isn't happening already. I have multiple workflows that run at the same time on the same commit, is it that the first run that completes doesn't contain the cached modules in GOMODCACHE?

zaibon commented 1 week ago

> I have been experimenting with https://github.com/actions/cache and got some good performance results by caching GOCACHE (build cache) and GOMODCACHE (module cache), but I see it as somewhat risky because it relies on Go correctly invalidating its cache, and I don't fully trust it yet.

@silverwind can you share your solution in the meantime, while this issue is being decided/worked on?

silverwind commented 1 week ago

Here is what I have been experimenting with, and it seemed to work. The cache key is surely too aggressive; GOVERSION and the go.mod hash can likely be removed.

- uses: actions/setup-go@v5
  with:
    go-version-file: go.mod
    check-latest: true
- id: vars
  run: |
    echo "GOCACHE=$(go env GOCACHE)" >> "$GITHUB_OUTPUT"
    echo "GOMODCACHE=$(go env GOMODCACHE)" >> "$GITHUB_OUTPUT"
    echo "GOVERSION=$(go env GOVERSION)" >> "$GITHUB_OUTPUT"
- uses: actions/cache/restore@v4
  with:
    path: |
      ${{ steps.vars.outputs.GOCACHE }}
      ${{ steps.vars.outputs.GOMODCACHE }}
    key: golint-v1-${{ github.job }}-${{ runner.os }}-${{ runner.arch }}-${{ steps.vars.outputs.GOVERSION }}-${{ hashFiles('go.mod') }}
- run: make lint
- uses: actions/cache/save@v4
  with:
    path: |
      ${{ steps.vars.outputs.GOCACHE }}
      ${{ steps.vars.outputs.GOMODCACHE }}
    key: golint-v1-${{ github.job }}-${{ runner.os }}-${{ runner.arch }}-${{ steps.vars.outputs.GOVERSION }}-${{ hashFiles('go.mod') }}

nferch commented 1 week ago

@silverwind thanks for sharing!

That seems to work for me, although the actions/cache/restore generates warnings when it tries to overwrite files that the actions/setup-go action restored from its cache.

I was able to achieve similar results by adding my Makefile to the cache-dependency-path value, which has two effects:

peterbourgon commented 1 week ago

Running

go install github.com/mfridman/tparse@v0.14.0

will create a binary named tparse in $GOPATH/bin (if $GOPATH is set) or in $HOME/go/bin (if $GOPATH is not set). That binary will be tparse at version v0.14.0, but that version is not represented in the binary filename, or in anything else that can be reasonably captured by a cache key. So you can't really cache $GOPATH/bin (or $HOME/go/bin), at least not effectively.

zaibon commented 1 week ago

> will create a binary named tparse in $GOPATH/bin (if $GOPATH is set) or in $HOME/go/bin (if $GOPATH is not set). That binary will be tparse at version v0.14.0, but that version is not represented in the binary filename, or in anything else that can be reasonably captured by a cache key. So you can't really cache $GOPATH/bin (or $HOME/go/bin), at least not effectively.

The trick there is to let the user define a cache key that is generated from the script that contains go install github.com/mfridman/tparse@v0.14.0. This is where the version exists, so it can be used as the cache key.
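
A minimal sketch of that idea with actions/cache, assuming the go install lines live in a file of their own. The path scripts/install-tools.sh and the step id `tools-cache` are hypothetical:

```yaml
# Assumption: scripts/install-tools.sh contains lines such as
#   go install github.com/mfridman/tparse@v0.14.0
- id: tools-cache
  uses: actions/cache@v4
  with:
    path: ~/go/bin
    # Changing a pinned version in the script changes the hash, which yields a new cache entry.
    key: ${{ runner.os }}-${{ runner.arch }}-gobin-${{ hashFiles('scripts/install-tools.sh') }}
- name: Install tools
  # Skip the install entirely when the binaries were restored for this exact key.
  if: steps.tools-cache.outputs.cache-hit != 'true'
  run: ./scripts/install-tools.sh
```

The key only changes when the script changes, so a floating version such as @latest would keep serving whatever binary was cached first.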

SirSova commented 1 week ago

I suppose go install verifies the version of the binary via --version, a checksum, or something stored in the go mod cache. Either way, right now just caching $GOPATH/bin means it won't download and build the binary again.

My working workflow with cached binaries:

      - name: Set up Go 1.22
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
          cache: false # we use our own cache for go modules, since setup-go cache doesn't save `~/go/bin`

      - name: Check out source code
        uses: actions/checkout@v4

      - name: Cache go modules
        uses: actions/cache@v4
        with:
          # ~/go/bin is for `go install`-ed tools
          path: |
            ~/.cache/go-build
            ~/go/pkg/mod
            ~/go/bin
          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-go-

      - name: Run tests
        run: |
          go install github.com/mfridman/tparse@v0.14.0
          go test -json  ./... | tparse -all

Also worth noting: tparse itself isn't a dependency of my code, so go.mod doesn't contain any information about it. I install it manually just before the tests.

P.S. I'd like to replace the Cache go modules step with some additional config for setup-go, such as:

        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
          cache: true
          cache-install: true # <-----

Zxilly commented 6 days ago

go install downloads the code, compiles it, and then puts the binary in a directory. But there is no standard way of describing the version of the program being installed, and each time go install is executed it is a brand new installation, so how should the cache key be designed? Should the cache key be designed to keep track of each go install call? I can almost visualize a big pile of ugly workaround code already.

SirSova commented 5 days ago

The first run of go install does it, but it's 100% not a brand-new installation for the next calls. Just try it out. The 2nd+ calls are almost instantaneous. It must cache at least all dependencies. The workaround described above (GH workflow) worked for me perfectly.

P.S. I also see a significant difference between using "latest" and a specific version.

silverwind commented 5 days ago

> But there is no standard way of describing the version of the program being installed

Since Go 1.16 you can use go install module@version, and since Go 1.17 go run module@version, to specify the version.

Zxilly commented 5 days ago

Yes, you can specify a version, but after the install there is no way to get it back. The second install call is faster because the Go compiler caches the intermediate object files, but the final binary is still built again.
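
For what it's worth, binaries built in module mode embed their module path and version, and go version -m prints them. Below is a minimal sketch of a step that uses this to skip reinstalling when a cached binary already matches. The step name and shell variable names are illustrative, and the awk parsing assumes the usual tab-separated `mod` line in that output:

```yaml
- name: Install tparse unless the cached binary already matches
  run: |
    want=v0.14.0
    bin="$(go env GOPATH)/bin/tparse"
    # The "mod" line of `go version -m` carries the main module's version.
    have="$(go version -m "$bin" 2>/dev/null | awk '$1 == "mod" { print $3 }' || true)"
    if [ "$have" != "$want" ]; then
      go install "github.com/mfridman/tparse@$want"
    fi
```

This only helps with pinned versions; with @latest there is no fixed string to compare against.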