grafana / shared-workflows

A public-facing, centralized place to store reusable workflows used by Grafana Labs.
GNU Affero General Public License v3.0
9 stars 15 forks source link

trigger-argo-workflow fails to run because go is not available #482

Closed mem closed 14 hours ago

mem commented 1 month ago

using this job in a workflow:

  release:
    name: release
    runs-on: ubuntu-amd64
    steps:
      - name: Trigger argo workflow
        uses: grafana/shared-workflows/actions/trigger-argo-workflow@f0dd3480fa3e657d741dd9e8d9b999cfb61fc713
        with:
          namespace: ...
          workflow_template: ...
          extra_args: ...
          parameters: |
            ...

(removed the arguments because they are not relevant for the issue)

Fails with:

2024-10-12T02:41:31.6165536Z ##[group]Run ./_shared-workflows-trigger-argo-workflow/actions/setup-argo
2024-10-12T02:41:31.6165902Z with:
2024-10-12T02:41:31.6166075Z   cache-prefix: argo
2024-10-12T02:41:31.6166366Z   version: 3.5.1
2024-10-12T02:41:31.6166555Z ##[endgroup]
2024-10-12T02:41:31.6193804Z ##[group]Run actions/cache@3624ceb22c1c5a301c8db4169662070a689d9ea8
2024-10-12T02:41:31.6194125Z with:
2024-10-12T02:41:31.6194370Z   path: /opt/actions-runner/_work/sm-alerts/sm-alerts/bin/argo
2024-10-12T02:41:31.6194685Z   key: argo-Linux-X64-3.5.1
2024-10-12T02:41:31.6194907Z   enableCrossOsArchive: false
2024-10-12T02:41:31.6195132Z   fail-on-cache-miss: false
2024-10-12T02:41:31.6195350Z   lookup-only: false
2024-10-12T02:41:31.6195534Z   save-always: false
2024-10-12T02:41:31.6195713Z ##[endgroup]
2024-10-12T02:41:31.8386196Z Cache not found for input keys: argo-Linux-X64-3.5.1
2024-10-12T02:41:31.8436013Z ##[group]Run echo "OS=$(go env GOOS)" >> "$GITHUB_ENV"
2024-10-12T02:41:31.8436384Z echo "OS=$(go env GOOS)" >> "$GITHUB_ENV"               <--------------------------------------
2024-10-12T02:41:31.8436724Z echo "ARCH=$(go env GOARCH)" >> "$GITHUB_ENV"
2024-10-12T02:41:31.8451400Z shell: /usr/bin/sh -e {0}
2024-10-12T02:41:31.8451651Z ##[endgroup]
2024-10-12T02:41:31.8477034Z /opt/actions-runner/_work/_temp/505dabba-dce4-4b60-a243-e887d170d08f.sh: 1: go: not found <-----------------
2024-10-12T02:41:31.8478851Z /opt/actions-runner/_work/_temp/505dabba-dce4-4b60-a243-e887d170d08f.sh: 2: go: not found

You can see in the lines I'm pointing to that it's trying to run go via $(go env GOOS) and failing with go: not found.

This is likely because the action is calling setup-argo before calling setup-go.

Since the only reason why the action needs to install Go is because it needs the GOOS and GOARCH values, this should be a straightforward fix.

zerok commented 2 weeks ago

Also running into this. Will take a look 🙂

zerok commented 6 days ago

This should be fixed once the new images are available :) I'll keep this issue updated.

mem commented 3 days ago

This should be fixed once the new images are available :) I'll keep this issue updated.

Unless I'm missing something, new images are not actually a fix.

If I'm following correctly, the assumption is that since the new images have the go binary available in them, this shouldn't be an issue anymore? Is that right?

That is making the assumption that the action is running directly in that filesystem.

The bit I'm not understanding is why fixing the obvious bug is such a problem. Can anyone clarify?

iainlane commented 14 hours ago

We're interested in optimising for known runners - GitHub's ones or Grafana Labs ones. Both of these guarantee that go is available (notwithstanding the - now fixed - issue where it went missing on one of our runners). Since we don't care about which version of go it is, that is enough for us. In those environments, it's unnecessary to spend time (could be money) downloading a new go binary.

I would accept a change, if you want to provide it @mem, to first check for go in the PATH and run setup-go only if it's not present. But in the absence of that, this will be a "won't do" from us. I hope this makes sense, even if you don't agree.

mem commented 13 hours ago

I would accept a change, if you want to provide it @mem, to first check for go in the PATH and run setup-go only if it's not present. But in the absence of that, this will be a "won't do" from us. I hope this makes sense, even if you don't agree.

No, @iainlane, it doesn't make sense.

This is the existing code in the action:

    - name: Setup argo
      uses: ./_shared-workflows-trigger-argo-workflow/actions/setup-argo

    - name: Setup go
      uses: actions/setup-go@41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed # v5.1.0
      with:
        check-latest: true
        cache-dependency-path: |
          actions/trigger-argo-workflow/go.sum
        go-version-file: "_shared-workflows-trigger-argo-workflow/actions/trigger-argo-workflow/go.mod"

This is unconditionally running thru the setup-go action, presumably because the core logic of the action is written in Go:

    - name: Run
      id: run
      shell: bash
      run: |
        ...
        go run github.com/grafana/shared-workflows/actions/trigger-argo-workflow/cmd/trigger-argo-workflow \
        ...

By your reasoning, that Setup go step shouldn't be there because there's an assumption that the go command is already available in the environment where this is running.

The change that I did submit in the past and that got nowhere was changing the above code to this:

    - name: Setup go
      uses: actions/setup-go@41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed # v5.1.0
      with:
        check-latest: true
        cache-dependency-path: |
          actions/trigger-argo-workflow/go.sum
        go-version-file: "_shared-workflows-trigger-argo-workflow/actions/trigger-argo-workflow/go.mod"

    - name: Setup argo
      uses: ./_shared-workflows-trigger-argo-workflow/actions/setup-argo

The reasoning was that ./_shared-workflows-trigger-argo-workflow/actions/setup-argo does make use of the go command just like the run step does. In other words, if it makes sense to Setup go because the Run step uses go, it also makes sense to Setup go because Setup argo uses it.

I will not push back against the assumption that these actions are used in a specific environment (i.e. they are not meant to be used outside of that) but if that's an assumption that you are making, it should say so on the label. Please note that this implies that they are not meant to be used in workflows running in containers (because those containers might break the implicit assumptions in these actions).

iainlane commented 12 hours ago

You have missed that there are two actions involved here: setup-argo and trigger-argo-workflow.

The former is the one that makes the assumption about go being installed (it also assumes that a shell, gunzip, chmod and doubtless other tools are installed; perhaps we should figure out how to fetch those too), and it is the one for which Go is an implementation detail.

The latter does not make that assumption, because the tool itself is written in Go, is built during the action's execution, and therefore needs to install the toolchain.

mem commented 11 hours ago

So you are saying that setup-argo can assume go is installed and it's free to use it but trigger-argo-workflow should not assume it's installed and therefore it should install it?

To me that's an argument for setup-argo to install go, too, because go is not part of anything that can be reasonably assumed to be available (like gzip, chmod, etc, which are part of the base system of any reasonable linux distribution). You are saying that the environment where this action assumes to be running in does have that available and therefore it does not need to do that.

At that point, your argument becomes that trigger-argo-workflow requires a specific Go version. But the counter to that is that the assumed-to-be-installed go toolchain will take care of it. It's not a mechanism that I particularly like, but it's something that you can assume to be available (because Go 1.21, where this was introduced, is no longer receiving support and therefore you shouldn't be assuming that you have to operate with it).

What I'm trying to highlight here is that the assumptions you are making are not simple ones, as it might appear as first glance. You could have avoided the whole thing by adding a simple comment to the code: "I'm assuming that..."

iainlane commented 10 hours ago

Thanks for the input.

I'm locking this issue now. I don't think there's anything further to be gained from the discussion.