elastic / cloudbeat

Analyzing Cloud Security Posture

[CI] Updatecli seems flaky #2158

Open moukoublen opened 5 months ago

moukoublen commented 5 months ago

Describe the bug: UpdateCLI workflows seem flaky and fail most of the time they run.

We should investigate why and how to improve that.

~Optionally, we could consider removing the overlap between UpdateCLI (Golang Mod job) and dependabot.~

olblak commented 4 months ago

I came across this issue by accident and couldn't resist having a look. In this run, https://github.com/elastic/cloudbeat/actions/runs/9058657978/job/24884700647, I noticed that the failing pipeline is https://github.com/elastic/cloudbeat/blob/2d3a11e484e76d9b9e525b88c2ba7e9c49a49377/.ci/updatecli/updatecli.d/update-hermit.yml#L40

The default behavior of the shell plugin is to flag a target as "changed" if anything was printed to the console output. In this pipeline the target is always flagged as "changed", which triggers a git commit and reports an issue when no files need to be committed.

So the target


targets:
  hermit:
    name: 'Update hermit and pre-commit packages'
    scmid: default
    kind: shell
    spec:
      command: .ci/updatecli/scripts/update-hermit.sh
      environments:
        - name: PATH
        - name: HOME

could probably be improved to

targets:
  hermit:
    name: 'Update hermit and pre-commit packages'
    scmid: default
    kind: shell
    spec:
      changedif:
        kind: file/checksum
        spec:
          files:
            - "your file to monitor"  
      command: 'bin/hermit install'
      environments:
        - name: PATH
        - name: HOME

Note that instead of using "file/checksum" it is also possible to use "exitcode"

with something like

targets:
  hermit:
    name: 'Update hermit and pre-commit packages'
    scmid: default
    kind: shell
    spec:
      changedif:
        kind: exitcode
        spec:
          failure: 1
          success: 0
          warning: 2
      command: 'bin/hermit install'
      environments:
        - name: PATH
        - name: HOME

I am still behind on documentation and still need to document those features on updatecli.io.

orestisfl commented 4 months ago

Thanks @olblak, that's good to know. Ending our scripts with git diff --exit-code and using

      changedif:
        kind: exitcode
        spec:
          failure: 0
          success: 1
          warning: 2

should probably work well enough for us. Just to clarify, failure here just means "nothing changed", not "pipeline will fail", right?
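
For reference, a minimal sketch of how git diff --exit-code reports changes, run in a throwaway repository. The helper name diff_status and the file name tracked.txt are hypothetical and only serve the illustration; the point is that git exits with 0 when tracked files are unmodified and 1 when they differ.

```shell
#!/bin/sh
# Sketch: observe the exit status of `git diff --exit-code` in a throwaway repo.
# diff_status and tracked.txt are illustrative names, not part of cloudbeat.
diff_status() {
  # Print the exit status of `git diff` for the repository at $1.
  # --quiet implies --exit-code and suppresses the diff output itself.
  rc=0
  git -C "$1" diff --exit-code --quiet || rc=$?
  echo "$rc"
}

repo=$(mktemp -d)
git -C "$repo" init --quiet
echo "v1" > "$repo/tracked.txt"
git -C "$repo" add tracked.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
  -c commit.gpgsign=false commit --quiet -m "initial"

clean=$(diff_status "$repo")   # tracked file matches the index
echo "v2" > "$repo/tracked.txt"
dirty=$(diff_status "$repo")   # tracked file was modified
echo "clean=$clean dirty=$dirty"
rm -rf "$repo"
```

Note that git diff does not report untracked files, so a script that only creates new files would still end with exit code 0 unless the files are staged and compared with git diff --cached.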

olblak commented 4 months ago

Just to clarify, failure here just means "nothing changed" not "pipeline will fail", right?

Reading your question makes me realize that it's not clear enough. The "exitcode" interpretation is from an Updatecli pipeline point of view, not a POSIX one.

"Failure" means something went wrong during the shell execution, such as running out of disk space.
"Warning" means something changed during the shell execution, such as a file being modified.
"Success" means all is good and nothing changed.

We have an open issue to clarify the different exit codes, see https://github.com/updatecli/updatecli/issues/233
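
A minimal sketch of a wrapper that follows this interpretation, assuming the exitcode mapping from the earlier example (success: 0, warning: 2, failure: 1). The function name run_and_classify is hypothetical, and none of this is taken from the Updatecli documentation; it only shows how a script could translate "command failed", "files changed" and "nothing changed" into those three codes.

```shell
#!/bin/sh
# Sketch: run an update command inside a git work tree and return a code
# matching the assumed mapping:
#   0 = success (nothing changed), 2 = warning (files changed), 1 = failure.
# run_and_classify is an illustrative name, not an Updatecli feature.
run_and_classify() {
  # "$@" is the update command to run.
  "$@" || return 1              # the command itself failed

  rc=0
  git diff --quiet || rc=$?     # 0 = no diff, 1 = diff, anything else = git error
  case "$rc" in
    0) return 0 ;;              # no tracked file was modified
    1) return 2 ;;              # at least one tracked file changed
    *) return 1 ;;              # git could not compute the diff
  esac
}
```

A script could then end with something like run_and_classify bin/hermit install followed by exit $?, so that the target is reported as changed only when tracked files were actually modified.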

orestisfl commented 4 months ago

So, for our use case can we not do

      changedif:
        kind: exitcode
        spec:
          failure: 1
          success: 0
          warning: 2

to mean "files changed if script's exit code is 0, nothing changed if exit code is 1"?

I am referring to this part of your comment above:

The default behavior of the shell plugin is to flag a target as "changed" if anything was printed to the console output. In this pipeline the target is always flagged as "changed", which triggers a git commit and reports an issue when no files need to be committed.