Rerun pipeline automatically if it fails

afeefghannam89 commented 1 year ago

We should look into automatically re-running full stack pipelines that do not finish successfully. Being able to set the number of retries is also important.

widhalmt commented 1 year ago

Honestly, I didn't know that was possible. It would be a great addition nowadays. Especially if we could have some time between failing and rerunning.

Donien commented 1 year ago

From what I could find (and according to this blog post), our only options seem to be:

- Using the same steps within a job multiple times, running the duplicates only on failure of the original (no timeout possible apart from a sleep command)
- Using e.g. nick-fields/retry, which only seems capable of rerunning commands (run: some-command) but not steps that rely on uses:

The latter offers timeout_minutes and max_attempts, which would pretty much achieve what we are trying to do.

So in .github/workflows/test_full_stack.yml we could do something like this:

...
      - name: Test with molecule
        uses: nick-fields/retry@v2
        with:
          timeout_minutes: 10
          max_attempts: 3
          command: molecule test -s ${{ matrix.scenario }}
        env:
          MOLECULE_DISTRO: ${{ matrix.distro }}
          PY_COLORS: '1' 
          ANSIBLE_FORCE_COLOR: '1' 
          ELASTIC_RELEASE: ${{ matrix.release }}
...
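
To get some time between a failed attempt and the rerun, the same step could probably also set a wait. This is only a sketch: it assumes the action's retry_wait_seconds input (as I read its README), and the values 10, 3 and 60 are placeholders, not tested against our pipeline.

```yaml
      - name: Test with molecule
        uses: nick-fields/retry@v2
        with:
          # Per-attempt timeout (placeholder value)
          timeout_minutes: 10
          # Total number of attempts, including the first one (placeholder value)
          max_attempts: 3
          # Assumed input: seconds to wait before starting the next attempt
          retry_wait_seconds: 60
          command: molecule test -s ${{ matrix.scenario }}
        env:
          MOLECULE_DISTRO: ${{ matrix.distro }}
          PY_COLORS: '1'
          ANSIBLE_FORCE_COLOR: '1'
          ELASTIC_RELEASE: ${{ matrix.release }}
```

Note that timeout_minutes limits how long a single attempt may run, so it would have to be larger than the longest molecule scenario.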

Going for the first option, it could look like this:

...
      - name: Test with molecule
        id: firsttry
        continue-on-error: true
        run: |
          molecule test -s ${{ matrix.scenario }}
        env:
          MOLECULE_DISTRO: ${{ matrix.distro }}
          PY_COLORS: '1'
          ANSIBLE_FORCE_COLOR: '1'
          ELASTIC_RELEASE: ${{ matrix.release }}

      - name: Test with molecule
        id: secondtry
        if: steps.firsttry.outcome == 'failure'
        run: |
          sleep 60
          molecule test -s ${{ matrix.scenario }}
        env:
          MOLECULE_DISTRO: ${{ matrix.distro }}
          PY_COLORS: '1'
          ANSIBLE_FORCE_COLOR: '1'
          ELASTIC_RELEASE: ${{ matrix.release }}
...

Depending on the number of retries we want, we'd need to add that many steps to the job.
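
A possible way around duplicating the step would be to loop inside a single run: step. This is an untested sketch that assumes the default bash shell on the runner; the attempt count (3) and the 60 second delay are placeholders.

```yaml
      - name: Test with molecule (up to 3 attempts)
        run: |
          # Retry loop: stop at the first successful attempt
          for attempt in 1 2 3; do
            if molecule test -s ${{ matrix.scenario }}; then
              exit 0
            fi
            echo "Attempt ${attempt} failed"
            # Wait before the next attempt, but not after the last one
            if [ "${attempt}" -lt 3 ]; then
              sleep 60
            fi
          done
          # All attempts failed, so fail the step
          exit 1
        env:
          MOLECULE_DISTRO: ${{ matrix.distro }}
          PY_COLORS: '1'
          ANSIBLE_FORCE_COLOR: '1'
          ELASTIC_RELEASE: ${{ matrix.release }}
```

The downside is that all attempts end up in one step's log, so it is less obvious in the UI which attempt failed.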

Rerunning a whole workflow does not seem possible at the moment.

NETWAYS / ansible-collection-elasticstack

Rerun pipeline automatically if it fails #195