checkly / checkly-cli

TS/JS native Monitoring as Code workflow
Apache License 2.0

feat: add support for test retries [sc-20570] #952

Closed clample closed 3 months ago

clample commented 4 months ago

I hereby confirm that I followed the code guidelines found at engineering guidelines

Affected Components

Notes for the Reviewer

This PR adds support for retrying failed tests with npx checkly test and npx checkly trigger. Users can configure retries by either passing the --retries=<numRetries> flag or by setting retries in their Checkly config file.
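As a rough illustration of how the two sources could be combined, here is a minimal sketch. The `CliFlags` and `ConfigFile` shapes and the `resolveRetries` helper are hypothetical and not taken from the CLI source. The precedence (flag over config file) and the default of 0 are assumptions.

```typescript
// Hypothetical shapes, for illustration only; not the CLI's real types.
interface CliFlags { retries?: number }
interface ConfigFile { retries?: number }

// Assumption: --retries overrides the config file, and the default is 0 (no retries).
function resolveRetries (flags: CliFlags, config: ConfigFile): number {
  const retries = flags.retries ?? config.retries ?? 0
  if (!Number.isInteger(retries) || retries < 0) {
    throw new Error(`retries must be a non-negative integer, got ${retries}`)
  }
  return retries
}
```

Under these assumptions, `resolveRetries({ retries: 2 }, { retries: 1 })` returns 2, and `resolveRetries({}, {})` falls back to 0.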

Check state changes

Previously, the CLI tracked the state of a check run by its CheckRunId. When MQTT/WebSocket updates were received, the CLI looked up which check they belonged to based on the CheckRunId.

Now that we have retries, the CheckRunId can be different with each retry of a check. To track the check state, this PR switches the CLI to SequenceId, which is stable across all retries. CheckRunId can then be used to track the progress of a particular run.

In abstract-check-runner.ts, this means the PR now looks up the sequenceId for incoming MQTT messages and uses it to track the check state. The abstract-list.ts reporter is also updated to track check state using sequenceId.
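The idea can be sketched roughly as follows. This is not the code from the PR: `CheckTracker`, `TrackedCheck`, and the `RunMessage` shape are hypothetical names, and the real MQTT payload may look different. The sketch only shows a check being keyed by sequenceId while each attempt gets its own checkRunId.

```typescript
type SequenceId = string
type CheckRunId = string

// Hypothetical message shape; the real MQTT payload is not shown in this PR description.
interface RunMessage { sequenceId: SequenceId, checkRunId: CheckRunId }

interface TrackedCheck {
  name: string
  // One entry per attempt, in the order the attempts were first seen.
  checkRunIds: CheckRunId[]
}

class CheckTracker {
  private checks = new Map<SequenceId, TrackedCheck>()

  register (sequenceId: SequenceId, name: string): void {
    this.checks.set(sequenceId, { name, checkRunIds: [] })
  }

  // Looks up the check by its stable sequenceId and records any new checkRunId.
  onMessage (msg: RunMessage): TrackedCheck | undefined {
    const check = this.checks.get(msg.sequenceId)
    if (!check) return undefined
    if (!check.checkRunIds.includes(msg.checkRunId)) {
      check.checkRunIds.push(msg.checkRunId)
    }
    return check
  }
}
```

With this layout, two messages with the same sequenceId but different checkRunIds resolve to the same tracked check, which is what lets the state survive a retry.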

This PR also introduces a new check state CheckStatus.RETRIED for indicating that a check is being retried. When a check is retried, we leave it in the CheckStatus.RETRIED state rather than switching it back to CheckStatus.SCHEDULING/CheckStatus.RUNNING. The lifecycle of a check that's retried will then look like: SCHEDULING -> RUNNING -> RETRIED -> FAILED/SUCCESSFUL.
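The lifecycle can be written down as a small transition table. This is a sketch rather than the PR's implementation: the `allowed` table and `canTransition` helper are hypothetical, the direct RUNNING -> FAILED/SUCCESSFUL path for checks that are never retried is an assumption, and so is RETRIED -> RETRIED for checks retried more than once.

```typescript
enum CheckStatus {
  SCHEDULING = 'SCHEDULING',
  RUNNING = 'RUNNING',
  RETRIED = 'RETRIED',
  FAILED = 'FAILED',
  SUCCESSFUL = 'SUCCESSFUL',
}

// Hypothetical transition table derived from the lifecycle described above.
const allowed: Record<CheckStatus, CheckStatus[]> = {
  [CheckStatus.SCHEDULING]: [CheckStatus.RUNNING],
  // Assumption: a check that is never retried goes straight to a final state.
  [CheckStatus.RUNNING]: [CheckStatus.RETRIED, CheckStatus.FAILED, CheckStatus.SUCCESSFUL],
  // A retried check stays in RETRIED until a final result arrives; it never
  // goes back to SCHEDULING or RUNNING. RETRIED -> RETRIED is an assumption.
  [CheckStatus.RETRIED]: [CheckStatus.RETRIED, CheckStatus.FAILED, CheckStatus.SUCCESSFUL],
  [CheckStatus.FAILED]: [],
  [CheckStatus.SUCCESSFUL]: [],
}

function canTransition (from: CheckStatus, to: CheckStatus): boolean {
  return allowed[from].includes(to)
}
```

The property that matters here is that `canTransition(CheckStatus.RETRIED, CheckStatus.RUNNING)` is false, so reporters can treat RETRIED as "still in progress, with at least one failed attempt".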

Reporters

GitHub reporter

No changes are made to the GitHub reporter. It will simply show the same pass/fail data that it normally does. We could include the number of retries, but there isn't much horizontal space in the GitHub UI for another column.

Dot reporter

No changes. Will just indicate passed/failed after all retries are finished.

JSON reporter

Includes the results for the last check run. Will also include the number of retries.

CI + List Reporter

Prints info from any retry attempts and shows retrying checks in the summary, following the Notion doc spec. Will add a video to the PR.

Running

The retry support depends on https://github.com/checkly/checkly-runners/pull/1877. For now, it's only possible to run locally. When running, you also need to manually set CHECKLY_CLI_VERSION=4.8.0 (the planned next version), since the backend relies on this to use the new MQTT topic format.

🟢 It's expected that the tests are failing until the runners PR is released