bitrise-steplib / steps-virtual-device-testing-for-android

MIT License

if you retry flaky tests and the last attempt succeeded, the step should succeed #95

Closed benbitrise closed 4 months ago

benbitrise commented 4 months ago

Checklist

Version

Requires a MAJOR/MINOR/PATCH version update

Context

Changes

My first ever Go code, so be gentle. Based on feedback from a customer (and I agree), if you have flaky retries set and the final attempt succeeded, the step should be marked as a success. I also introduced some unit tests into the project.
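To make the intended rule concrete, here is a minimal sketch of "the last attempt per device decides the result". It is not the code from this PR: the `attempt` type and the `stepSucceeded` function are hypothetical names used only for illustration, and the sketch assumes attempts are listed in chronological order.

```go
package main

import "fmt"

// attempt is a hypothetical record of one full test run on one device.
// It is not a type from the step's codebase.
type attempt struct {
	Device string
	Passed bool
}

// stepSucceeded reports whether the final attempt on every device passed.
// Assumption: attempts are ordered chronologically, so a later entry for
// the same device overwrites an earlier one.
func stepSucceeded(attempts []attempt) bool {
	last := map[string]bool{}
	for _, a := range attempts {
		last[a.Device] = a.Passed
	}
	if len(last) == 0 {
		return false // nothing ran, so there is nothing to call a success
	}
	for _, passed := range last {
		if !passed {
			return false
		}
	}
	return true
}

func main() {
	// One device fails first and passes on retry, the other passes right away.
	passAfterRetry := []attempt{
		{Device: "Pixel", Passed: false},
		{Device: "Nexus", Passed: true},
		{Device: "Pixel", Passed: true},
	}
	// One device still fails on its final attempt.
	failAfterRetry := []attempt{
		{Device: "Pixel", Passed: false},
		{Device: "Nexus", Passed: true},
		{Device: "Pixel", Passed: false},
	}
	fmt.Println(stepSucceeded(passAfterRetry)) // true
	fmt.Println(stepSucceeded(failAfterRetry)) // false
}
```

Keying on the device keeps the multi-device case honest: a retry that rescues one device cannot hide a final failure on another.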

Investigation details

Decisions

benbitrise commented 4 months ago

Manual tests -

Multiple Devices, Retries Enabled

benbitrise commented 4 months ago

E2E Tests for latest iteration:

Multiple Devices, Retries Enabled

  • both devices pass without retry
  • both devices pass after retry
  • both devices fail after retry
  • both devices pass, one retries
  • one device fails, the other succeeds

Only One Device, Retries Enabled

  • failure
  • success on first attempt
  • success on retry

Only One Device, No Retries

  • success
  • failure

Multiple Devices, No Retries

  • both succeed
  • one succeeds, one fails
  • both fail

benbitrise commented 4 months ago

@tothszabi - I can't reply to a comment you made due to changes I pushed up, so starting a new thread

This makes sense if the assumption is true. I am not fully aware of what you can do on Android, but because Apple and Google copy each other's ideas, I want to highlight that Xcode has a test running mode called "Up Until Maximum Repetitions". In this mode it retries the test as many times as the user requested, even the successful ones, and the result of the last execution counts. Can you double check that this is not possible here?

These are different levels of abstraction. I ran some tests in FTL with an iOS project to confirm.

I ran a flaky test with retries on. The test failed and then passed within the same FTL Step. Firebase set the outcome to FAILURE (which I'm not convinced is the right way to handle this). Keep in mind that this is different from Firebase's retry mechanism. They note:

  • The entire test execution runs again when a failure is detected. There’s no support for retrying only failed test cases.

So in the case of the xcodebuild parameter, if a retry is needed, it happens only for the affected test case within the Step, and FTL sets the outcome to failure. But when FTL's own retry is set, it does an entire new rerun of all the tests (a new Step).
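The two levels can be modelled in a few lines of Go. This is only a sketch of the behaviour described above, not Firebase's API: `testCaseRun`, `stepOutcome`, and `finalOutcome` are hypothetical names, and the rule in `stepOutcome` assumes the single observation (a failed-then-passed test case yields FAILURE for the Step) generalises.

```go
package main

import "fmt"

// testCaseRun is a hypothetical record of one test case inside a single
// FTL Step. Results holds every execution of that case, including
// in-step repetitions triggered by the xcodebuild retry parameter.
type testCaseRun struct {
	Name    string
	Results []bool
}

// stepOutcome models the observed behaviour: if any execution inside the
// Step failed, the Step is a failure, even when a repetition passed.
func stepOutcome(cases []testCaseRun) bool {
	for _, c := range cases {
		for _, ok := range c.Results {
			if !ok {
				return false
			}
		}
	}
	return true
}

// finalOutcome models the FTL-level retry: every element of steps is a
// complete rerun of all tests (a new Step), and only the last one counts.
func finalOutcome(steps [][]testCaseRun) bool {
	if len(steps) == 0 {
		return false
	}
	return stepOutcome(steps[len(steps)-1])
}

func main() {
	// In-step repetition: the case failed once, then passed.
	flakyThenPass := []testCaseRun{{Name: "testLogin", Results: []bool{false, true}}}
	// A full rerun in which the case passed on its only execution.
	cleanRerun := []testCaseRun{{Name: "testLogin", Results: []bool{true}}}

	fmt.Println(stepOutcome(flakyThenPass))                               // false
	fmt.Println(finalOutcome([][]testCaseRun{flakyThenPass, cleanRerun})) // true
}
```

The in-step repetition never changes which Step is being judged, while the FTL retry replaces the Step entirely, which is why "last attempt wins" only applies at the FTL level.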