Closed — victorb closed this issue 6 years ago
we could probably store key/value of repository+commit+platform and status, if it's already successful, just report it's successful again and skip actually running the tests
you'll still rerun the entire thing, but already successful steps would be pretty much instant
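A minimal sketch of that idea, assuming a simple key/value store; the names (`ResultCache`, `run_or_skip`) are hypothetical, not from any existing tool:

```python
# Hypothetical sketch: cache (repository, commit, platform) -> status,
# and skip the run when a previous attempt already succeeded.

class ResultCache:
    def __init__(self):
        self.store = {}  # (repo, commit, platform) -> "success" | "failure"

    def run_or_skip(self, repo, commit, platform, run_tests):
        key = (repo, commit, platform)
        if self.store.get(key) == "success":
            # Already passed for this exact commit+platform: report
            # success again without actually running the tests.
            return "success"
        status = run_tests()
        self.store[key] = status
        return status
```

On a restart, every step that already succeeded hits the cache and returns almost instantly; only the previously failing steps actually run.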
@VictorBjelkholm I think this isn't the best path forward; it doesn't scale. Let's say you have one test that fails 30% of the time. That's OK, you'll have to restart roughly 1 out of 3 runs (the expected number of runs until success is about 1.43), but then you get a second test like that, then a 3rd, maybe a 4th.
With 4 such tests, you would expect the suite to succeed only after about 4 tries, and with 7 it gets up to about 12 tries.
The worst part is if you have many tests that each have a 1% chance of failure plus one frequently failing test (30%): the effect compounds.
I would suggest starting to use the JUnit exporter as go-ipfs does, tracking the tests that fail most frequently, and fixing them.
Random test failures are very problematic because we don't know whether CI provides good or bad conditions for them. Maybe in the real world the conditions are worse than in our CI. Then it is a real issue, not a flaky test, yet we are ignoring it.
The formula for the expected number of tries is:
E[tries] = 1 / P(success)
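The numbers above follow directly from that formula, assuming the tests fail independently, so P(success) is the product of each test's individual pass rate:

```python
# Expected number of full-suite runs until every test passes in the
# same run: E[tries] = 1 / P(success) = 1 / prod(1 - p_fail_i).
from math import prod

def expected_tries(fail_rates):
    p_success = prod(1 - p for p in fail_rates)
    return 1 / p_success

print(round(expected_tries([0.3]), 2))      # one 30%-flaky test -> 1.43
print(round(expected_tries([0.3] * 4), 2))  # four such tests    -> 4.16
print(round(expected_tries([0.3] * 7), 2))  # seven such tests   -> 12.14
```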
Thanks for the input @Kubuxu! Solutions mentioned here are more like workarounds rather than proper long-term solutions.
I agree that we should focus on solving the test flakiness. We have some utilities for fixing things, like retrying failing tests up to X times before actually failing the test case.
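A minimal sketch of such a "retry up to X times" helper; the decorator name and interface here are assumptions for illustration, not the actual utility:

```python
# Hypothetical retry decorator: rerun a flaky test up to `times` attempts,
# and only fail the test case if every attempt fails.
import functools

def retry(times=3):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(times):
                try:
                    return fn(*args, **kwargs)
                except AssertionError as exc:
                    last_exc = exc  # flaky failure: try again
            raise last_exc  # all attempts failed: fail the test case
        return wrapper
    return decorator
```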
But even with that, we can have temporary failures on one platform + version even when the tests are not flaky, where a retry of just that platform + version would help.
My concern is that if something like this were implemented, we would naturally be optimising for a "green CI" metric, and fixing flaky tests is outside the scope of that metric.
This means those tests will linger for a very long time and cause problems.
I hear your concern, but our focus is not on green CI but rather on passing tests, both locally and on CI. But there are cases where a failure is not the tests' fault but rather the tooling's or something else's. This feature would be for those cases.
This issue was moved to ipfs/testing#103
If you have a job that tests on many versions and OSes, and only one of them fails, you currently need to retry the entire job. Instead, we should have a parameter that can be used to select just one of the versions and retry only that part in a new job.
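The idea can be sketched as reconciling a matrix of results, rerunning only the failed cells; `run_cell` and the matrix shape are assumptions, not any real CI API:

```python
# Hypothetical sketch: given the results of a test matrix, rerun only
# the (os, version) cells that did not succeed, leaving passing cells alone.

def rerun_failed(matrix_results, run_cell):
    # matrix_results: dict of (os, version) -> "success" | "failure"
    for cell, status in matrix_results.items():
        if status != "success":
            matrix_results[cell] = run_cell(*cell)  # retry just this cell
    return matrix_results
```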