Closed AliSoftware closed 1 year ago
@mokagio I spent some time (most of my afternoon, tbh 😅 ) testing things around and improved the changes significantly:
- […] `test/failure-retry-junit-report` branch on the WPiOS repo for testing purposes:
  - […] `pipeline.yml` and make it point to this `fix/annotate_test_failures` branch of the a8c-ci-toolkit-buildkite-plugin
  - […] `report.junit` XML file that represented a typical report for the various cases you raised, which I then used as a base to test and improve my `annotate_test_failures` script here.

The improvements I made include:
- […] `TestFailure` Ruby class to make handling of failure nodes easier to manipulate
- […] `TestFailure` is about, allowing us to distinguish the number of distinct failed tests, the number of distinct failed assertions (which can be different if a test had multiple assertion failures), and the number of times each failure was reported (for cases of retries)
- […] `warning` annotation, in addition to only reporting true failures in the `error` annotation.

@mokagio Given all the changes I've made after all that intense testing and tweaking, I'm interested in your re-review! (Note: I've updated the testing instructions in the PR description)
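To make the three counts mentioned in the list above more concrete (distinct failed tests, distinct failed assertions, and how many times each failure was reported), here is a minimal sketch. The `TestFailure` struct below is a hypothetical stand-in, not the class from this PR: its fields (`classname`, `name`, `message`) and the sample data are assumptions chosen for illustration.

```ruby
# Hypothetical stand-in for the TestFailure class; the real class's fields may differ.
TestFailure = Struct.new(:classname, :name, :message) do
  # Identifier of the test itself, regardless of which assertion failed.
  def test_id
    "#{classname}.#{name}"
  end
end

# Made-up failures, as they could be collected from a report that includes retries.
reported = [
  TestFailure.new('DemoTests', 'testA', 'XCTAssertTrue failed'),
  TestFailure.new('DemoTests', 'testA', 'XCTAssertTrue failed'),  # same assertion, reported again by a retry
  TestFailure.new('DemoTests', 'testA', 'XCTAssertEqual failed'), # second assertion in the same test
  TestFailure.new('DemoTests', 'testB', 'XCTAssertNil failed')
]

# Struct instances compare and hash by value, so `tally` merges identical failures.
occurrences = reported.tally
distinct_assertions = occurrences.size
distinct_tests = reported.map(&:test_id).uniq.size

puts "#{distinct_tests} failed tests, #{distinct_assertions} distinct assertion failures"
occurrences.each { |failure, count| puts "#{failure.test_id}: #{failure.message} (reported #{count}x)" }
```

Relying on value equality is what lets a single `tally` call separate "which assertions failed" from "how often each was reported".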
While commenting on reporting the return code from the test task rather than from the annotation command, I realized that nothing platform-specific forces us to run the annotation logic on the macOS agent.
We could move it to a cheaper agent with Ruby support, but:
What?
Fixes `annotate_test_failures` command, to ignore cases of flaky retries which ultimately succeeded.

This bug was the cause for some CI builds to still get an annotation mentioning test failures… even when the corresponding CI step ended up green (See this example build):
(cc @mokagio & @crazytonyli as I talked about this bug very recently with both of you, in P2s and PR comments respectively)
Why?
When a flaky test fails but Xcode is configured to auto-retry tests multiple times on failure, all the intermediate failures are recorded in the `report.junit` alongside the final success (if any). For example this is an extract of such a `.junit` report:

The previous version of our script was not smart enough to detect that: it simply extracted all `testcase` nodes that had a `failure` subnode, and built the annotation from that list of failures. This is why it erroneously included the flaky failures.
How?
To solve this, while iterating over the `testcase[failure]` node candidates, we now find all the sibling nodes of that `testcase` that have the same `classname` and `name` attributes, i.e. nodes reporting all the assertion failures for the same test. This list will thus include the current candidate being iterated on, but also potentially all other retries of the same test.

Once we have that list of nodes, we check whether the last one of those nodes ends up being a failure or success XML node:
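As a rough illustration of the approach described above, here is a minimal sketch using Ruby's REXML library. It is not the plugin's actual code: the `classify_failures` helper, the sample report, and the `failed`/`flaky` buckets are assumptions made for this example. For brevity it groups `testcase` nodes across the whole document rather than strictly among siblings.

```ruby
require 'rexml/document'

# Hypothetical report: testFlaky fails once then passes on retry,
# testBroken fails on every attempt, testFine passes on the first try.
SAMPLE_REPORT = <<~XML
  <testsuites>
    <testsuite name="DemoTests">
      <testcase classname="DemoTests" name="testFlaky"><failure message="XCTAssertTrue failed"/></testcase>
      <testcase classname="DemoTests" name="testFlaky"/>
      <testcase classname="DemoTests" name="testBroken"><failure message="XCTAssertEqual failed"/></testcase>
      <testcase classname="DemoTests" name="testBroken"><failure message="XCTAssertEqual failed"/></testcase>
      <testcase classname="DemoTests" name="testFine"/>
    </testsuite>
  </testsuites>
XML

# Returns the tests whose last attempt failed (:failed) and the tests that
# failed at least once but whose last attempt succeeded (:flaky).
def classify_failures(xml)
  doc = REXML::Document.new(xml)
  testcases = REXML::XPath.match(doc, '//testcase')

  # Group all attempts of the same test (same classname + name), in document order.
  attempts = testcases.group_by { |node| [node.attributes['classname'], node.attributes['name']] }

  result = { failed: [], flaky: [] }
  attempts.each do |(classname, name), nodes|
    # Skip tests that never failed.
    next if nodes.none? { |node| node.elements['failure'] }

    # The last recorded attempt decides: still failing, or recovered on retry.
    bucket = nodes.last.elements['failure'] ? :failed : :flaky
    result[bucket] << "#{classname}.#{name}"
  end
  result
end

result = classify_failures(SAMPLE_REPORT)
puts "Failed: #{result[:failed].join(', ')}"
puts "Flaky: #{result[:flaky].join(', ')}"
```

With this sample, only `DemoTests.testBroken` lands in the failed bucket, while `DemoTests.testFlaky` is kept apart as flaky.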
Testing
WPiOS Demo build with ❌ failed + ⚠️ flaky tests
WPiOS Demo build with ⚠️ only flaky tests
WPiOS Demo build with ❌ only failed tests