Open herwinw opened 5 days ago
Sounds like a good idea! Instead of marking the test as passed, we could also retry the test a few times and if it still fails mark them as flaky. This is something we do at work for our (not so perfect) test suite as well. What do you think? Maybe we could show the number of tries on the spec website to have some statistics about this.
I dig it
The retry idea would change the requirements a bit. We might depend on the before/after hooks for setup/teardown, So this means we have two options.
Option 1: wrap the it
block in the NATUNRELIABLE
block:
NATUNRELIABLE 'occasional failure in CI', exception: Errno::EAGAIN, message: 'Resource temporarily unavailable' do
it "reads data from the connection" do
# Actual test
end
end
Option 2: embed the information in the `it block:
it "reads data from the connection", NATUNRELIABLE: true, exception: Errno::EAGAIN, message: 'Resource temporarily unavailable' do
# Actual test
end
# Or alternatively: replace the it block
NATUNRELIABLE "reads data from the connection", exception: Errno::EAGAIN, message: 'Resource temporarily unavailable' do
# Actual test
end
I would guess the second option is easier to implement (no need to check if we're in the right scope to allow a NATUNRELIABLE block), and it keeps the indentation of the file closer to upstream, so it's easier to check for differences.
I like the second option more because it makes it clear that it only affects a single test. The alternative to replace the it
method has some disadvantages, for example you cannot use fit
or you can't grep for tests using it "..."
. Maybe the API could be something like:
it "reads data from the connection", NATUNRELIABLE: { exception: Errno::EAGAIN, message: 'Resource temporarily unavailable' } do
# Actual test
end
Similar to
NATFIXME
, I would like to have aNATUNRELIABLE
block wrapper for the specs. This could be used in places where we accept the failures, but don't require it. The only thing we can do now is trying to disable the test, which means we never run it.For example: the output of https://github.com/natalie-lang/natalie/actions/runs/11729172599/job/32674223781:
We could wrap this spec like this:
The semantics should be similar to
NATFIXME
: catch anErrno::EAGAIN
with the messageResource temporarily unavailable
would still make the test pass, as does throwing nothing. Any other exception would still fail the test. No exception or message argument means we don't check the output at all.As a future work: it would be nice to get some statistics of these CI runs, so we could be notified if a test is passing reliably again.
(Related: #2307)