pytest-dev / pytest


Allow hooks to retry a single test case multiple times with fresh fixtures #12939

Open bcmills opened 1 week ago

bcmills commented 1 week ago

What's the problem this feature will solve?

Tests of APIs that rely on timer / timeout behaviors currently have to choose one (or both!) of {slow, flaky}:

I would like not to have to choose between those two: I want the test to run quickly, but to be retried automatically if the timeout turns out to be too short.

Describe the solution you'd like

Ideally, I would like to implement a pytest fixture that takes on the current timeout value. Each fixture that depends on it can then configure its own objects based on that timeout, and the test is run with those fixtures. If it passes, the test passes overall and is done. If it fails, the fixtures are torn down, a new (longer) timeout is selected, and a fresh set of fixtures is created with the new timeout value.

This process should be iterated until either the test passes, or the selected timeout exceeds a configured maximum.
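The loop described above can be sketched in plain Python, outside of pytest. This is only an illustration of the requested semantics, not an existing pytest API: `run_time_sensitive`, `slow_service`, and `check_reply` are hypothetical names, a context manager stands in for a fixture, and the doubling factor is an assumption.

```python
import contextlib


@contextlib.contextmanager
def slow_service(timeout_ms):
    """Hypothetical stand-in for a fixture built from the current timeout."""
    # Setup: a fresh object is created for every attempt.
    service = {"timeout_ms": timeout_ms, "closed": False}
    try:
        yield service
    finally:
        # Teardown: runs before the next attempt starts.
        service["closed"] = True


def run_time_sensitive(test, make_fixture, initial_ms, maximum_ms, factor=2):
    """Retry `test` with fresh fixtures and a growing timeout.

    Returns the timeout that passed. Re-raises the last failure once the
    next timeout would exceed `maximum_ms`.
    """
    timeout_ms = initial_ms
    while True:
        try:
            with make_fixture(timeout_ms) as fixture:
                test(fixture)
            return timeout_ms
        except AssertionError:
            if timeout_ms * factor > maximum_ms:
                raise
            timeout_ms *= factor


seen = []


def check_reply(service):
    seen.append(service)
    # Simulated flakiness: only passes once the timeout is generous enough.
    assert service["timeout_ms"] >= 40


passed_with = run_time_sensitive(check_reply, slow_service, initial_ms=10, maximum_ms=1000)
```

Each attempt gets its own `service` object, and the failed ones are torn down before the next one is built, which is the part that pytest fixtures do not currently offer for a single test item.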

In particular:

Examples of this pattern (in Go rather than Python) can be found in the Go project's net package: https://github.com/search?q=repo%3Agolang%2Fgo+%2FrunTimeSensitiveTest%5C%28%2F&type=code

Unfortunately, I don't see a way to run a pytest test a variable number of times with fresh fixtures:

Alternative Solutions

One alternative is to move all objects that depend on the configured timeout outside of pytest fixtures and into the test function itself. That works, but it severely diminishes the value of pytest fixtures for the affected test.

Another alternative is to design all objects in the hierarchy so that their timeouts can be reconfigured on-the-fly, and use a single set of fixtures for all attempts. Unfortunately, if I use any third-party libraries, that may force me to rely on implementation details to monkey-patch the timeout configuration, and even that isn't always possible.

The-Compiler commented 1 week ago

  • If the test uses a long duration for the timeout, then it ends up needing to sleep for some multiple of that long duration, and the test runs reliably but is extremely slow — say, 10s for a test function that could normally complete in <10ms.

You lost me there. Why does it need to sleep? That's not how timeouts usually work, no? I don't see the difference between running a test once with a 10s timeout, vs. running it with a 1s + 2s + 3s + 4s timeout. For at least your "a connection timeout on a networking library" example, the test will finish as soon as the server answers, and I'd argue that for many other cases the first thing to attempt is to make it work that way as well (e.g. with a polling-based API, you might still want to poll every 0.1s or something, but time out after, say, 50 attempts).
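For reference, the polling approach mentioned in parentheses could look roughly like the sketch below. `poll_until` is a hypothetical helper, not part of pytest; the 0.1s interval and 50 attempts are taken from the example figures above.

```python
import time


def poll_until(check, interval=0.1, max_attempts=50, sleep=time.sleep):
    """Call `check` until it returns a truthy value or attempts run out.

    Returns the number of attempts used on success; raises TimeoutError
    after `max_attempts` unsuccessful polls.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(interval)
    raise TimeoutError(f"condition not met after {max_attempts} attempts")


# Usage: the condition becomes true on the third poll, so the caller
# returns early instead of waiting for the full budget.
answers = iter([False, False, True])
used = poll_until(lambda: next(answers), interval=0.001)
```

This returns as soon as the condition holds, so it only helps for the success path; it does not shorten a test whose purpose is to observe the timeout itself.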

FWIW, there's pytest-rerunfailures that recreates fixtures, and seems to have a way to access the .execution_count on the test item.

There are various open issues around exposing an API around fixtures (#12630, #12376, ...), and what you describe in particular sounds a lot like a duplicate of #12596 to me.

bcmills commented 1 week ago

You lost me there. Why does it need to sleep? That's not how timeouts usually work, no?

This is for testing the cases where a call internal to the test intentionally does time out, not the case where the test itself exceeds its intended running time.

For at least your "a connection timeout on a networking library" example, the test will finish as soon as the server answers

No, you have it backwards. This is for the cases where we want the server not to answer in time.

Failure modes also need to be tested!
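A minimal sketch of that situation, using only the standard library: `wait_for_reply` and `silent_server` are hypothetical names, and a `queue.Queue` that never receives anything stands in for a server that does not answer.

```python
import queue
import time


def wait_for_reply(replies, timeout):
    """Return the next reply, or None if none arrives within `timeout` seconds."""
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        return None


# Exercise the failure path: nothing is ever put on the queue, so the call
# can only return after the full timeout has elapsed.
silent_server = queue.Queue()
timeout = 0.05
start = time.monotonic()
reply = wait_for_reply(silent_server, timeout)
elapsed = time.monotonic() - start
```

Here the test cannot finish faster than `timeout`, so a long timeout makes it slow, while a short one leaves little margin for anything in the same test that is expected to complete before the deadline.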

bcmills commented 1 week ago

FWIW, there's pytest-rerunfailures that recreates fixtures, and seems to have a way to access the .execution_count on the test item.

Looks like that one also relies on undocumented implementation details: https://github.com/pytest-dev/pytest-rerunfailures/blob/a53b9344c0d7a491a3cc53d91c7319696651d21b/src/pytest_rerunfailures.py#L499

bcmills commented 1 week ago

what you describe in particular sounds a lot like a duplicate of #12596 to me.

Yep, that does seem similar! The key difference there, I think, is that they want to run the test until it fails, whereas I want to run it until it succeeds and discard the failure logs — but those parts might already be possible if the fixture-reset problem is addressed.