pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License
11.7k stars 2.6k forks source link

Teardown order mismatch with parametrized parent fixtures and scopes #12134

Open jakkdl opened 4 months ago

jakkdl commented 4 months ago

Found during discussion of #11833 (I have reduced the size of the repro since I posted it in https://github.com/pytest-dev/pytest/pull/11833#issuecomment-1999818465)

import pytest

@pytest.fixture(scope="module", params=["a", "b"])
def fixture_1(request):
    print("setup 1", request.param)
    yield
    print("\nteardown 1", request.param)

@pytest.fixture(scope="module")
def fixture_2():
    print("setup 2")
    yield
    print("teardown 2")

def test_1(fixture_1, fixture_2): ...

output:

setup 1  a
setup 2
.
teardown 1 a
setup 1  b
.
teardown 1 b
teardown 2 # 2 is torn down last, perhaps surprisingly

If we reverse the order of the fixture parameters to test_1 we get an ordering where the stack is fully respected:

def test_1(fixture_2, fixture_1): ...
setup 2
setup 1 a
.
teardown 1 a
setup 1 b
.
teardown 1 b
teardown 2

What's happening in the first example is relatively straightforward -

  1. test_1 requests fixture_2, which is set up and its finalizer is queued.
  2. test_1 requests fixture_1[a], which is set up and its finalizer is queued.
  3. Test runs
  4. Because fixture_1 is parametrized, we run the test again.
  5. test_1 requests fixture_2 - and since it has module scope the value is cached, so no setup is run (and since #11833 no finalizer is queued, not that it would make any difference here)
  6. test_1 requests fixture_1[b], it has a different cache key, so fixture_1[a] is torn down, and fixture_1[b] is setup with its finalizer queued.
  7. Test finishes, and module is done, so we run the finalizers in reverse order they were added - which means fixture_1[b] first, then fixture_2, (and finally fixture_1[a] is attempted as its still on the stack, but thats also irrelevant here).

I'm not even sure this is a bug per se, but I could at least see it being surprising to users as the documentation does not imply that the stack order is ever disrespected:

https://docs.pytest.org/en/8.0.x/how-to/fixtures.html#note-on-finalizer-order

Finalizers are executed in a first-in-last-out order. For yield fixtures, the first teardown code to run is from the right-most fixture, i.e. the last test parameter.

https://docs.pytest.org/en/8.0.x/how-to/fixtures.html#yield-fixtures-recommended

Once the test is finished, pytest will go back down the list of fixtures, but in the reverse order, taking each one that yielded, and running the code inside it that was after the yield statement.

Possible fixes:

  1. When interpreting the fixture parameters, magically reorder fixtures such that stack teardown ordering is not broken - in our example turning the first example into the second.
    • This would probably be quite disruptive, and sounds very hard to code, so I don't think it's an option.
  2. Emit a warning when teardown ordering is possibly violated. Not super trivial to code, but when it's only controlling whether a warning is shown or not it needn't be perfect.
  3. Mention in the documentation that teardown order is not guaranteed to be the reverse, in the case of larger-scoped fixtures + parametrization.
  4. Also tear down fixture_2 when fixture_1[a] is torn down.

I don't think this is a huge problem though, as I think in most practical cases an end user can work around this issue by changing fixture order, make fixture_2 depend on fixture_1, change scoping, etc, which makes me favor option 3 and possibly also 2.

Related to #4871

charles-cooper commented 4 months ago

i just ran into this issue after checking out the fixes from https://github.com/pytest-dev/pytest/pull/11833 (specifically i am on commit 2607fe8b4706fa701925db50f7892ddff6ed2928) the following fixture setup will result in out-of-order teardown wrt setup:

import pytest

@pytest.fixture(scope='module', autouse=True, params=[7,8,9])
def fixture_autouse(request):
    print(f"SETUP FIXTURE AUTOUSE {request.param}")
    yield
    print(f"TEARDOWN FIXTURE AUTOUSE {request.param}")

@pytest.fixture(scope='module', params=[1,2,3])
def fixture_test(request):
    print(f"SETUP FIXTURE TEST {request.param}")
    yield
    print(f"TEARDOWN FIXTURE TEST {request.param}")

def test_1(fixture_test):
    pass

def test_2():
    pass

results in

tests/test_fixtures.py SETUP FIXTURE AUTOUSE 7
SETUP FIXTURE TEST 1
.TEARDOWN FIXTURE TEST 1
SETUP FIXTURE TEST 2
.TEARDOWN FIXTURE AUTOUSE 7
SETUP FIXTURE AUTOUSE 8
.TEARDOWN FIXTURE TEST 2
SETUP FIXTURE TEST 1
.TEARDOWN FIXTURE TEST 1
SETUP FIXTURE TEST 3
.TEARDOWN FIXTURE AUTOUSE 8
SETUP FIXTURE AUTOUSE 7
.TEARDOWN FIXTURE AUTOUSE 7
SETUP FIXTURE AUTOUSE 9
.TEARDOWN FIXTURE TEST 3
SETUP FIXTURE TEST 2
.TEARDOWN FIXTURE TEST 2
SETUP FIXTURE TEST 1
..TEARDOWN FIXTURE AUTOUSE 9
SETUP FIXTURE AUTOUSE 8
.TEARDOWN FIXTURE AUTOUSE 8
SETUP FIXTURE AUTOUSE 7
.TEARDOWN FIXTURE AUTOUSE 7
TEARDOWN FIXTURE TEST 1

i think it's extremely useful to have guaranteed fifo order to teardown wrt setup. for example, these context managers must respect stack in/out ordering: https://github.com/vyperlang/titanoboa/blob/f783a32fbed7a98112acfc494a78e27d1388fa66/boa/test/plugin.py#L54-L63, and running the finalizers out of order is a bug.

charles-cooper commented 4 months ago

in this case i think the issue i am having has more to do with the fact that SETUP FIXTURE TEST 2 is happening before TEARDOWN FIXTURE AUTOUSE 7.

jakkdl commented 4 months ago

Yeah I don't think there's actually any teardown order mismatch in your example? And SETUP FIXTURE TEST 2 happens at an entirely different time point than TEARDOWN FIXTURE AUTOUSE 7, so pytest can't simply reorder those.

(running with -v, and slightly editing the paste for readability)

testing/python/test_charles_cooper_issue.py::test_1[7-1]
SETUP FIXTURE AUTOUSE 7
SETUP FIXTURE TEST 1
PASSED

testing/python/test_charles_cooper_issue.py::test_1[7-2]
TEARDOWN FIXTURE TEST 1
SETUP FIXTURE TEST 2
PASSED

testing/python/test_charles_cooper_issue.py::test_1[8-2]
TEARDOWN FIXTURE AUTOUSE 7
SETUP FIXTURE AUTOUSE 8
PASSED

testing/python/test_charles_cooper_issue.py::test_1[8-1]
TEARDOWN FIXTURE TEST 2
SETUP FIXTURE TEST 1
PASSED

testing/python/test_charles_cooper_issue.py::test_1[8-3]
TEARDOWN FIXTURE TEST 1
SETUP FIXTURE TEST 3
PASSED

testing/python/test_charles_cooper_issue.py::test_1[7-3]
TEARDOWN FIXTURE AUTOUSE 8
SETUP FIXTURE AUTOUSE 7
PASSED

testing/python/test_charles_cooper_issue.py::test_1[9-3]
TEARDOWN FIXTURE AUTOUSE 7
SETUP FIXTURE AUTOUSE 9
PASSED

testing/python/test_charles_cooper_issue.py::test_1[9-2]
TEARDOWN FIXTURE TEST 3
SETUP FIXTURE TEST 2
PASSED

testing/python/test_charles_cooper_issue.py::test_1[9-1]
TEARDOWN FIXTURE TEST 2
SETUP FIXTURE TEST 1
PASSED

testing/python/test_charles_cooper_issue.py::test_2[9]
PASSED
testing/python/test_charles_cooper_issue.py::test_2[8]
TEARDOWN FIXTURE AUTOUSE 9
SETUP FIXTURE AUTOUSE 8
PASSED

testing/python/test_charles_cooper_issue.py::test_2[7]
TEARDOWN FIXTURE AUTOUSE 8
SETUP FIXTURE AUTOUSE 7
PASSED

TEARDOWN FIXTURE AUTOUSE 7
TEARDOWN FIXTURE TEST 1

So maybe what's surprising you is pytest reordering the parametrization to minimize the number of required setup & teardowns, which I feel like I've read about somewhere but I can't find it now. Regardless, I think discussion about that should be continued in a different issue.

RonnyPfannschmidt commented 4 months ago

fixtures that dont share dependencies are not considered stacked

in fact they are considered independent to minimize the amount of setups/teardowns and the goal is to minimize setup/teardown of those fixtures

with a larger set of fixtures the set of minimal setups/teardowns can be slightly confusing because the reordering makes it slightly more effective than just stacked

charles-cooper commented 4 months ago

i think considering them as stacked may be useful, as there are cases where setup/teardown needs to be stacked. or at least there could be some way to "request" stacking.

RonnyPfannschmidt commented 4 months ago

stacking is requested by dependency

jakkdl commented 3 months ago

Yeah I don't think there's actually any teardown order mismatch in your example? And SETUP FIXTURE TEST 2 happens at an entirely different time point than TEARDOWN FIXTURE AUTOUSE 7, so pytest can't simply reorder those.

Sorry, re-reading the comments now I do think this is the same "problem" - which as Ronny says is more of a feature than a bug. So we probably should just update the docs to mention that teardown ordering is not guaranteed to be opposite of setup order (due to minimization of setup/teardowns); and to suggest adding dependencies if that is a hard requirement in parametrization. (and indeed, adding fixture_2 as a dependency to fixture_1 would resolve my original example)