pytest-dev / pytest-forked

extracted --boxed from pytest-xdist to ensure backward compat
MIT License

tests work in 6.2.5 but fail in 7.x.x #67

Open Andrei-Pozolotin opened 2 years ago

Andrei-Pozolotin commented 2 years ago

Tests marked with @pytest.mark.forked that work in pytest v6.2.5 fail in pytest v7.x.x with the message:

AssertionError: previous item was not torn down properly

see: https://github.com/pytest-dev/pytest/issues/9621
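
For context, a minimal hedged sketch of the kind of suite that hits this (names are illustrative, not the reporter's code; the exact trigger conditions are pinned down later in the thread):

```python
import pytest

@pytest.mark.forked
def test_runs_in_a_fork():
    assert True

def test_runs_normally():
    # on pytest 7.x a later, non-forked test like this one can fail at setup
    # with "AssertionError: previous item was not torn down properly"
    assert True
```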

Dreamsorcerer commented 1 year ago

Seeing the same failure in aiohttp-devtools: https://github.com/aio-libs/aiohttp-devtools/pull/489. It seems to fail in the setup of the first non-forked test following a forked test (but only once).

webknjaz commented 1 year ago

@Dreamsorcerer I tried debugging this several times, but I lack the knowledge of the pytest internals to get to the bottom of it. So if you ever end up finding out the root cause, make sure to send a PR.

RonnyPfannschmidt commented 1 year ago

this is most likely related to the fact that pytest-forked does crazy things to SetupState in a forked process

since pytest internals are more strict now, it trips

frenzymadness commented 1 year ago

It'd be great to solve this, as pytest 7 adoption is spreading.

frenzymadness commented 1 year ago

The errors I have might be different from the original report. I'm using pytest 7.2.0 and the result is:

platform linux -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0
rootdir: /builddir/build/BUILD/pytest-forked-1.4.0, configfile: tox.ini
plugins: forked-1.4.0
collected 10 items

testing/test_boxed.py ...xx.                                             [ 60%]
testing/test_xfail_behavior.py F.F.                                      [100%]

=================================== FAILURES ===================================
___________________________ test_xfail[strict xfail] ___________________________

is_crashing = True, is_strict = True
testdir = <Testdir local('/tmp/pytest-of-mockbuild/pytest-0/test_xfail0')>

    @pytest.mark.parametrize(
        ("is_crashing", "is_strict"),
        (
            pytest.param(True, True, id="strict xfail"),
            pytest.param(False, True, id="strict xpass"),
            pytest.param(True, False, id="non-strict xfail"),
            pytest.param(False, False, id="non-strict xpass"),
        ),
    )
    def test_xfail(is_crashing, is_strict, testdir):
        """Test xfail/xpass/strict permutations."""
        # pylint: disable=possibly-unused-variable
        sig_num = signal.SIGTERM.numerator

        test_func_body = (
            "os.kill(os.getpid(), signal.SIGTERM)" if is_crashing else "assert True"
        )

        if is_crashing:
            # marked xfailed and crashing, no matter strict or not
            expected_letter = "x"  # XFAILED
            expected_lowercase = "xfailed"
            expected_word = "XFAIL"
        elif is_strict:
            # strict and not failing as expected should cause failure
            expected_letter = "F"  # FAILED
            expected_lowercase = "failed"
            expected_word = FAILED_WORD
        elif not is_strict:
            # non-strict and not failing as expected should cause xpass
            expected_letter = "X"  # XPASS
            expected_lowercase = "xpassed"
            expected_word = "XPASS"

        session_start_title = "*==== test session starts ====*"
        loaded_pytest_plugins = "plugins: forked*"
        collected_tests_num = "collected 1 item"
        expected_progress = "test_xfail.py {expected_letter!s}*".format(**locals())
        failures_title = "*==== FAILURES ====*"
        failures_test_name = "*____ test_function ____*"
        failures_test_reason = "[XPASS(strict)] The process gets terminated"
        short_test_summary_title = "*==== short test summary info ====*"
        short_test_summary = "{expected_word!s} test_xfail.py::test_function".format(
            **locals()
        )
        if expected_lowercase == "xpassed":
            # XPASS wouldn't have the crash message from
            # pytest-forked because the crash doesn't happen
            short_test_summary = " ".join(
                (
                    short_test_summary,
                    "The process gets terminated",
                )
            )
        reason_string = (
            "  reason: The process gets terminated; "
            "pytest-forked reason: "
            "*:*: running the test CRASHED with signal {sig_num:d}".format(**locals())
        )
        total_summary_line = "*==== 1 {expected_lowercase!s} in 0.*s* ====*".format(
            **locals()
        )

        expected_lines = (
            session_start_title,
            loaded_pytest_plugins,
            collected_tests_num,
            expected_progress,
        )
        if expected_word == FAILED_WORD:
            # XPASS(strict)
            expected_lines += (
                failures_title,
                failures_test_name,
                failures_test_reason,
            )
        expected_lines += (
            short_test_summary_title,
            short_test_summary,
        )
        if expected_lowercase == "xpassed" and expected_word == FAILED_WORD:
            # XPASS(strict)
            expected_lines += (reason_string,)
        expected_lines += (total_summary_line,)

        test_module = testdir.makepyfile(
            f"""
            import os
            import signal

            import pytest

            # The current implementation emits RuntimeWarning.
            pytestmark = pytest.mark.filterwarnings('ignore:pytest-forked xfail')

            @pytest.mark.xfail(
                reason='The process gets terminated',
                strict={is_strict!s},
            )
            @pytest.mark.forked
            def test_function():
                {test_func_body!s}
            """
        )

        pytest_run_result = testdir.runpytest(test_module, "-ra")
>       pytest_run_result.stdout.fnmatch_lines(expected_lines)
E       Failed: fnmatch: '*==== test session starts ====*'
E          with: '============================= test session starts =============================='
E       nomatch: 'plugins: forked*'
E           and: 'platform linux -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0'
E           and: 'rootdir: /tmp/pytest-of-mockbuild/pytest-0/test_xfail0'
E       fnmatch: 'plugins: forked*'
E          with: 'plugins: forked-1.4.0'
E       exact match: 'collected 1 item'
E       nomatch: 'test_xfail.py x*'
E           and: ''
E       fnmatch: 'test_xfail.py x*'
E          with: 'test_xfail.py x                                                          [100%]'
E       nomatch: '*==== short test summary info ====*'
E           and: ''
E       fnmatch: '*==== short test summary info ====*'
E          with: '=========================== short test summary info ============================'
E       nomatch: 'XFAIL test_xfail.py::test_function'
E           and: 'XFAIL test_xfail.py::test_function - reason: The process gets terminated; pytest-forked reason: :-1: running the test CRASHED with signal 15'
E           and: '============================== 1 xfailed in 0.01s =============================='
E       remains unmatched: 'XFAIL test_xfail.py::test_function'

/builddir/build/BUILD/pytest-forked-1.4.0/testing/test_xfail_behavior.py:122: Failed
----------------------------- Captured stdout call -----------------------------
============================= test session starts ==============================
platform linux -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0
rootdir: /tmp/pytest-of-mockbuild/pytest-0/test_xfail0
plugins: forked-1.4.0
collected 1 item

test_xfail.py x                                                          [100%]

=========================== short test summary info ============================
XFAIL test_xfail.py::test_function - reason: The process gets terminated; pytest-forked reason: :-1: running the test CRASHED with signal 15
============================== 1 xfailed in 0.01s ==============================
_________________________ test_xfail[non-strict xfail] _________________________

is_crashing = True, is_strict = False
testdir = <Testdir local('/tmp/pytest-of-mockbuild/pytest-0/test_xfail2')>

    @pytest.mark.parametrize(
        ("is_crashing", "is_strict"),
        (
            pytest.param(True, True, id="strict xfail"),
            pytest.param(False, True, id="strict xpass"),
            pytest.param(True, False, id="non-strict xfail"),
            pytest.param(False, False, id="non-strict xpass"),
        ),
    )
    def test_xfail(is_crashing, is_strict, testdir):
        """Test xfail/xpass/strict permutations."""
        # pylint: disable=possibly-unused-variable
        sig_num = signal.SIGTERM.numerator

        test_func_body = (
            "os.kill(os.getpid(), signal.SIGTERM)" if is_crashing else "assert True"
        )

        if is_crashing:
            # marked xfailed and crashing, no matter strict or not
            expected_letter = "x"  # XFAILED
            expected_lowercase = "xfailed"
            expected_word = "XFAIL"
        elif is_strict:
            # strict and not failing as expected should cause failure
            expected_letter = "F"  # FAILED
            expected_lowercase = "failed"
            expected_word = FAILED_WORD
        elif not is_strict:
            # non-strict and not failing as expected should cause xpass
            expected_letter = "X"  # XPASS
            expected_lowercase = "xpassed"
            expected_word = "XPASS"

        session_start_title = "*==== test session starts ====*"
        loaded_pytest_plugins = "plugins: forked*"
        collected_tests_num = "collected 1 item"
        expected_progress = "test_xfail.py {expected_letter!s}*".format(**locals())
        failures_title = "*==== FAILURES ====*"
        failures_test_name = "*____ test_function ____*"
        failures_test_reason = "[XPASS(strict)] The process gets terminated"
        short_test_summary_title = "*==== short test summary info ====*"
        short_test_summary = "{expected_word!s} test_xfail.py::test_function".format(
            **locals()
        )
        if expected_lowercase == "xpassed":
            # XPASS wouldn't have the crash message from
            # pytest-forked because the crash doesn't happen
            short_test_summary = " ".join(
                (
                    short_test_summary,
                    "The process gets terminated",
                )
            )
        reason_string = (
            "  reason: The process gets terminated; "
            "pytest-forked reason: "
            "*:*: running the test CRASHED with signal {sig_num:d}".format(**locals())
        )
        total_summary_line = "*==== 1 {expected_lowercase!s} in 0.*s* ====*".format(
            **locals()
        )

        expected_lines = (
            session_start_title,
            loaded_pytest_plugins,
            collected_tests_num,
            expected_progress,
        )
        if expected_word == FAILED_WORD:
            # XPASS(strict)
            expected_lines += (
                failures_title,
                failures_test_name,
                failures_test_reason,
            )
        expected_lines += (
            short_test_summary_title,
            short_test_summary,
        )
        if expected_lowercase == "xpassed" and expected_word == FAILED_WORD:
            # XPASS(strict)
            expected_lines += (reason_string,)
        expected_lines += (total_summary_line,)

        test_module = testdir.makepyfile(
            f"""
            import os
            import signal

            import pytest

            # The current implementation emits RuntimeWarning.
            pytestmark = pytest.mark.filterwarnings('ignore:pytest-forked xfail')

            @pytest.mark.xfail(
                reason='The process gets terminated',
                strict={is_strict!s},
            )
            @pytest.mark.forked
            def test_function():
                {test_func_body!s}
            """
        )

        pytest_run_result = testdir.runpytest(test_module, "-ra")
>       pytest_run_result.stdout.fnmatch_lines(expected_lines)
E       Failed: fnmatch: '*==== test session starts ====*'
E          with: '============================= test session starts =============================='
E       nomatch: 'plugins: forked*'
E           and: 'platform linux -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0'
E           and: 'rootdir: /tmp/pytest-of-mockbuild/pytest-0/test_xfail2'
E       fnmatch: 'plugins: forked*'
E          with: 'plugins: forked-1.4.0'
E       exact match: 'collected 1 item'
E       nomatch: 'test_xfail.py x*'
E           and: ''
E       fnmatch: 'test_xfail.py x*'
E          with: 'test_xfail.py x                                                          [100%]'
E       nomatch: '*==== short test summary info ====*'
E           and: ''
E       fnmatch: '*==== short test summary info ====*'
E          with: '=========================== short test summary info ============================'
E       nomatch: 'XFAIL test_xfail.py::test_function'
E           and: 'XFAIL test_xfail.py::test_function - reason: The process gets terminated; pytest-forked reason: :-1: running the test CRASHED with signal 15'
E           and: '============================== 1 xfailed in 0.01s =============================='
E       remains unmatched: 'XFAIL test_xfail.py::test_function'

/builddir/build/BUILD/pytest-forked-1.4.0/testing/test_xfail_behavior.py:122: Failed
----------------------------- Captured stdout call -----------------------------
============================= test session starts ==============================
platform linux -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0
rootdir: /tmp/pytest-of-mockbuild/pytest-0/test_xfail2
plugins: forked-1.4.0
collected 1 item

test_xfail.py x                                                          [100%]

=========================== short test summary info ============================
XFAIL test_xfail.py::test_function - reason: The process gets terminated; pytest-forked reason: :-1: running the test CRASHED with signal 15
============================== 1 xfailed in 0.01s ==============================
=========================== short test summary info ============================
FAILED testing/test_xfail_behavior.py::test_xfail[strict xfail] - Failed: fnm...
FAILED testing/test_xfail_behavior.py::test_xfail[non-strict xfail] - Failed:...
XFAIL testing/test_boxed.py::test_functional_boxed_capturing[sys] - capture cleanup needed
XFAIL testing/test_boxed.py::test_functional_boxed_capturing[fd] - capture cleanup needed
==================== 2 failed, 6 passed, 2 xfailed in 0.44s ====================
webknjaz commented 1 year ago

@frenzymadness yes, that is different. In your case, it's the test itself that doesn't match the expected output.

swt2c commented 1 year ago

The error that @frenzymadness noted should be fixed by #74.

teocasse commented 1 year ago

Hi, is there any plan to address this? As pytest 7.x.x usage is spreading, this could become a more serious blocker down the road.

RonnyPfannschmidt commented 1 year ago

currently I'm not aware of anyone wanting to pick this up

teocasse commented 1 year ago

Pity, but thanks for the quick reply.

Just want to add that in my case I get the AssertionError even with pytest 6.x.x (I tested several versions). The only difference is that the assertion carries no message:

self = <_pytest.runner.SetupState object at 0x1062eb1c0>, colitem = <UnitTestCase TestLoggingConfig>

    def _teardown_with_finalization(self, colitem) -> None:
        self._callfinalizers(colitem)
        colitem.teardown()
        for colitem in self._finalizers:
>           assert colitem in self.stack
E           AssertionError

../../.virtualenvs/my_venv/lib/python3.8/site-packages/_pytest/runner.py:391: AssertionError
============================================================================================== short test summary info ===============================================================================================
ERROR tests/core/utils/test_logging.py::TestLoggingConfig::test_logging - AssertionError

My current workaround is to revert to pytest 5.x.x, with which the tests pass... but that clearly cannot be a long-term solution; eventually we will have to move to newer pytest versions. Unfortunately, I need pytest-forked for a few tests that have to run in a subprocess.

arr-ee commented 5 months ago

I tried digging into this, and there is some progress.

tl;dr: as a workaround, make sure you don't have modules/classes that mix forked and non-forked tests in a way that leaves a forked test as the last one to run. Isolate forked tests into separate module(s), move non-forked tests below the forked ones, or just add an empty non-forked test after the forked ones (see the sketch below). That's it.
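
For instance, the last option can look like this (a minimal sketch with made-up names, not code from the thread):

```python
import pytest

@pytest.mark.forked
def test_something_forked():
    assert True

def test_zz_teardown_in_parent():
    """Empty non-forked test placed last so module teardown runs in the main process."""
```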


So, the crux of the issue is that pytest started doing more validation of its internal state (SetupState), which exposed a design flaw in pytest-forked.

Compare pytest 7 and pytest 6: the logic is roughly the same, but pytest 7 adds a bunch of sanity checks on top. This is good.

What pytest-forked does is run the entire test protocol in the subprocess. This includes setup/teardown, which are the bits responsible for maintaining SetupState (among other things). The trouble is that running in the subprocess means any changes to shared state never make it back to the parent process: from the parent's point of view, the setup/teardown phases never ran.
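
A simplified illustration of that last point (this is not pytest-forked's actual code; a plain list stands in for pytest's SetupState stack):

```python
import os

# Pushed earlier by a non-forked test running in the parent process.
setup_stack = ["Session", "Module testing/test_issue67_a.py"]

pid = os.fork()
if pid == 0:
    # Child: the forked test's whole protocol runs here, including the
    # teardown that pops the module (because the next test lives elsewhere).
    setup_stack.pop()
    os._exit(0)
os.waitpid(pid, 0)

# Parent: the pop only happened in the child's copy of memory, so the stale
# module is still on the stack when the next non-forked test is set up.
print(setup_stack)  # ['Session', 'Module testing/test_issue67_a.py']
```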

I put together a repro case walking through the issue. Quick recap:

We have two modules and three tests total (four if we include the workaround); a code sketch follows the list:

  1. test_issue67_a.py with two (three) tests:
     a. test_asetup_for_failure is a non-forked test that populates the SetupState stack with Session (irrelevant) and <Module testing/test_issue67_a.py>; this is the first step.
     b. test_yyy is a forked test that runs setup/teardown inside the forked process, leaving the main process in a bad state.
     c. (workaround) test_zcleanup is a non-forked test that cleans up the state for the module, thus remediating the issue.
  2. test_issue67_z.py with one test:
     a. test_no_forky is a non-forked test that gets a SetupState still containing <Module testing/test_issue67_a.py> and fails the sanity check unless the workaround was applied.
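
A minimal sketch of that layout (the file and test names come from the list above; the trivial test bodies are my assumption):

```python
# testing/test_issue67_a.py
import pytest

def test_asetup_for_failure():
    # non-forked: pushes <Module testing/test_issue67_a.py> onto SetupState
    assert True

@pytest.mark.forked
def test_yyy():
    # forked: the teardown that would pop the module runs only in the child
    assert True

# (workaround) uncomment so the module gets popped in the main process:
# def test_zcleanup():
#     assert True
```

```python
# testing/test_issue67_z.py
def test_no_forky():
    # without the workaround, pytest 7 fails during this test's setup with
    # "AssertionError: previous item was not torn down properly"
    assert True
```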

The way teardown works is that pytest takes the "current" and "next" tests and tears down the stack (which should match the collector list of the "current" test) until it matches the collector list of the "next" test. What this means for us is that as long as both tests are in the same module, that module remains on the stack. Once the "next" test comes from another module, the module of the "current" test is removed from the stack.
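
Roughly, in pseudo-Python (this is not pytest's actual implementation, just the rule described above):

```python
def teardown_towards(stack, next_chain):
    """Pop (and tear down) stack entries until the stack is a prefix of the
    next test's collector chain, e.g. [Session, Module, Item]."""
    while stack and stack != next_chain[: len(stack)]:
        stack.pop()  # tearing down the popped collector happens here

# Next test in the same module: the module stays on the stack.
stack = ["Session", "Module a.py", "test_one"]
teardown_towards(stack, ["Session", "Module a.py", "test_two"])
print(stack)  # ['Session', 'Module a.py']

# Next test in another module: the module is popped.
teardown_towards(stack, ["Session", "Module z.py", "test_other"])
print(stack)  # ['Session']
```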

Now, back to the repro case:

  1. test_asetup_for_failure pushes the module testing/test_issue67_a.py onto the stack. Since the next test, test_yyy, is from the same module, that module remains on the stack after teardown.
  2. test_yyy pushes itself onto the stack; during teardown, pytest sees that the next test, test_no_forky, comes from a different module, so it pops the module testing/test_issue67_a.py from the stack. Except this happens in the subprocess, not in the main process. Oops.
  3. test_no_forky then runs in the main process, where the stack is stale, and the sanity check fails.

From this description we can also see why the workarounds listed in the tl;dr above help.

I tried a couple of approaches to fixing the state from within the library, but my pytest knowledge is lacking. So far the only potential approach I see is moving setup/teardown back into the main process, which is a breaking change for anyone who relied on forks for resource/state (mis)management. I'll play some more with it, and if it works with some existing codebases I'll open a PR.
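
For what it's worth, here is one conceptual sketch of that direction (an assumption on my part, not pytest-forked's code, and it drops the detailed crash/signal reporting the plugin currently produces): a conftest-level hookwrapper that keeps setup/teardown in the parent and forks only around the test body.

```python
import os
import pytest

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
    # Swap out the test body so only the "call" phase runs in a fork;
    # setup/teardown (and SetupState bookkeeping) stay in the parent.
    original_runtest = item.runtest

    def forked_runtest():
        pid = os.fork()
        if pid == 0:  # child: run the real test body and exit with a status
            status = 0
            try:
                original_runtest()
            except BaseException:
                status = 1
            os._exit(status)
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            pytest.fail("test failed or crashed in the forked child", pytrace=False)

    item.runtest = forked_runtest
    try:
        yield
    finally:
        item.runtest = original_runtest
```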

webknjaz commented 3 months ago

~FTR here's a forked plugin re-implementation that doesn't seem to suffer from this flaw: https://github.com/ansible/ansible/pull/80525.~ Nevermind. It's probably affected the same way.