tests fail unreliably with pytest direct

drew-parsons commented 3 years ago

There seems to be fragility in the pytest-mpi 0.5 tests. I'm building and running on Debian unstable. Possibly I'm not triggering the tests correctly, so let me know if that's my problem. These errors occur with python3 -m pytest -v -p pytester . I can confirm that the tests do pass when run via tox (apart from a manifest error due to the debian subdir).

I gather that --with-mpi is mandatory, and python3 -m pytest -v -p pytester is expected to fail. But I get failures with python3 -m pytest -v -p pytester --with-mpi, and a different set of failures each time depending on the location of the verbose -v flag.

I've copied the tests dir to a separate dir to ensure the installed pytest-mpi is being invoked, not the source code.

Summary, running inside the test dir:

without -v,

test_fixtures.py .FF                                                                                                                                                                                        [ 30%]
test_markers.py .F..F.F                                                                                                                                                                                     [100%]
==== 5 failed, 5 passed, 7 warnings in 25.87s ===

(fails: test_mpi_tmpdir, test_mpi_tmp_path, test_mpi_with_mpi, test_mpi_skip_under_mpi, test_mpi_xfail_under_mpi)

python3 -m pytest -v -p pytester --with-mpi,

test_fixtures.py::test_mpi_file_name PASSED                                                                                                                                                                 [ 10%]
test_fixtures.py::test_mpi_tmpdir PASSED                                                                                                                                                                    [ 20%]
test_fixtures.py::test_mpi_tmp_path FAILED                                                                                                                                                                  [ 30%]
test_markers.py::test_mpi PASSED                                                                                                                                                                            [ 40%]
test_markers.py::test_mpi_with_mpi FAILED                                                                                                                                                                   [ 50%]
test_markers.py::test_mpi_only_mpi FAILED                                                                                                                                                                   [ 60%]
test_markers.py::test_mpi_skip PASSED                                                                                                                                                                       [ 70%]
test_markers.py::test_mpi_skip_under_mpi PASSED                                                                                                                                                             [ 80%]
test_markers.py::test_mpi_xfail PASSED                                                                                                                                                                      [ 90%]
test_markers.py::test_mpi_xfail_under_mpi FAILED                                                                                                                                                            [100%]
=== 4 failed, 6 passed, 7 warnings in 28.26s ===

python3 -m pytest -p pytester -v --with-mpi,

test_fixtures.py::test_mpi_file_name FAILED                                                                                                                                                                 [ 10%]
test_fixtures.py::test_mpi_tmpdir PASSED                                                                                                                                                                    [ 20%]
test_fixtures.py::test_mpi_tmp_path FAILED                                                                                                                                                                  [ 30%]
test_markers.py::test_mpi PASSED                                                                                                                                                                            [ 40%]
test_markers.py::test_mpi_with_mpi FAILED                                                                                                                                                                   [ 50%]
test_markers.py::test_mpi_only_mpi PASSED                                                                                                                                                                   [ 60%]
test_markers.py::test_mpi_skip PASSED                                                                                                                                                                       [ 70%]
test_markers.py::test_mpi_skip_under_mpi FAILED                                                                                                                                                             [ 80%]
test_markers.py::test_mpi_xfail PASSED                                                                                                                                                                      [ 90%]
test_markers.py::test_mpi_xfail_under_mpi PASSED                                                                                                                                                            [100%]
=== 4 failed, 6 passed, 7 warnings in 25.35s ===

(a different 4 tests failing)

python3 -m pytest -p pytester --with-mpi -v,

test_fixtures.py::test_mpi_file_name PASSED                                                                                                                                                                 [ 10%]
test_fixtures.py::test_mpi_tmpdir FAILED                                                                                                                                                                    [ 20%]
test_fixtures.py::test_mpi_tmp_path PASSED                                                                                                                                                                  [ 30%]
test_markers.py::test_mpi PASSED                                                                                                                                                                            [ 40%]
test_markers.py::test_mpi_with_mpi PASSED                                                                                                                                                                   [ 50%]
test_markers.py::test_mpi_only_mpi PASSED                                                                                                                                                                   [ 60%]
test_markers.py::test_mpi_skip PASSED                                                                                                                                                                       [ 70%]
test_markers.py::test_mpi_skip_under_mpi PASSED                                                                                                                                                             [ 80%]
test_markers.py::test_mpi_xfail PASSED                                                                                                                                                                      [ 90%]
test_markers.py::test_mpi_xfail_under_mpi FAILED                                                                                                                                                            [100%]
=== 2 failed, 8 passed, 7 warnings in 26.19s ===

(now only 2 tests failing)

I get different results if I run from the parent dir above the tests subdir. Invoking -k test_markers, all test_markers tests passed, but the success is not reliable. Running python3 -m pytest -p pytester --with-mpi -k 'test_markers' a second time from the parent dir, test_mpi_only_mpi fails. I guess the test cache files (.pytest_cache) must be involved in the failure. But removing .pytest_cache is not sufficient. Repeating rm -rf .pytest_cache/; python3 -m pytest -p pytester --with-mpi -k 'test_markers' multiple times from the parent directory, I variously get 4, 3 or 1 error.

The failure is apparently random, which suggests it might be related to file flushing, or synchronicity issues. The tests fail rather than hanging.

A sample full error output is

________________________________________________________________________________________________ test_mpi_only_mpi ________________________________________________________________________________________________

mpi_testdir = <conftest.MPITestdir object at 0x7fb7d45af5b0>, has_mpi4py = True

    def test_mpi_only_mpi(mpi_testdir, has_mpi4py):
        mpi_testdir.makepyfile(MPI_TEST_CODE)

        result = mpi_testdir.runpytest("--only-mpi")

        if has_mpi4py:
>           result.assert_outcomes(**_fix_plural(passed=2, errors=1, skipped=2))

/home/drew/projects/python/build/test/tests/test_markers.py:83: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/_pytest/pytester.py:461: in parseoutcomes
    return self.parse_summary_nouns(self.outlines)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class '_pytest.pytester.RunResult'>
lines = ['============================= test session starts ==============================', 'platform linux -- Python 3.9.2, ...xvfb-1.2.0', 'collecting ... ', 'collected 5 items                                                              ', ...]

    @classmethod
    def parse_summary_nouns(cls, lines) -> Dict[str, int]:
        """Extracts the nouns from a pytest terminal summary line.

        It always returns the plural noun for consistency::

            ======= 1 failed, 1 passed, 1 warning, 1 error in 0.13s ====

        Will return ``{"failed": 1, "passed": 1, "warnings": 1, "errors": 1}``
        """
        for line in reversed(lines):
            if rex_session_duration.search(line):
                outcomes = rex_outcome.findall(line)
                ret = {noun: int(count) for (count, noun) in outcomes}
                break
        else:
>           raise ValueError("Pytest terminal summary report not found")
E           ValueError: Pytest terminal summary report not found

/usr/lib/python3/dist-packages/_pytest/pytester.py:479: ValueError
----------------------------------------------------------------------------------------------- Captured log setup ------------------------------------------------------------------------------------------------
WARNING  conftest:conftest.py:34 To run the MPI tests, you need to use subprocesses
---------------------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------------------
running: mpirun -n 2 /usr/bin/python3 -mpytest --basetemp=/tmp/pytest-of-drew/pytest-28/test_mpi_only_mpi0/runpytest-0 --only-mpi
     in: /tmp/pytest-of-drew/pytest-28/test_mpi_only_mpi0
============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0
rootdir: /tmp/pytest-of-drew/pytest-28/test_mpi_only_mpi0
plugins: mpi-0+untagged.49.g4417f26, cov-2.10.1, doctestplus-0.9.0, remotedata-0.3.2, asyncio-0.14.0, filter-subpackage-0.1.1, arraydiff-0.3, astropy-header-0.1.2, hypothesis-5.43.3, openfiles-0.5.0, xvfb-1.2.0
collecting ... 
collected 5 items                                                              

test_mpi_only_mpi.py 
---------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/usr/lib/python3/dist-packages/_pytest/main.py", line 236, in wrap_session
INTERNALERROR>     config._do_configure()
INTERNALERROR>   File "/usr/lib/python3/dist-packages/_pytest/config/__init__.py", line 911, in _do_configure
INTERNALERROR>     self.hook.pytest_configure.call_historic(kwargs=dict(config=self))
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 308, in call_historic
INTERNALERROR>     res = self._hookexec(self, self.get_hookimpls(), kwargs)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83, in <lambda>
INTERNALERROR>     self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 208, in _multicall
INTERNALERROR>     return outcome.get_result()
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 80, in get_result
INTERNALERROR>     raise ex[1].with_traceback(ex[2])
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pytest_xvfb.py", line 93, in pytest_configure
INTERNALERROR>     config.xvfb.start()
INTERNALERROR>   File "/usr/lib/python3/dist-packages/pytest_xvfb.py", line 55, in start
INTERNALERROR>     raise XvfbExitedError("Xvfb exited with exit code {0}\nXvfb stdout:\n    {1}\nXvfb stderr:\n    {2}".format(
INTERNALERROR> pytest_xvfb.XvfbExitedError: Xvfb exited with exit code 1
INTERNALERROR> Xvfb stdout:
INTERNALERROR>     
INTERNALERROR> Xvfb stderr:
INTERNALERROR>     (EE) 
INTERNALERROR>     Fatal server error:
INTERNALERROR>     (EE) Server is already active for display 1189
INTERNALERROR>      If this server is no longer running, remove /tmp/.X1189-lock
INTERNALERROR>      and start again.
INTERNALERROR>     (EE)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[18078,1],0]
  Exit code:    3
--------------------------------------------------------------------------
_____________________________________________________________________________________________ test_mpi_skip_under_mpi _____________________________________________________________________________________________

mpi_testdir = <conftest.MPITestdir object at 0x7fb7d44a9040>

    def test_mpi_skip_under_mpi(mpi_testdir):
        mpi_testdir.makepyfile(MPI_SKIP_TEST_CODE)

        result = mpi_testdir.runpytest("--with-mpi")

>       result.assert_outcomes(skipped=1)

/home/drew/projects/python/build/test/tests/test_markers.py:101: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/_pytest/pytester.py:461: in parseoutcomes
    return self.parse_summary_nouns(self.outlines)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class '_pytest.pytester.RunResult'>
lines = ['============================= test session starts ==============================', 'platform linux -- Python 3.9.2, ...xvfb-1.2.0', 'collecting ... ', 'collected 1 item                                                               ', ...]

    @classmethod
    def parse_summary_nouns(cls, lines) -> Dict[str, int]:
        """Extracts the nouns from a pytest terminal summary line.

        It always returns the plural noun for consistency::

            ======= 1 failed, 1 passed, 1 warning, 1 error in 0.13s ====

        Will return ``{"failed": 1, "passed": 1, "warnings": 1, "errors": 1}``
        """
        for line in reversed(lines):
            if rex_session_duration.search(line):
                outcomes = rex_outcome.findall(line)
                ret = {noun: int(count) for (count, noun) in outcomes}
                break
        else:
>           raise ValueError("Pytest terminal summary report not found")
E           ValueError: Pytest terminal summary report not found

/usr/lib/python3/dist-packages/_pytest/pytester.py:479: ValueError
----------------------------------------------------------------------------------------------- Captured log setup ------------------------------------------------------------------------------------------------
WARNING  conftest:conftest.py:34 To run the MPI tests, you need to use subprocesses
---------------------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------------------
running: mpirun -n 2 /usr/bin/python3 -mpytest --basetemp=/tmp/pytest-of-drew/pytest-28/test_mpi_skip_under_mpi0/runpytest-0 --with-mpi
     in: /tmp/pytest-of-drew/pytest-28/test_mpi_skip_under_mpi0

About that last message, "To run the MPI tests, you need to use subprocesses": does it mean python3 -m pytest -p pytester or python3 -m pytest -p pytester --with-mpi is no longer expected to work? Again, I reiterate that the tests pass when run via tox.
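
For context on the ValueError in the traceback above: pytester searches the inner run's output for the final summary line, and an inner session that dies with INTERNALERROR never prints one. A minimal sketch of that lookup, assuming regular expressions that only approximate the ones in _pytest/pytester.py (the names SESSION_DURATION, OUTCOME and parse_summary are illustrative, not pytest's own):

```python
import re

# Assumption: these patterns approximate the ones pytester uses internally.
SESSION_DURATION = re.compile(r"\d+\.\d\ds")
OUTCOME = re.compile(r"(\d+) (\w+)")


def parse_summary(lines):
    """Return {noun: count} from the last line that looks like a pytest summary."""
    for line in reversed(lines):
        if SESSION_DURATION.search(line):
            return {noun: int(count) for count, noun in OUTCOME.findall(line)}
    # No summary line at all, e.g. the inner session crashed during configure.
    raise ValueError("Pytest terminal summary report not found")


if __name__ == "__main__":
    finished = ["collected 10 items", "=== 4 failed, 6 passed, 7 warnings in 28.26s ==="]
    crashed = ["collecting ... ", "collected 5 items", "test_mpi_only_mpi.py "]
    print(parse_summary(finished))
    try:
        parse_summary(crashed)
    except ValueError as exc:
        print("crashed run:", exc)
```

Seen this way, the ValueError is a consequence of the inner pytest aborting, not of the marker tests themselves producing wrong outcomes.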

aragilar commented 3 years ago

@drew-parsons This is with pytest in unstable (for autopkgtest)? I'm not sure, it should work, and the invocation looks fine. I'll look into this though. There were no changes in terms of the actual code (i.e. what is needed for h5py and others), it was mainly getting both pytest <6 and pytest >= 6 to work with the tests.

FYI, I'm happy to accept a PR which adds autopkgtest to the things that need to pass to merge.

aragilar commented 3 years ago

Wait, sorry, --with-mpi shouldn't be given to the pytest which sets up the environment (which uses pytester to call pytest again with and without --with-mpi). The tree of processes looks like:

pytest -p pytester
-> mpirun -n 2
---> pytest --with-mpi <generated_test_file.py>

Some of the inner pytest tests should fail or error, as that's testing the behaviour of the options when the environment is not set up correctly when running the tests for a project which uses the plugin (e.g. h5py).
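
The two layers described above can be sketched roughly as follows. The inner command mirrors the "running:" line in the captured output of the original report; the OUTER constant and the inner_command helper are illustrative names, not part of pytest-mpi.

```python
import sys

# Outer run: sets up the environment via pytester. Note there is no --with-mpi here.
OUTER = [sys.executable, "-m", "pytest", "-p", "pytester"]


def inner_command(option, basetemp, nprocs=2, python=sys.executable):
    """Build the command the outer run launches for a generated test file.

    `option` is the flag under test (for example "--with-mpi" or "--only-mpi").
    """
    return [
        "mpirun", "-n", str(nprocs),
        python, "-mpytest",
        f"--basetemp={basetemp}",
        option,
    ]


if __name__ == "__main__":
    print(" ".join(OUTER))
    print(" ".join(inner_command("--with-mpi", "/tmp/example-basetemp")))
```

Passing --with-mpi to the outer command as well changes what the outer tests themselves do, which is not what the test suite is written for.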

I'm also slightly confused by the Xvfb references in the tests, as we don't need it (and it wouldn't surprise me if that plugin was trying to spawn subprocesses, which could really confuse MPI).

drew-parsons commented 3 years ago

The behaviour I'm reporting makes more sense if I shouldn't be providing --with-mpi to pytest anyway. Tests expected to fail would get acknowledged as xfail.

I agree, the Xvfb errors are rather odd. I'm not sure how to interpret them. What is tox doing differently to pytest launched directly on the command line?

Adding autopkgtest to your test cases is a good idea. I'll push a PR if I can figure out how github manages its CI to handle autopkgtest. It would mean you'd have to have your test machine generate and install .deb packages, or at least have a debian/tests directory available to run from. I've switched the debian package over to tox so debci should be less volatile now.

drew-parsons commented 3 years ago

I think it might be constructive to focus attention on the Xvfb error here. I think it's the root cause (or symptom) of the problems. I'm having trouble also running h5py tests --with-mpi, with random successes and failures (running as mpirun -n 4 python3 -c "import h5py; h5py.run_tests('-v --with-mpi -k mpi')")

The problem with h5py exhibits as a hang in h5py's tests/test_file.py::TestMPI::test_mpio. I've noticed that when it hangs, it also emits the Xvfb error, same as we saw above in pytest-mpi itself. When h5py test_mpio does not hang, there is no Xvfb error.

Running the h5py test repeatedly, it seems to suffer the Xvfb error about 50% of the time. My guess is that the randomness might be in the order in which the pytest modules are loaded, whether pytest loads pytest-xvfb or pytest-mpi first.

I can work around the problem by running tests with --no-xvfb. But is there something that can be done in pytest-mpi to make it more robust when pytest-xvfb is present? (or equivalently, can a patch be identified for pytest-xvfb?)
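
Besides --no-xvfb, pytest's generic -p no:NAME option stops a plugin from being loaded at all. A minimal sketch of building such an argument list, assuming the plugin is registered under the name xvfb (inferred from the plugins line in the output above, not checked against pytest-xvfb itself); block_plugins is a hypothetical helper:

```python
def block_plugins(pytest_args, names=("xvfb",)):
    """Prefix pytest arguments with '-p no:NAME' for each plugin to block."""
    blocked = []
    for name in names:
        blocked += ["-p", f"no:{name}"]
    return blocked + list(pytest_args)


if __name__ == "__main__":
    args = block_plugins(["-v", "--with-mpi", "-k", "mpi"])
    # The joined string form could be appended to an "mpirun -n 4 python3 -m pytest" call.
    print(" ".join(args))
```

Unlike --no-xvfb, which is an option provided by the plugin, -p no:NAME is handled by pytest itself, so it does not depend on the plugin behaving well at startup.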

drew-parsons commented 3 years ago

Incidentally, all the pytest plugins seem to get loaded by pytest simply by being present, regardless of whether the specific test at hand uses them. The plugins tag for h5py is

plugins: cov-2.10.1, mpi-0+unknown, doctestplus-0.9.0, remotedata-0.3.2, asyncio-0.14.0, filter-subpackage-0.1.1, arraydiff-0.3, astropy-header-0.1.2, hypothesis-5.43.3, openfiles-0.5.0, xvfb-1.2.0

more or less the same as for pytest-mpi itself. If that's the case then you might be able to reproduce the error (50% of the time) simply by installing pytest-xvfb, if it's not already installed.

Indeed, I can reproduce the Xvfb error reliably just by running mpirun -n 4 pytest-3 in an empty directory (no tests). All plugins are still listed by pytest, including both mpi and xvfb, even without adding --with-mpi.
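
One way to see which plugins pytest would pick up automatically is to list the installed entry points in the pytest11 group, which is the group pytest scans for third-party plugins. A minimal sketch using only the standard library (the function name is illustrative):

```python
from importlib.metadata import entry_points


def pytest_plugin_entry_points():
    """Return sorted (name, module) pairs for installed pytest11 entry points."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later
        selected = eps.select(group="pytest11")
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group
        selected = eps.get("pytest11", [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    for name, module in pytest_plugin_entry_points():
        print(f"{name}: {module}")
```

Setting the environment variable PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 turns this autoloading off, after which any plugin that is wanted has to be requested explicitly with -p.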

ArchangeGabriel commented 3 years ago

@drew-parsons Did you try running with --runpytest=subprocess? I’m running with pytest tests -p pytester --runpytest=subprocess and it works when using an installed pytest-mpi.

However I don’t understand how in-place testing (i.e. without installing to the system) is supposed to work. It fails with __main__.py: error: unrecognized arguments: --with-mpi, which is somewhat expected if pytest does not know about the pytest-mpi under test. Any idea?

drew-parsons commented 3 years ago

No, it's still giving the same Xvfb error if I run with --runpytest=subprocess (on a Debian system, pytest-3 tests -p pytester --runpytest=subprocess)

ArchangeGabriel commented 3 years ago

Ah sorry, I misunderstood your issue. So it’s not really pytest-mpi’s fault in your case anyway… Curious though, why is there xvfb in your build environment since it’s not required (both here and for h5py)?

drew-parsons commented 3 years ago

My system has python3-pytest-xvfb. The pytest environment seems to be loading all modules that happen to be available on the system.

ArchangeGabriel commented 3 years ago

The last part should be a pytest issue instead I think. Regarding the first part, I would advise building in chroot/container, it avoids “contamination” by things available on the system. ;)

drew-parsons commented 3 years ago

Well, sure, the official build is in a chroot :)

aragilar / pytest-mpi

tests fail unreliably with pytest direct #31