Closed pancetta closed 3 years ago
Aha, so this is not a conda thing nor a general MPICH/OpenMPI one, but rather a macOS one? Then this is probably a not-so-relevant edge case, as most users do not run stuff on macOS, I'd say.
Not sure. The OpenMPI community seemed quite interested in getting the issues with OpenMPI hanging on macOS fixed. So there might still be hope.
The issue should be fixed on branch: https://github.com/NOhs/pytest-MPI/tree/fix_subprocess_mpi_hang
Can you verify?
Holy guacamole! You made it! It works with MPICH now, both for `assert True` and for `assert False`. Thanks!
With OpenMPI I get an UnboundLocalError:
======================================================================================================================= test session starts ========================================================================================================================
platform darwin -- Python 3.8.5, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/robert/Documents/codes/performance
plugins: MPI-0.0.1.dev6
collected 2 items
test_mpi.py .F [100%]
============================================================================================================================= FAILURES =============================================================================================================================
__________________________________________________________________________________________________________________________ test_parallel ___________________________________________________________________________________________________________________________
args = (), kwargs = {}, executable = 'mpirun', test_name = 'test_mpi.py::test_parallel', failed = True, errors = [], i = 1, file_name = 'test_mpi.py..test_parallel_1'
    @functools.wraps(func)
    def replacement_func(*args, **kwargs):
        # __tracebackhide__ = True
        if not in_mpi_session():
            executable = mpi_executable(mpi_executable_name)
            test_name = get_pytest_input(func)
            failed = False
            try:
                mpi_subprocess_output = subprocess.check_output(
                    [
                        executable,
                        "-np",
                        str(nprocs),
                        sys.executable,
                        "-m",
                        "mpi4py",
                        "-m",
                        "pytest_MPI._print_capture",
                        test_name,
                    ],
                    stderr=subprocess.STDOUT,
                )
            except subprocess.CalledProcessError as error:
                failed = True
                # print(error.output.decode('utf-8'))
            errors = []
            for i in range(nprocs):
                file_name = f"{get_filename(test_name)}_{i}"
                if os.path.isfile(file_name):
                    with open(file_name) as f:
                        rank_output = f.read()
                    os.remove(file_name)
                    if not contains_failure(rank_output):
                        continue
                    errors.append((i, rank_output))
            for rank, message in errors:
                header_1 = f"Rank {rank}"
                header_2 = f" reported an error:"
                header = f"{Style.BRIGHT}{Fore.RED}{header_1}{Style.RESET_ALL}{header_2}"
                print("\n" + header)
                print("- " * (len(header_1 + header_2)//2 + 1))
                print(get_traceback(message))
            if failed:
>               pytest.fail(get_summary(message))
E               UnboundLocalError: local variable 'message' referenced before assignment
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:134: UnboundLocalError
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - UnboundLocalError: local variable 'message' referenced before assignment
=================================================================================================================== 1 failed, 1 passed in 0.08s ====================================================================================================================
Hm. Did you add the environment variable again, the `export PMIX_MCA_gds=hash` stuff?
No, but now I did, no change.
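For completeness, the workaround being referred to sets a PMIx variable that reportedly avoids OpenMPI shared-memory issues on macOS. It can be exported in the shell or added to the environment of the spawned `mpirun` process from Python (a generic sketch, not code from the plugin; the `mpirun` call in the comment is illustrative):

```python
import os
import subprocess
import sys

# copy the current environment and add the PMIx workaround variable
env = os.environ.copy()
env["PMIX_MCA_gds"] = "hash"  # reported workaround for OpenMPI hangs on macOS

# the augmented environment would be passed to the spawned process, e.g.:
# subprocess.check_output(["mpirun", "-np", "2", ...], env=env)
# here we just show that a child process really sees the variable:
out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['PMIX_MCA_gds'])"],
    env=env,
)
print(out.decode().strip())
```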
I added better output for that case. Can you see what it prints?
I don't see where you initialize `message`. It looks like it is only defined in the loop here.
Well, if `message` is not defined, then `failed` should not be true. Since that might be wishful thinking, I adapted the branch a bit.
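For reference, the UnboundLocalError is the classic pattern where a for-loop target is only bound if the loop body runs at least once (a minimal reproduction, not the plugin's actual code):

```python
def report(errors, failed):
    # 'message' is only bound if this loop body executes at least once
    for rank, message in errors:
        print(f"Rank {rank}: {message}")
    if failed:
        # UnboundLocalError when 'errors' was empty but 'failed' is True
        return message
    return None

try:
    report([], failed=True)
except UnboundLocalError:
    print("message was never bound")
```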
Better, but now:
FAILED test_mpi.py::test_parallel - UnboundLocalError: local variable 'error' referenced before assignment
And now?
Well..
        if failed:
            if errors:
                pytest.fail(get_summary(message))
            else:
>               pytest.fail(alternative_output)
E               TypeError: Failed expected string as 'msg' parameter, got 'bytes' instead.
E               Perhaps you meant to use a mark?
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:137: TypeError
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - TypeError: Failed expected string as 'msg' parameter, got 'bytes' instead.
=================================================================================================================== 1 failed, 1 passed in 0.09s ====================================================================================================================
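The TypeError makes sense: `subprocess.check_output` returns `bytes`, while `pytest.fail` only accepts a `str`. A sketch of the kind of fix this implies (`alternative_output` is the name from the traceback; decoding before failing is my assumption about the actual change on the branch):

```python
import subprocess
import sys

# check_output returns bytes unless text=True is passed
raw = subprocess.check_output(
    [sys.executable, "-c", "print('captured MPI output')"],
    stderr=subprocess.STDOUT,
)
print(type(raw).__name__)  # bytes

# decode before handing the output to pytest.fail, which wants a str
alternative_output = raw.decode("utf-8", errors="replace")
print(type(alternative_output).__name__)  # str
```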
Urgh, this is not the most effective way of debugging :D, sec
I feel like SLURM.. but that's ok, it's you helping me.
It should print something more useful now.
Meh..
        if failed:
            if errors:
                pytest.fail(get_summary(message))
            else:
>               pytest.fail(alternative_output)
E               Failed: <Failed instance>
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:138: Failed
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - Failed: <Failed instance>
=================================================================================================================== 1 failed, 1 passed in 0.08s ====================================================================================================================
I see the same issue on Linux with OpenMPI. I suggest moving forward with MPICH for the time being until I figure out the issue with OpenMPI.
OK, sounds good. Thank you for your extensive help.
The latest push to the master branch together with the change to the given example (note the import moved into the MPI function) should fix the issues mentioned here.
FYI I had to change the package name to avoid a name conflict with an existing package.
Hmm.. I get this error now, both with OpenMPI and MPICH:
Traceback (most recent call last):
  File "/Users/robert/miniconda3/envs/performance_openmpi/bin/pytest", line 8, in <module>
    sys.exit(console_main())
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 187, in console_main
    code = main()
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 143, in main
    config = _prepareconfig(args, plugins)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 318, in _prepareconfig
    config = pluginmanager.hook.pytest_cmdline_parse(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/hooks.py", line 286, in __call__
    return self._hookexec(self, self.get_hookimpls(), kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 84, in <lambda>
    self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 203, in _multicall
    gen.send(outcome)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/helpconfig.py", line 100, in pytest_cmdline_parse
    config = outcome.get_result()  # type: Config
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1003, in pytest_cmdline_parse
    self.parse(args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1280, in parse
    self._preparse(args, addopts=addopts)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1172, in _preparse
    self.pluginmanager.load_setuptools_entrypoints("pytest11")
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 300, in load_setuptools_entrypoints
    self.register(plugin, name=ep.name)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 436, in register
    ret = super().register(plugin, name)  # type: Optional[str]
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 127, in register
    hook._maybe_apply_history(hookimpl)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/hooks.py", line 333, in _maybe_apply_history
    res = self._hookexec(self, [method], kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 84, in <lambda>
    self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 208, in _multicall
    return outcome.get_result()
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_plugin.py", line 43, in pytest_addoption
    parser.addoption(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/argparsing.py", line 96, in addoption
    self._anonymous.addoption(*opts, **attrs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/argparsing.py", line 352, in addoption
    raise ValueError("option names %s already added" % conflict)
ValueError: option names {'--in_mpi_session'} already added
Is that because I screwed up my installation? Is it yet another bug?
I copied the example directly from your README.md.
So I have it testing on OSX with OpenMPI and on Linux with OpenMPI and MPICH: https://travis-ci.org/github/NOhs/pytest-easyMPI
The `.travis.yaml` file in the root directory of this project tells you how I set up the venv and the package. Can you check if you do the same?
Yeah, my installation was broken because of the name change. Now it's working for both MPI versions. Thanks!
Glad it works! If you have any feature requests, just open a new issue.
Great approach, exactly what I would need. However, after installing it I cannot get it to work. Even with a
I get `subprocess.CalledProcessError`, saying that the command returned error code 15. When I run the command by hand on the command line, all works well. Could it be an environment thing? I have no idea how to debug or even fix that.
I'm running on macOS with Python 3.8 from miniconda. Any help is appreciated!
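A generic way to see what the spawned command actually printed before dying with code 15 is to catch the exception and inspect its attributes (a debugging sketch, not pytest-easyMPI code; the child command is a stand-in for the real `mpirun` invocation):

```python
import subprocess
import sys

try:
    # stand-in for the mpirun call; exits with the same code as in the report
    subprocess.check_output(
        [sys.executable, "-c", "print('child output'); raise SystemExit(15)"],
        stderr=subprocess.STDOUT,
    )
except subprocess.CalledProcessError as error:
    print("return code:", error.returncode)  # -> 15
    print(error.output.decode("utf-8"))      # whatever the child printed
```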