Closed pancetta closed 3 years ago
Aha, so this is not a conda thing nor a general MPICH/OpenMPI one, but rather a macOS one? Then this is probably a not-so-relevant edge case, as most users do not run stuff on macOS, I'd say.
Not sure. The OpenMPI community seemed quite interested in getting the issues with OpenMPI hanging on macOS fixed. So there might still be hope.
The issue should be fixed on branch: https://github.com/NOhs/pytest-MPI/tree/fix_subprocess_mpi_hang
Can you verify?
Holy guacamole! You made it! It works with MPICH now, both for `assert True` and for `assert False`. Thanks!
With OpenMPI I get an UnboundLocalError:
======================================================================================================================= test session starts ========================================================================================================================
platform darwin -- Python 3.8.5, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/robert/Documents/codes/performance
plugins: MPI-0.0.1.dev6
collected 2 items
test_mpi.py .F [100%]
============================================================================================================================= FAILURES =============================================================================================================================
__________________________________________________________________________________________________________________________ test_parallel ___________________________________________________________________________________________________________________________
args = (), kwargs = {}, executable = 'mpirun', test_name = 'test_mpi.py::test_parallel', failed = True, errors = [], i = 1, file_name = 'test_mpi.py..test_parallel_1'
    @functools.wraps(func)
    def replacement_func(*args, **kwargs):
        # __tracebackhide__ = True
        if not in_mpi_session():
            executable = mpi_executable(mpi_executable_name)
            test_name = get_pytest_input(func)
            failed = False
            try:
                mpi_subprocess_output = subprocess.check_output(
                    [
                        executable,
                        "-np",
                        str(nprocs),
                        sys.executable,
                        "-m",
                        "mpi4py",
                        "-m",
                        "pytest_MPI._print_capture",
                        test_name,
                    ],
                    stderr=subprocess.STDOUT,
                )
            except subprocess.CalledProcessError as error:
                failed = True
                # print(error.output.decode('utf-8'))
            errors = []
            for i in range(nprocs):
                file_name = f"{get_filename(test_name)}_{i}"
                if os.path.isfile(file_name):
                    with open(file_name) as f:
                        rank_output = f.read()
                    os.remove(file_name)
                    if not contains_failure(rank_output):
                        continue
                    errors.append((i, rank_output))
            for rank, message in errors:
                header_1 = f"Rank {rank}"
                header_2 = f" reported an error:"
                header = f"{Style.BRIGHT}{Fore.RED}{header_1}{Style.RESET_ALL}{header_2}"
                print("\n" + header)
                print("- " * (len(header_1 + header_2)//2 + 1))
                print(get_traceback(message))
            if failed:
>               pytest.fail(get_summary(message))
E               UnboundLocalError: local variable 'message' referenced before assignment
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:134: UnboundLocalError
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - UnboundLocalError: local variable 'message' referenced before assignment
=================================================================================================================== 1 failed, 1 passed in 0.08s ====================================================================================================================
Hm. Did you add the environment variable again, the `export PMIX_MCA_gds=hash` stuff?
No, but now I did, no change.
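For completeness, the workaround being referred to sets a PMIx variable that reportedly avoids OpenMPI shared-memory issues on macOS. It can be exported in the shell or added to the environment of the spawned `mpirun` process from Python (a generic sketch, not code from the plugin; the `mpirun` call in the comment is illustrative):

```python
import os
import subprocess
import sys

# copy the current environment and add the PMIx workaround variable
env = os.environ.copy()
env["PMIX_MCA_gds"] = "hash"  # reported workaround for OpenMPI hangs on macOS

# the augmented environment would be passed to the spawned process, e.g.:
# subprocess.check_output(["mpirun", "-np", "2", ...], env=env)
# here we just show that a child process really sees the variable:
out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['PMIX_MCA_gds'])"],
    env=env,
)
print(out.decode().strip())
```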
I added better output for that case. Can you see what it prints?
I don't see where you initialize `message`. It looks like it is only defined in the loop here.
Well, if `message` is not defined, then `failed` should not be true. Since that might be wishful thinking, I adapted the branch a bit.
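For reference, the UnboundLocalError is the classic pattern where a for-loop target is only bound if the loop body runs at least once (a minimal reproduction, not the plugin's actual code):

```python
def report(errors, failed):
    # 'message' is only bound if this loop body executes at least once
    for rank, message in errors:
        print(f"Rank {rank}: {message}")
    if failed:
        # UnboundLocalError when 'errors' was empty but 'failed' is True
        return message
    return None

try:
    report([], failed=True)
except UnboundLocalError:
    print("message was never bound")
```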
Better, but now:
FAILED test_mpi.py::test_parallel - UnboundLocalError: local variable 'error' referenced before assignment
And now?
Well..
        if failed:
            if errors:
                pytest.fail(get_summary(message))
            else:
>               pytest.fail(alternative_output)
E               TypeError: Failed expected string as 'msg' parameter, got 'bytes' instead.
E               Perhaps you meant to use a mark?
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:137: TypeError
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - TypeError: Failed expected string as 'msg' parameter, got 'bytes' instead.
=================================================================================================================== 1 failed, 1 passed in 0.09s ====================================================================================================================
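The TypeError makes sense: `subprocess.check_output` returns `bytes`, while `pytest.fail` only accepts a `str`. A sketch of the kind of fix this implies (`alternative_output` is the name from the traceback; decoding before failing is my assumption about the actual change on the branch):

```python
import subprocess
import sys

# check_output returns bytes unless text=True is passed
raw = subprocess.check_output(
    [sys.executable, "-c", "print('captured MPI output')"],
    stderr=subprocess.STDOUT,
)
print(type(raw).__name__)  # bytes

# decode before handing the output to pytest.fail, which wants a str
alternative_output = raw.decode("utf-8", errors="replace")
print(type(alternative_output).__name__)  # str
```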
Urgh, this is not the most effective way of debugging :D, sec
I feel like SLURM.. but that's ok, it's you helping me.
It should print something more useful now.
Meh..
        if failed:
            if errors:
                pytest.fail(get_summary(message))
            else:
>               pytest.fail(alternative_output)
E               Failed: <Failed instance>
../../../miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_decorator.py:138: Failed
===================================================================================================================== short test summary info ======================================================================================================================
FAILED test_mpi.py::test_parallel - Failed: <Failed instance>
=================================================================================================================== 1 failed, 1 passed in 0.08s ====================================================================================================================
I see the same issue on Linux with OpenMPI. I suggest moving forward with MPICH for the time being until I figure out the issue with OpenMPI.
OK, sounds good. Thank you for your extensive help.
The latest push to the master branch together with the change to the given example (note the import moved into the MPI function) should fix the issues mentioned here.
FYI I had to change the package name to avoid a name conflict with an existing package.
Hmm.. I get this error now, both with OpenMPI and MPICH:
Traceback (most recent call last):
  File "/Users/robert/miniconda3/envs/performance_openmpi/bin/pytest", line 8, in <module>
    sys.exit(console_main())
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 187, in console_main
    code = main()
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 143, in main
    config = _prepareconfig(args, plugins)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 318, in _prepareconfig
    config = pluginmanager.hook.pytest_cmdline_parse(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/hooks.py", line 286, in __call__
    return self._hookexec(self, self.get_hookimpls(), kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 84, in <lambda>
    self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 203, in _multicall
    gen.send(outcome)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/helpconfig.py", line 100, in pytest_cmdline_parse
    config = outcome.get_result()  # type: Config
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1003, in pytest_cmdline_parse
    self.parse(args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1280, in parse
    self._preparse(args, addopts=addopts)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1172, in _preparse
    self.pluginmanager.load_setuptools_entrypoints("pytest11")
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 300, in load_setuptools_entrypoints
    self.register(plugin, name=ep.name)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/__init__.py", line 436, in register
    ret = super().register(plugin, name)  # type: Optional[str]
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 127, in register
    hook._maybe_apply_history(hookimpl)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/hooks.py", line 333, in _maybe_apply_history
    res = self._hookexec(self, [method], kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/manager.py", line 84, in <lambda>
    self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 208, in _multicall
    return outcome.get_result()
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/pytest_MPI/_plugin.py", line 43, in pytest_addoption
    parser.addoption(
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/argparsing.py", line 96, in addoption
    self._anonymous.addoption(*opts, **attrs)
  File "/Users/robert/miniconda3/envs/performance_openmpi/lib/python3.8/site-packages/_pytest/config/argparsing.py", line 352, in addoption
    raise ValueError("option names %s already added" % conflict)
ValueError: option names {'--in_mpi_session'} already added
Is that because I screwed up my installation? Is it yet another bug?
I copied the example directly from your README.md.
So I have it testing on OSX with OpenMPI and on Linux with OpenMPI and MPICH: https://travis-ci.org/github/NOhs/pytest-easyMPI
The `.travis.yaml` file in the root directory of this project tells you how I set up the venv and the package. Can you check if you do the same?
Yeah, my installation was broken because of the name change. Now it's working for both MPI versions. Thanks!
Glad it works! If you have any feature requests, just open a new issue.
Great approach, exactly what I would need. However, after installing it I cannot get it to work. Even with a
I get `subprocess.CalledProcessError`, saying that the command returned error code 15. When I run the command by hand on the command line, all works well. Could it be an environment thing? I have no idea how to debug or even fix that.
I'm running on macOS with Python 3.8 from miniconda. Any help is appreciated!
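A generic way to see what the spawned command actually printed before dying with code 15 is to catch the exception and inspect its attributes (a debugging sketch, not pytest-easyMPI code; the child command is a stand-in for the real `mpirun` invocation):

```python
import subprocess
import sys

try:
    # stand-in for the mpirun call; exits with the same code as in the report
    subprocess.check_output(
        [sys.executable, "-c", "print('child output'); raise SystemExit(15)"],
        stderr=subprocess.STDOUT,
    )
except subprocess.CalledProcessError as error:
    print("return code:", error.returncode)  # -> 15
    print(error.output.decode("utf-8"))      # whatever the child printed
```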