JohanSchott / impurityModel

Calculate many-body states of an impurity Anderson model and spectra (e.g. XPS, XAS, RIXS, NIXS)
MIT License

sanity check all pickled non-interacting Hamiltonians in h0 directory #52

Closed JohanSchott closed 4 months ago

JohanSchott commented 4 months ago

Thanks for your review @kalvdans !

I struggled to understand why the unit tests failed in this PR. At first I thought I could not reproduce the error locally, so I'm sorry about the many git pushes. But I can reproduce the problem locally, and I have figured out what the problem was.

MPI can't be used in both a parent process and a child process. In master no unit test imports MPI, except in a subprocess in impurityModel/test/test_comparison_with_reference.py. But this PR adds impurityModel/test/test_h0.py, which imports MPI via the line from impurityModel.ed.get_spectra import read_pickled_file. So when running pytest in this PR, both the parent process and a child process use MPI, which is not OK.

In the last commits I solved this issue by ignoring impurityModel/test/test_comparison_with_reference.py in pytest.ini and running pytest twice:
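For concreteness, the ignore could look roughly like this in pytest.ini (the exact contents of the repository's pytest.ini are an assumption here, not the actual diff):

[pytest]
# Exclude the MPI-using test from the default pytest run;
# it is then executed in a second, separate pytest invocation.
addopts = --ignore=impurityModel/test/test_comparison_with_reference.py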

Small reproducer of the problem:

python -c "from mpi4py import MPI; import subprocess; output = subprocess.run(args=['mpiexec -n 2 echo hello world'], shell=True, check=False, capture_output=True); print(output)"

But without from mpi4py import MPI it works fine, i.e.:

python -c "import subprocess; output = subprocess.run(args=['mpiexec -n 2 echo hello world'], shell=True, check=False, capture_output=True); print(output)"

kalvdans commented 4 months ago

It seems MPI is automatically initialized when you import mpi4py.MPI (imports should never have side effects, in my opinion).

So just refrain from importing MPI until runtime, when you want to start parallelizing stuff. That also means you can't read the constant mpi4py.MPI.COMM_WORLD beforehand :(. And don't test MPI stuff using pytest, since you can't restart MPI once you have stopped it...

$ python -c "from mpi4py import MPI; MPI.Finalize(); MPI.Init()"
*** The MPI_Init() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[brahe:467487] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
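As for deferring the import, a minimal sketch of the idea (the function name run_parallel_part is hypothetical, not from the repository):

def run_parallel_part(data):
    # Import (and thereby initialize) MPI only when the parallel work actually starts,
    # so that merely importing this module in a pytest process stays MPI-free.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    # ... distribute `data` over comm.size ranks and do the work here ...
    return comm.rank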
JohanSchott commented 4 months ago

It would be good to test some MPI stuff, but I'm not sure how to do that.

I guess one way to test MPI stuff is something like:

import subprocess

def test_MPI_stuff_of_function_XXX():
    tot = []
    for ranks in [1, 2, 4, 8]:
        # Run function XXX under MPI in a separate process; the script saves its output to a file.
        cmd = f"mpiexec -n {ranks} python a_script_that_calls_function_XXX_and_save_its_output.py output_{ranks}"
        subprocess.run(cmd, shell=True, check=True)
        tot.append(read_file(f"output_{ranks}"))
    compare_data(tot)  # all rank counts should give the same result

where read_file reads e.g. a pickle or HDF5 file. But it's not so elegant. And one Python script for each unit test...
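A minimal sketch of what such a script could look like, assuming the result is pickled by rank 0 (function_XXX and its body are placeholders, not code from the repository):

import pickle
import sys

from mpi4py import MPI


def function_XXX(comm):
    # Placeholder for the real MPI-aware function under test: each rank sums its
    # share of 0..99 and the partial sums are combined, so the result does not
    # depend on the number of ranks.
    local = sum(range(comm.rank, 100, comm.size))
    return comm.allreduce(local)


def main():
    comm = MPI.COMM_WORLD
    result = function_XXX(comm)
    if comm.rank == 0:
        # Only rank 0 writes, so the test reads a single file per rank count.
        with open(sys.argv[1], "wb") as fh:
            pickle.dump(result, fh)


if __name__ == "__main__":
    main()

read_file in the test would then just be a pickle.load of the written file.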

kalvdans commented 4 months ago

I solve this issue by [...] running pytest twice

Seems it is the same solution pytest-mpi has chosen; it has two decorators to mark which tests should be run in which invocation.
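From memory, the markers look roughly like this (the exact marker names and the --with-mpi flag should be checked against the pytest-mpi documentation):

import pytest


@pytest.mark.mpi  # roughly: selected only in the invocation started under mpiexec with --with-mpi
def test_parallel_part():
    from mpi4py import MPI

    assert MPI.COMM_WORLD.size >= 1


@pytest.mark.mpi_skip  # roughly: skipped in the MPI invocation, runs in the plain pytest one
def test_serial_part():
    assert 1 + 1 == 2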