Closed dschwoerer closed 4 years ago
I guess you did, but can you confirm that the cmake "summary" says what is expected (i.e., cmake finds the mpich
libs you would have used in a hand-made Makefile)?
It should look like this, but with mpich (if you have an environment problem [bashrc, several MPI installs, ...], you may not pick up what you expect):
>> cmake .
...
-- MPICC:
-- compile: /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi
-- compile: /usr/lib/x86_64-linux-gnu/openmpi/include
-- link: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- MPICXX:
-- compile: /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi
-- compile: /usr/lib/x86_64-linux-gnu/openmpi/include
-- link: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so
-- link: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
with mpich:
-- MPIFC:
-- compile: /usr/include/mpich-x86_64
-- compile: /usr/lib64/gfortran/modules/mpich
-- link: /usr/lib64/mpich/lib/libmpifort.so
-- link: /usr/lib64/mpich/lib/libmpi.so
-- MPICC:
-- compile: /usr/include/mpich-x86_64
-- link: /usr/lib64/mpich/lib/libmpi.so
-- MPICXX:
-- compile: /usr/include/mpich-x86_64
-- link: /usr/lib64/mpich/lib/libmpicxx.so
-- link: /usr/lib64/mpich/lib/libmpi.so
and with openmpi:
$ module switch mpi/mpich-x86_64 mpi/openmpi-x86_64
$ cmake -DEXAMPLES=ON -DMPI=ON -DICB=ON .. && make -j 4 && make test
...
-- MPIFC:
-- compile: /usr/include/mpich-x86_64
-- compile: /usr/lib64/gfortran/modules/mpich
-- link: /usr/lib64/mpich/lib/libmpifort.so
-- link: /usr/lib64/mpich/lib/libmpi.so
-- MPICC:
-- compile: /usr/include/mpich-x86_64
-- link: /usr/lib64/mpich/lib/libmpi.so
-- MPICXX:
-- compile: /usr/include/mpich-x86_64
-- link: /usr/lib64/mpich/lib/libmpicxx.so
-- link: /usr/lib64/mpich/lib/libmpi.so
(all tests pass)
$ rm -rf .* *
$ cmake -DEXAMPLES=ON -DMPI=ON -DICB=ON .. && make -j 4 && make test
-- MPIFC:
-- compile: /usr/include/openmpi-x86_64
-- compile: /usr/lib64/openmpi/lib
-- link: /usr/lib64/openmpi/lib/libmpi_usempif08.so
-- link: /usr/lib64/openmpi/lib/libmpi_usempi_ignore_tkr.so
-- link: /usr/lib64/openmpi/lib/libmpi_mpifh.so
-- link: /usr/lib64/openmpi/lib/libmpi.so
-- MPICC:
-- compile: /usr/include/openmpi-x86_64
-- link: /usr/lib64/openmpi/lib/libmpi.so
-- MPICXX:
-- compile: /usr/include/openmpi-x86_64
-- link: /usr/lib64/openmpi/lib/libmpi_cxx.so
-- link: /usr/lib64/openmpi/lib/libmpi.so
(all tests pass)
So the summary printed by cmake is wrong :-)
However, with openmpi the tests pass, and with mpich they fail.
Running rm -rf .* * only changes the summary printed by cmake, not the results.
This is a default Fedora setup, and I have never had any issues with MPI on it ...
Just to make sure: if you rm CMakeCache.txt after module switch mpi/mpich, does the mpich job succeed? CMake keeps track of what it found previously in the cache, and this may break the build (module switch without removing the cache). If you always build from scratch, this cannot be your problem.
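The stale-cache scenario described here can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a build directory that contains a CMakeCache.txt. The helper name show_mpi_cache is hypothetical, and the MPI_ prefix is the one CMake's FindMPI module uses for its cache entries; if the project detects MPI differently, the entries may be named otherwise.

```shell
# Hypothetical helper: list the MPI-related entries stored in a CMake cache,
# so a leftover mpich path is visible after switching to openmpi (or back).
show_mpi_cache() {
    cache="${1:-CMakeCache.txt}"
    if [ ! -f "$cache" ]; then
        echo "no cache file at $cache" >&2
        return 1
    fi
    # Cache entries have the form NAME:TYPE=VALUE; keep those starting with MPI_.
    grep -E '^MPI_[A-Za-z0-9_]*:' "$cache"
}

# Typical use, from the build directory, after switching modules:
#   show_mpi_cache            # inspect what was cached
#   rm -f CMakeCache.txt      # drop the cache before reconfiguring
#   cmake -DEXAMPLES=ON -DMPI=ON -DICB=ON ..
```

If the listed paths still point at the previous MPI installation, the configure step reused cached results rather than the currently loaded module.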
Sorry, I should have been clearer.
I tried rm -rf .* * (which should delete any cache), and the only thing that changes is the summary printed, not the result of the tests. They are the same: mpich fails and openmpi passes.
Sorry, I should have looked at the log, and not just attached part of it -.-
It was rather trivial to fix (5c2a80e).
Maybe we should suggest using grep Fail -B 100 rather than tail -n 300?
They are the same: mpich fails and openmpi passes.
OK, so mpich fails. If you can fix that, it would be good to also PR an mpich job in CI with the fix.
Maybe we should suggest using grep Fail -B 100 rather than tail -n 300?
Pros: smaller logs. Cons: when problems show up on CI, having some context is really helpful.
If you "only" grep Fail -B 100, you could miss what grep [Ee]rror -B 100 would catch, for instance. Maybe grep the last 50 lines matching [Ff]ail, also grep the last 50 lines matching [Ee]rror, and keep tail but with only the last 100 or 50 lines? If this is done, it would be good to do it for all jobs in .travis.yml.
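The combined filtering suggested here could look roughly like the sketch below. It assumes a POSIX shell and a log already written to a file; the helper name summarize_log is hypothetical, and "last 50 lines of [Ff]ail" is read as "the last 50 matching lines", which is only one possible interpretation.

```shell
# Hypothetical helper: print a compact digest of a build/test log.
summarize_log() {
    log="$1"
    echo "== last 50 lines matching [Ff]ail =="
    # grep -n prefixes each match with its line number in the log.
    grep -n '[Ff]ail' "$log" | tail -n 50 || true
    echo "== last 50 lines matching [Ee]rror =="
    grep -n '[Ee]rror' "$log" | tail -n 50 || true
    echo "== last 100 lines of the log =="
    tail -n 100 "$log"
}

# Typical use in a CI job (the log file name is an assumption):
#   make test > test.log 2>&1 || summarize_log test.log
```

The line numbers from grep -n make it possible to find the surrounding context in the full log when the digest alone is not enough.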
Expected behavior
All tests pass
Actual behavior
90/90 Test #90: icb_parpack_cpp_tst .............. Passed 0.16 sec
91% tests passed, 8 tests failed out of 90
Total Test time (real) = 1.54 sec
The following tests FAILED:
72 - pcndrv1_ex (Failed)
73 - pdndrv1_ex (Failed)
74 - pdndrv3_ex (Failed)
75 - pdsdrv1_ex (Failed)
76 - psndrv1_ex (Failed)
77 - psndrv3_ex (Failed)
78 - pssdrv1_ex (Failed)
79 - pzndrv1_ex (Failed)
Errors while running CTest
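To work with a summary like this one, the failed test names can be pulled out of a saved CTest log. This is a minimal sketch, assuming a POSIX shell with a sed that supports -E and a log in which each failed test sits on its own line, as CTest prints it; the helper name failed_tests is hypothetical.

```shell
# Hypothetical helper: print the names of tests that CTest reported as "(Failed)".
failed_tests() {
    # Match lines such as "  72 - pcndrv1_ex (Failed)" and keep only the name.
    sed -n -E 's/^[[:space:]]*[0-9]+ - ([^[:space:]]+) \(Failed\).*$/\1/p' "$1"
}

# Typical use (the log file name is an assumption):
#   make test > test.log 2>&1 || failed_tests test.log
```

From the build directory, ctest --rerun-failed --output-on-failure reruns only the tests that failed in the previous run and prints their output, which is usually the quickest way to see why they fail.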
Where/how to reproduce the problem
Steps to reproduce the problem
Error message
see above
Traces
Callstack
n.a.
Notes, remarks
Switching to openmpi resolves the issue.