Closed yurivict closed 1 year ago
how did you run the test? did you run the test with 2 processes, as required (or 3, if using MPI-PR)?
Tests are run with the make check command.
what was MPIEXEC set to when you ran make check? can you provide the total output, including the log files?
MPIEXEC=/usr/local/bin/mpiexec, which is mpich-3.4.3
Output:
gmake check-TESTS
gmake[5]: Entering directory '/usr/ports/devel/ga/work/ga-5.8.2/comex'
gmake[6]: Entering directory '/usr/ports/devel/ga/work/ga-5.8.2/comex'
FAIL: testing/perf
Assertion failed: (nproc == 2), function main, file testing/perf.c, line 42.
gmake[6]: *** [Makefile:1775: testing/perf.log] Abort trap
What log files can I provide?
./comex/testing/perf.log
./comex/config.log
./armci/config.log
./config.log
I cannot reproduce on macOS with ../configure MPICC=mpicc MPICXX=mpicxx MPIF77=mpifort --with-mpi-pr && make -j8 && make -j8 checkprogs && make check using MPICH 4.1.2. I know BSD is different, but not in ways that should matter here.
comex/testing/perf.log file doesn't exist.
okay, can you try running it manually with /usr/local/bin/mpiexec -n 2 ./comex/testing/perf.x or similar?
There's no file ./comex/testing/perf.x. But /usr/local/bin/mpiexec -n 2 ./comex/testing/perf runs through successfully.
That happens because the test runs only with nproc=2, as the original error clearly states:
https://github.com/GlobalArrays/ga/issues/312#issuecomment-1651161830
Assertion failed: (nproc == 2), function main, file testing/perf.c, line 42.
https://github.com/GlobalArrays/ga/blob/56087b52459e311e49fc05d9708329a39b776549/comex/testing/perf.c#L42
In other words, you need to set MPIEXEC="mpiexec -np 2"
@edoapra
The problem is that setting MPIEXEC="mpiexec -np 2" from outside of the build or test (make check) commands doesn't change anything. This is a problem in the project's makefiles: MPIEXEC isn't used.
In fact, comex/Makefile already defines MPIEXEC = /usr/local/bin/mpirun -n %NP% but it somehow doesn't work.
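The %NP% in that definition reads like a placeholder for the process count. How (or whether) the GA makefiles expand it is not shown in this thread, so the following is only a sketch, assuming plain textual substitution. The helper name expand_np is hypothetical and not part of GA.

```shell
#!/bin/sh
# Hypothetical illustration: expand_np is NOT part of the GA build system.
# It shows what a launcher template containing %NP% would look like
# once the placeholder is replaced by a concrete process count.
template='/usr/local/bin/mpirun -n %NP%'

expand_np() {
    # $1 = launcher template, $2 = number of processes
    printf '%s\n' "$1" | sed "s/%NP%/$2/g"
}

launcher=$(expand_np "$template" 2)

# Print the command line that would result (nothing is executed here).
echo "$launcher ./comex/testing/perf"
```

With the placeholder replaced by 2, the resulting command matches the manual invocation that ran successfully earlier in the thread, apart from the launcher name.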
I only call make check from the port, and it always fails.
@yurivict I am assuming this is what you are getting
$ uname -a
FreeBSD freebsd 13.2-RELEASE FreeBSD 13.2-RELEASE releng/13.2-n254617-525ecfdad597 GENERIC amd64
[edo@freebsd ~/ga-build]$ make check
make check-recursive
Making check in comex
make check-am
make testing/perf testing/perf_amo testing/perf_contig testing/perf_strided testing/shift testing/test
`testing/perf' is up to date.
`testing/perf_amo' is up to date.
`testing/perf_contig' is up to date.
`testing/perf_strided' is up to date.
`testing/shift' is up to date.
`testing/test' is up to date.
make check-TESTS
FAIL: testing/perf
^C*** Error code 130
*** Signal 2
*** Signal 2
*** Signal 2
*** Signal 2
*** Signal 2
[edo@freebsd ~/ga-build]$ tail comex/testing/perf.log
Assertion failed: (nproc == 2), function main, file ../../ga/comex/testing/perf.c, line 42.
Assertion failed: (nproc == 2), function main, file ../../ga/comex/testing/perf.c, line 42.
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node freebsd exited on signal 6 (Abort trap).
--------------------------------------------------------------------------
@yurivict Look at what happens when I specify
make check MPIEXEC='/usr/local/bin/mpirun -np 2'
[edo@freebsd ~/ga-build]$ uname -a
FreeBSD freebsd 13.2-RELEASE FreeBSD 13.2-RELEASE releng/13.2-n254617-525ecfdad597 GENERIC amd64
[edo@freebsd ~/ga-build]$ make check MPIEXEC='/usr/local/bin/mpirun -np 2'
make check-recursive
Making check in comex
make check-am
make testing/perf testing/perf_amo testing/perf_contig testing/perf_strided testing/shift testing/test
`testing/perf' is up to date.
`testing/perf_amo' is up to date.
`testing/perf_contig' is up to date.
`testing/perf_strided' is up to date.
`testing/shift' is up to date.
`testing/test' is up to date.
make check-TESTS
PASS: testing/perf
[edo@freebsd ~/ga-build]$ head comex/testing/perf.log
PASS: testing/perf (exit: 0)
============================
msg size (bytes) avg time (us) avg b/w (MB/sec)
#PNNL comex Put Test
16 385.600000 0.041494
32 388.000000 0.082474
64 387.310000 0.165242
128 386.970000 0.330775
256 411.100000 0.622720
@edoapra
I run from the port's directory.
All the same command line arguments are set there now, but the tests fail:
cd /usr/ports/devel/ga && make test
(to check out the ports tree: sudo git clone https://git.FreeBSD.org/ports.git /usr/ports)
Could you post the actual log of the command cd /usr/ports/devel/ga && make test?
It does not look right to me (unless I am not reading the log correctly)
MPIEXEC='/usr/local/bin/mpirun -np 2' needs to be an argument to gmake, not an environment variable.
gmake MPIEXEC='/usr/local/bin/mpirun -np 2'
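The distinction matters because of make's variable precedence. An assignment inside a makefile beats a value inherited from the environment, while a value given on the command line beats the makefile assignment. A minimal sketch with a throwaway Makefile (not the GA build system; the show target and the values are made up for illustration):

```shell
#!/bin/sh
# Illustration only: a throwaway Makefile, not the GA build system.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# The Makefile assigns MPIEXEC itself, much like comex/Makefile in this thread.
# printf is used so that the recipe line starts with a real tab character.
printf 'MPIEXEC = mpirun -n 1\nshow:\n\t@echo "$(MPIEXEC)"\n' > "$tmpdir/Makefile"

# Environment variable: the assignment inside the Makefile takes precedence.
env_result=$(MPIEXEC='mpiexec -np 2' make -s -f "$tmpdir/Makefile" show)

# Command-line argument: overrides the assignment inside the Makefile.
arg_result=$(make -s -f "$tmpdir/Makefile" show MPIEXEC='mpiexec -np 2')

echo "environment:  $env_result"
echo "command line: $arg_result"
```

The first call still reports the value from the Makefile, and only the second reports mpiexec -np 2. The same rule holds for both GNU make and FreeBSD's make.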
MPIEXEC='/usr/local/bin/mpirun -np 2' is in TEST_ARGS, which are the arguments passed to gmake.
$ cd /usr/ports/devel/ga
edit Makefile, since I don't want to redo autoreconf
$ diff -u Makefile.org Makefile
--- Makefile.org 2023-07-27 11:01:11.978246000 -0700
+++ Makefile 2023-07-27 10:57:33.609524000 -0700
@@ -13,7 +13,8 @@
liblapack.so:math/lapack \
libscalapack.so:math/scalapack
-USES= autoreconf fortran gmake libtool localbase
+#USES= autoreconf fortran gmake libtool localbase
+USES= fortran gmake
USE_LDCONFIG= yes
GNU_CONFIGURE= yes
Run make and then make test
$ make
$ make test
The output of ps wwww looks promising
$ ps wwww|grep gmak
58458 1 T 0:00.06 gmake -f Makefile MPIEXEC=/usr/local/bin/mpiexec -np 2 check
57558 1 T 0:00.05 gmake check-recursive
57563 1 T 0:00.01 gmake check
57564 1 T 0:00.01 gmake check-am
58281 1 T 0:00.01 gmake check-TESTS
58284 1 T 0:00.01 gmake test-suite.log TEST_LOGS=testing/perf.log testing/perf_contig.log testing/perf_strided.log testing/perf_amo.log testing/shift.log testing/test.log
Here is the log for make test
gmake check-TESTS
gmake[5]: Entering directory '/usr/ports/devel/ga/work/ga-5.8.2/comex'
gmake[6]: Entering directory '/usr/ports/devel/ga/work/ga-5.8.2/comex'
PASS: testing/perf
PASS: testing/perf_contig
Bottom line for me: please use the vanilla 5.8.2 tarball
Version: 5.8.2 clang-15 FreeBSD 13.2