charmplusplus / charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

user-driven-interop fails at exit on mpi-linux-* #3491

Open nitbhat opened 3 years ago

nitbhat commented 3 years ago

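This looks like an intermittent failure: I hit it once on Travis CI, but it did not show up when the job was rerun. The test itself runs to completion and then fails at exit with "Attempting to use an MPI routine after finalizing MPICH":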

make[3]: Entering directory '/home/travis/build/UIUC-PPL/charm/mpi-linux-ppc64le/examples/charm++/user-driven-interop'
../../../bin/testrun   +p2 ./hello_user 8  
Running as 2 OS processes: ./hello_user 8
charmrun> /usr/bin/setarch ppc64le -R mpirun -np 2 ./hello_user 8
Charm++> Running in non-SMP mode: 2 processes (PEs)
Converse/Charm++ Commit ID: 8430543f7
Charm++ built with internal error checking enabled.
Do not use for performance benchmarking (build without --enable-error-checking to do so).
Isomalloc> Synchronized global address space.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (2 sockets x 1 cores x 8 PUs = 16-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Chare 0 created on PE 0
Chare 1 created on PE 0
Chare 2 created on PE 0
Chare 3 created on PE 0
Starting in user driven mode on 0
Hello from chare 0
Hello from chare 1
Hello from chare 2
Hello from chare 3
Chare 4 created on PE 1
Chare 5 created on PE 1
Chare 6 created on PE 1
Chare 7 created on PE 1
Starting in user driven mode on 1
Hello from chare 4
Hello from chare 5
Hello from chare 6
Hello from chare 7
Chare 0 got an ack from 0
Chare 1 got an ack from 0
Chare 2 got an ack from 0
Chare 3 got an ack from 0
Chare 0 got an ack from 1
Chare 1 got an ack from 1
Chare 2 got an ack from 1
Chare 3 got an ack from 1
Chare 4 got an ack from 0
Chare 5 got an ack from 0
Chare 6 got an ack from 0
Chare 7 got an ack from 0
Chare 4 got an ack from 1
Chare 5 got an ack from 1
Chare 6 got an ack from 1
Chare 7 got an ack from 1
Attempting to use an MPI routine after finalizing MPICH
real    0m0.042s
user    0m0.026s
sys 0m0.038s
make[3]: *** [Makefile:36: test] Error 1
make[2]: *** [Makefile:71: test-user-driven-interop] Error 2
make[1]: *** [Makefile:34: test-charm++] Error 2
make: *** [Makefile.tests.common:40: test] Error 2
[Partition 0][Node 0] End of program
make[3]: Leaving directory '/home/travis/build/UIUC-PPL/charm/mpi-linux-ppc64le/examples/charm++/user-driven-interop'
make[2]: Leaving directory '/home/travis/build/UIUC-PPL/charm/mpi-linux-ppc64le/examples/charm++'
make[1]: Leaving directory '/home/travis/build/UIUC-PPL/charm/mpi-linux-ppc64le/examples'
make: Leaving directory '/home/travis/build/UIUC-PPL/charm/mpi-linux-ppc64le/include'
The command "make -C mpi-linux-ppc64le/tmp test" exited with 2.

evan-charmworks commented 1 year ago

I can reproduce this even on mpi-linux-x86_64. See https://github.com/UIUC-PPL/charm/pull/3477#issuecomment-1537559759
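
Since the failure is intermittent, a loop that reruns the test until it first fails may help anyone trying to reproduce it locally. The following is only a minimal sketch: `repeat_until_fail` is a hypothetical helper, not part of Charm++, and the `make` invocation in the usage note is an assumption based on the directory layout visible in the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Charm++): run a command up to N times
# and stop at the first non-zero exit status.
# Usage: repeat_until_fail N command [args...]
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Run the command exactly as given; a non-zero status ends the loop.
        if ! "$@"; then
            echo "failed on attempt $i of $max" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max attempts" >&2
    return 0
}
```

Assuming an `mpi-linux-x86_64` build tree laid out like the ppc64le one in the log, it could be invoked as `repeat_until_fail 100 make -C mpi-linux-x86_64/examples/charm++/user-driven-interop test`. The helper returns 1 as soon as one run fails, so the output of the failing run is the last thing printed before the "failed on attempt" message.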