charmplusplus / charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

examples/charm++/TRAM/randomAccessGroup crashes on mpi-win-x86_64-smp with debug options (-g -O0) #1959

Open nitbhat opened 6 years ago

nitbhat commented 6 years ago

Original issue: https://charm.cs.illinois.edu/redmine/issues/1959


Charm build command: ./build LIBS mpi-win-x86_64 smp --enable-error-checking --without-romio --suffix=debug -j8 -g -O0 |& tee build_result_debug

$ make test
../../../../bin/testrun  +p4 ./random_access 14

Running on 4 processors:  ./random_access 14
charmrun> /cygdrive/c/Program Files/Microsoft MPI/Bin/mpiexec -n 4  ./random_access 14

Charm++> Running on MPI version: 2.0
Charm++> level of thread support used: MPI_THREAD_FUNNELED (desired: MPI_THREAD_FUNNELED)
Charm++> Running in SMP mode: 4 processes, 1 worker threads (PEs) + 1 comm threads per process, 0 PEs total
Charm++> The comm. thread both sends and receives messages
Charm++ warning> fences and atomic operations not available in native assembly
Converse/Charm++ Commit ID: v6.8.2-853-g4146bf788
Charm++> Disabling isomalloc because mmap() does not work.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 4 cores x 2 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Global table size   = 2^14 * 4 = 65536 words
Number of processors = 4
Number of updates = 262144
Aggregation topology: 4 1
CPU time used = 0.093000 seconds
0.002818755 Billion(10^9) Updates    per second [GUP/s]
0.000704689 Billion(10^9) Updates/PE per second [GUP/s]

job aborted:
[ranks] message

[0-2] terminated

[3] process exited without calling finalize

---- error analysis -----

[3] on CS-DEXTERITY
./random_access ended prematurely and may have crashed. exit code 0xc0000417

---- error analysis -----
make: *** [Makefile:28: test] Error 127

nitbhat commented 5 years ago

Original date: 2018-08-07 21:40:41


examples/charm++/TRAM/randomAccessArray also fails in a similar manner. However, aggregateRandomAccessArray and aggregateRandomAccessGroup run successfully.

Output from examples/charm++/TRAM/randomAccessArray crash:

$ make test
../../../../bin/charmc  randomAccess.ci
../../../../bin/charmc  randomAccess.C
randomAccess.C
../../../../bin/charmc  -language charm++ -o random_access randomAccess.o -module NDMeshStreamer
moduleinit3376.C
Ignored Unrecognized argument -Wl,--export-dynamic
../../../../bin/testrun  +p4 ./random_access 14 8

Running on 4 processors:  ./random_access 14 8
charmrun> /cygdrive/c/Program Files/Microsoft MPI/Bin/mpiexec -n 4  ./random_access 14 8

Charm++> Running on MPI version: 2.0
Charm++> level of thread support used: MPI_THREAD_FUNNELED (desired: MPI_THREAD_FUNNELED)
Charm++> Running in SMP mode: 4 processes, 1 worker threads (PEs) + 1 comm threads per process, 0 PEs total
Charm++> The comm. thread both sends and receives messages
Charm++ warning> fences and atomic operations not available in native assembly
Converse/Charm++ Commit ID: v6.8.2-853-g4146bf788
Charm++> Disabling isomalloc because mmap() does not work.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 4 cores x 2 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Global table size   = 2^14 * 4 = 65536 words
Number of processors = 4
Number of updates = 262144
Aggregation topology: 4 1

job aborted:
[ranks] message

[0] process exited without calling finalize

[1-3] terminated

---- error analysis -----

[0] on CS-DEXTERITY
./random_access ended prematurely and may have crashed. exit code 0xc0000417

---- error analysis -----
make: *** [Makefile:28: test] Error 127
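
Both crashes above report the same exit code, 0xc0000417. In the Windows NTSTATUS numbering this value is STATUS_INVALID_CRUNTIME_PARAMETER, which suggests that a C runtime function received an invalid argument; the thread itself does not identify which call is responsible. The following is a minimal sketch of how such an exit code can be split into the documented NTSTATUS bit fields. The helper name `decode_ntstatus` and the small lookup table are illustrative assumptions and are not part of Charm++ or of the test harness.

```python
# Minimal sketch: decode a Windows NTSTATUS exit code into its bit fields.
# decode_ntstatus and KNOWN_STATUS are hypothetical helpers for illustration;
# the table lists only a few well-known values and is not exhaustive.

KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC0000417: "STATUS_INVALID_CRUNTIME_PARAMETER",
}

# Bits 31-30 of an NTSTATUS value encode the severity.
SEVERITY = ("success", "informational", "warning", "error")


def decode_ntstatus(value):
    """Return the severity, customer flag, facility, code and known name."""
    value &= 0xFFFFFFFF  # treat the exit code as an unsigned 32-bit value
    return {
        "severity": SEVERITY[value >> 30],
        "customer": bool((value >> 29) & 1),   # bit 29: customer-defined flag
        "facility": (value >> 16) & 0x0FFF,    # bits 27-16: facility
        "code": value & 0xFFFF,                # bits 15-0: status code
        "name": KNOWN_STATUS.get(value, "unknown"),
    }


if __name__ == "__main__":
    # Exit code reported by mpiexec in the logs above.
    print(decode_ntstatus(0xC0000417))
```

For this kind of failure, one possible next step (an assumption, not something confirmed in this thread) is to run the failing rank under a debugger so that the offending C runtime call shows up in the stack trace.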

evan-charmworks commented 5 years ago

Original date: 2018-11-19 22:22:17


Was this fixed by #1960?