charmplusplus / charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
Apache License 2.0

multicore-darwin-x86_64 megatest hangs when built with --enable-randomized-msgq --with-prio-type=int --enable-error-checking -debug #1520

Open jcphill opened 7 years ago

jcphill commented 7 years ago

Original issue: https://charm.cs.illinois.edu/redmine/issues/1520


Built with: multicore-darwin-x86_64 --no-build-shared --enable-randomized-msgq --with-prio-type=int --enable-error-checking -debug

jim`roswell%/Projects/namd2/charm-6.8.0-debug/multicore-darwin-x86_64/bin/megatest +p1
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode:  1 threads
Converse/Charm++ Commit ID: v6.8.0-beta1-60-g4b5b444-namd-charm-6.8.0-debug-build-2017-Apr-20-133877
Warning> using Isomalloc in SMP mode, you may need to run with '+isomalloc_sync'.
Charm++> Using STL-based msgQ:
Charm++> Using randomized msgQ. Priorities will not be respected!
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
Charm++> cpu topology info is gathered in 0.000 seconds.
Megatest is running on 1 nodes 1 processors. 
test 0: initiated [groupring (milind)]
test 0: completed (0.00 sec)
test 1: initiated [nodering (milind)]
^C

--no-build-shared --enable-tracing --enable-tracing-commthread -optimize works fine.

--no-build-shared --with-production works fine.

netlrts-darwin-x86_64 works fine.
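
For context on the "Using randomized msgQ. Priorities will not be respected!" line in the output above, here is a minimal sketch of what a randomized message queue does conceptually. It is not Charm++'s actual queue implementation: the class name `RandomizedMsgQ` and its methods are hypothetical and chosen only for illustration. The point is that any pending message may be delivered next, so code that quietly depends on delivery order or priority can behave differently under such a build. Whether that is the cause of this hang is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical illustration only; NOT the Charm++ message queue.
template <typename Msg>
class RandomizedMsgQ {
public:
    explicit RandomizedMsgQ(unsigned seed) : rng_(seed) {}

    // The priority argument is accepted but deliberately ignored,
    // mirroring the "Priorities will not be respected!" warning.
    void enqueue(Msg msg, int /*prio*/) { pending_.push_back(std::move(msg)); }

    bool empty() const { return pending_.empty(); }
    std::size_t size() const { return pending_.size(); }

    // Remove and return a uniformly random pending message.
    Msg dequeue() {
        assert(!pending_.empty());
        std::uniform_int_distribution<std::size_t> pick(0, pending_.size() - 1);
        // Swap the chosen message to the back so removal is O(1).
        std::swap(pending_[pick(rng_)], pending_.back());
        Msg msg = std::move(pending_.back());
        pending_.pop_back();
        return msg;
    }

private:
    std::vector<Msg> pending_;
    std::mt19937 rng_;
};
```

With a queue like this, every enqueued message is still delivered exactly once, but the order of delivery depends on the random number generator rather than on enqueue order or priority.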

jcphill commented 5 years ago

Original date: 2017-04-20 16:21:52


It also hangs, this time in test 0, on multicore-linux64-iccstatic built with --no-build-shared --enable-randomized-msgq --with-prio-type=int --enable-error-checking -debug

jim`sunnyvale$/Projects/namd2/charm-6.8.0-debug/multicore-linux64-iccstatic/bin/megatest +p1
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode:  1 threads
Converse/Charm++ Commit ID: v6.8.0-beta1-60-g4b5b444-namd-charm-6.8.0-debug-build-2017-Apr-20-133877
Warning> Randomization of stack pointer is turned on in kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try run with '+isomalloc_sync'.  
Charm++> Using STL-based msgQ:
Charm++> Using randomized msgQ. Priorities will not be respected!
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (32-way SMP).
Charm++> cpu topology info is gathered in 0.001 seconds.
Megatest is running on 1 nodes 1 processors. 
test 0: initiated [completion_test (phil)]
^C

Also, why are the tests being run in reverse order on darwin versus linux?

stwhite91 commented 5 years ago

Original date: 2017-04-20 17:27:07


There are multiple other bugs in our test suite noted in issue #259.

stwhite91 commented 5 years ago

Original date: 2017-04-25 21:12:56


The other related test failures with randomized queues are targeted to 6.8.1.

jcphill commented 5 years ago

Original date: 2018-10-23 03:07:42


FYI, this is still broken.