Closed jcphill closed 8 years ago
Original date: 2015-12-17 14:56:24
Build from Dec 11 (v6.7.0-rc2-0-g7008690) works fine.
Original date: 2015-12-17 16:59:59
Root cause addressed as "Bug #927: ibverbs broken by undef QLOGIC" so can wait for 6.7.1.
Original date: 2016-01-28 20:43:50
This is partially resolved by the CkEnforce commit, but proper error messages are needed - now issue #960.
Original date: 2016-02-25 19:49:27
I think we can close this now, since the hangs themselves have been addressed.
Original date: 2016-03-08 20:18:19
Bilge, can you confirm that we can close this?
Original date: 2016-03-09 02:01:17
Yes, I think we can close it now too.
Original issue: https://charm.cs.illinois.edu/redmine/issues/926
On Stampede:
Info: Startup phase 0 took 0.00885892 s, 237.66 MB of memory in use
[0] wc[0] status 12 wc[i].opcode 0
[14] wc[0] status 12 wc[i].opcode 0
[1] wc[0] status 12 wc[i].opcode 0
...hangs...
I see in src/arch/verbs/machine-ibverbs.c the following:
if(wc[i].status != IBV_WC_SUCCESS){
  printf("[%d] wc[%d] status %d wc[i].opcode %d\n",CmiMyNodeGlobal(),i,wc[i].status,wc[i].opcode);
#if CMK_IBVERBS_STATS
...
#endif
...
Since CmiAssert compiles to nothing in production mode, it accomplishes nothing here (or, in this case, wastes computer time while the job hangs). Assertions are for catching bugs, not runtime failures!
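
To illustrate the distinction, here is a minimal standalone sketch, not the actual Charm++ code. All SKETCH_/sketch_ names are hypothetical, SKETCH_WC_SUCCESS is only a stand-in for IBV_WC_SUCCESS, and SKETCH_PRODUCTION is an assumed stand-in for however production mode is selected. The debug-only assertion vanishes in production mode, while the always-on check still reports the failed completion and lets the caller stop instead of hanging.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption: 1 mimics a production build, 0 a debug build. */
#define SKETCH_PRODUCTION 1

/* Hypothetical debug-only assertion. In production mode it expands to
   nothing, so the condition is never even evaluated. */
#if SKETCH_PRODUCTION
#define SKETCH_ASSERT(cond) ((void)0)
#else
#define SKETCH_ASSERT(cond)                                   \
  do {                                                        \
    if (!(cond)) {                                            \
      fprintf(stderr, "assertion failed: %s\n", #cond);       \
      abort();                                                \
    }                                                         \
  } while (0)
#endif

/* Stand-in for IBV_WC_SUCCESS (an assumption made for this sketch). */
enum { SKETCH_WC_SUCCESS = 0 };

/* Always-on check: reports a failed completion in every build mode.
   Returns 0 on success and -1 on failure. */
int sketch_check_completion(int node, int index, int status, int opcode)
{
  if (status != SKETCH_WC_SUCCESS) {
    fprintf(stderr, "[%d] wc[%d] failed: status %d opcode %d\n",
            node, index, status, opcode);
    return -1;
  }
  return 0;
}

/* Caller: the assertion is a no-op in production mode, so the explicit
   check is what actually turns the failure into a hard stop. */
void sketch_poll_one(int node, int index, int status, int opcode)
{
  SKETCH_ASSERT(status == SKETCH_WC_SUCCESS);
  if (sketch_check_completion(node, index, status, opcode) != 0)
    exit(EXIT_FAILURE);
}
```

With SKETCH_PRODUCTION set to 1, calling sketch_poll_one with a status of 12 (as in the log above) prints the error and exits with a failure code; the assertion alone would have let execution continue.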