Middleware that provides libraries, a GUI, and a code generator for designing multi-node (clustered) applications that are highly available, redundant, and scalable. Provides sub-second node and application fault detection and failover, and useful application libraries including distributed hash tables (checkpoint), event, logging, and communications. Implements SA-Forum APIs where applicable. Used anywhere reliability is a must, such as telecom, wireless, defense, and enterprise computing. Download the stable release with installer from: ftp.openclovis.com
The AMF healthcheck timer delete should use clTimerDeleteAsync instead of clTimerDelete, because it is done with the cpmMutex lock held and the healthcheck timer callback grabs the same lock.
Otherwise we can deadlock if the healthcheck timer delete and the timer callback fire at the same time. The synchronous clTimerDelete call waits for any running callbacks to finish, and they can't finish because the clTimerDelete context grabbed the cpmMutex before trying to delete the healthcheck timer.
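The same lock-ordering problem can be reproduced outside OpenClovis with plain pthreads. The sketch below is illustrative only: state_mutex stands in for cpmMutex, and demo_timer_t, demo_callback, demo_timer_delete_sync, demo_timer_delete_async and demo_run are hypothetical names, not OpenClovis APIs. The synchronous delete gives up after a timeout so that the demo terminates instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for illustration; none of these are OpenClovis APIs. */
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER; /* plays cpmMutex */

typedef struct {
    pthread_mutex_t lock;    /* protects the two flags below */
    pthread_cond_t  changed; /* signalled whenever 'running' changes */
    bool running;            /* callback has started and not yet finished */
    bool deleted;            /* timer has been marked for deletion */
} demo_timer_t;

/* Timer callback: like the healthcheck callback, it needs state_mutex. */
static void *demo_callback(void *arg)
{
    demo_timer_t *t = arg;

    pthread_mutex_lock(&t->lock);
    t->running = true;
    pthread_cond_broadcast(&t->changed);
    pthread_mutex_unlock(&t->lock);

    pthread_mutex_lock(&state_mutex);   /* blocks while the deleter holds it */
    pthread_mutex_unlock(&state_mutex);

    pthread_mutex_lock(&t->lock);
    t->running = false;
    pthread_cond_broadcast(&t->changed);
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Synchronous delete: waits for a running callback to finish, giving up
 * after timeout_ms so the demo terminates. Returns 0 or ETIMEDOUT. */
static int demo_timer_delete_sync(demo_timer_t *t, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&t->lock);
    t->deleted = true;
    while (t->running && rc == 0)
        rc = pthread_cond_timedwait(&t->changed, &t->lock, &deadline);
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/* Asynchronous delete: marks the timer and returns without waiting. */
static int demo_timer_delete_async(demo_timer_t *t)
{
    pthread_mutex_lock(&t->lock);
    t->deleted = true;
    pthread_mutex_unlock(&t->lock);
    return 0;
}

/* Deletes the timer while holding state_mutex and while the callback is
 * already running. Returns the result of the delete call. */
static int demo_run(bool use_async)
{
    demo_timer_t t = { .running = false, .deleted = false };
    pthread_t tid;
    int rc;

    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.changed, NULL);

    pthread_mutex_lock(&state_mutex);
    pthread_create(&tid, NULL, demo_callback, &t);

    pthread_mutex_lock(&t.lock);        /* wait until the callback has started */
    while (!t.running)
        pthread_cond_wait(&t.changed, &t.lock);
    pthread_mutex_unlock(&t.lock);

    rc = use_async ? demo_timer_delete_async(&t)
                   : demo_timer_delete_sync(&t, 200);

    pthread_mutex_unlock(&state_mutex); /* only now can the callback finish */
    pthread_join(tid, NULL);
    pthread_cond_destroy(&t.changed);
    pthread_mutex_destroy(&t.lock);
    return rc;
}
```

Calling demo_run(false) returns ETIMEDOUT once the 200 ms wait expires, standing in for the indefinite wait of a real synchronous delete; demo_run(true) returns 0 because the asynchronous delete never waits on the callback.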
One such deadlock occurrence --
(gdb) thr 2
[Switching to thread 2 (Thread 0x7f4ff317f700 (LWP 16943))]
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
132 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(gdb) p *mutex
$1 = {data = {lock = 2, count = 0, owner = 17356, nusers = 1, kind = 0, spins = 0, list = {prev = 0x0, next = 0x0}},
size = "\002\000\000\000\000\000\000\000\314C\000\000\001", '\000' <repeats 26 times>, align = 2}
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1  0x00007f4ff62d8065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f4ff62d7eba in __pthread_mutex_lock (mutex=0x1f3d780) at pthread_mutex_lock.c:61
#3  0x00007f4ff47efd8e in __cosMutexLock (mutexId=, verbose=) at posix/clLinux.c:329
#4  0x00007f4ff47fc136 in clOsalMutexLock (mutexId=0x1f3d778) at osal.c:532
#5  0x00007f4ff57afc3d in __clAmsMgmtEntityGetConfig (in=, out=0x7f4fd4005780, versionCode=327680,
#6  0x00007f4ff4814bdc in clRmdInvoke (func=0x7f4ff57c6920 <_clAmsMgmtEntityGetConfig_5_0_0>, eoArg=0x0, inMsgHdl=0x7f4f88002d00, outMsgHdl=0x7f4fd4005780) at clRmdHandle.c:138
#7  0x00007f4ff470930c in clEoWalkWithVersion (pThis=pThis@entry=0x1eb1578, func=680, version=version@entry=0x7f4ff317e800, pFuncCallout=,
#8  0x00007f4ff48181aa in rmdHandleSyncRequest (pThis=pThis@entry=0x1eb1578, pReq=pReq@entry=0x7f4ff317e960, srcAddr=srcAddr@entry=0x7f4ff317e940,
#9  0x00007f4ff4818ae9 in clRmdReceiveRequest (pThis=0x1eb1578, rmdRecvMsg=0x7f4f88002d00, priority=0 '\000', protoType=, length=, srcAddr=...)
#10 0x00007f4ff470270c in clEoJobHandler (pJob=pJob@entry=0x7f4fd40008f8) at eo.c:3861
#11 0x00007f4ff4861392 in clTaskPoolEntry (pArg=) at clTaskPool.c:277
#12 0x00007f4ff47e31e5 in cosPosixTaskWrapper (pArgument=) at posix/clCommonCos.c:951
#13 0x00007f4ff62d5e9a in start_thread (arg=0x7f4ff317f700) at pthread_create.c:308
#14 0x00007f4ff3a95cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#15 0x0000000000000000 in ?? ()
(gdb) thr 28
[Switching to thread 28 (Thread 0x7f4ff1a41700 (LWP 17356))]
#0  0x00007f4ff62dd52d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x00007f4ff62dd52d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f4ff47e5a5f in cosPosixTaskDelay (timer=...) at posix/clCommonCos.c:805
#2  0x00007f4ff47f8676 in clOsalTaskDelay (timer=timer@entry=...) at osal.c:270
#3  0x00007f4ff482110c in timerDeleteLocked (pTimer=pTimer@entry=0x7f4fc4035e18, pTimerHandle=pTimerHandle@entry=0x1f85508, asyncFlag=asyncFlag@entry=0,
#4  0x00007f4ff4821f88 in timerDelete (pTimerHandle=0x1f85508, asyncFlag=asyncFlag@entry=0) at clTimerTree.c:943
#5  0x00007f4ff4823257 in clTimerDelete (pTimerHandle=) at clTimerTree.c:962
#6  0x00007f4ff54a81bf in cpmCompHealthcheckStop (pCompName=pCompName@entry=0x7f4fc008344c) at clCpmComponent.c:4986
#7  0x00007f4ff574d69f in clAmsPeCompAssignCSITimeout (timer=) at clAmsPolicyEngine.c:16279
#8  0x00007f4ff5714827 in clAmsEntityTimeout (timer=0x7f4fc0083a68) at clAmsEntities.c:4636
#9  0x00007f4ff482236a in clTimerCallbackTask (invocation=invocation@entry=0x7f4fbc00e878) at clTimerTree.c:1121
#10 0x00007f4ff4861392 in clTaskPoolEntry (pArg=) at clTaskPool.c:277
#11 0x00007f4ff47e31e5 in cosPosixTaskWrapper (pArgument=) at posix/clCommonCos.c:951
#12 0x00007f4ff62d5e9a in start_thread (arg=0x7f4ff1a41700) at pthread_create.c:308
#13 0x00007f4ff3a95cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#14 0x0000000000000000 in ?? ()
(gdb)