openhpi2 / Open-HPI

Open HPI is an open source implementation of the SA Forum's Hardware Platform Interface (HPI). HPI provides an abstracted interface for managing computer hardware, typically for chassis- and rack-based servers.

openhpid refuses new connections after some time #2529

Open openhpi2 opened 10 years ago

openhpi2 commented 10 years ago

I am experiencing a problem where openhpid becomes unresponsive to all client connections. It returns SA_ERR_HPI_NO_RESPONSE for every API connection. This happens in both 3.2.1 and 3.4.0, using the IPMI direct plugin.

Here is what happens:

We have a rogue process that connects to openhpid via the C API, then crashes and starts up again 1 second later. This crash-and-restart cycle repeats continuously.

This causes openhpid not to release socket descriptors.
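
For readers unfamiliar with this failure mode, here is a minimal Python sketch of what releasing a descriptor involves on the server side when a client vanishes. It is not OpenHPI code, and the names `serve_one_client` and `demo` are made up for illustration. When the client closes or dies, the server's `recv()` returns an empty result, and the server-side socket remains in CLOSE_WAIT until the server closes it.

```python
import socket


def serve_one_client(listener: socket.socket) -> bytes:
    """Accept one client, wait for it to go away, then release the descriptor."""
    conn, _addr = listener.accept()
    conn.settimeout(5)  # avoid blocking forever in this demo
    try:
        # recv() returning b"" means the peer closed its end (or it died and
        # the kernel closed the socket for it). Until conn is closed here,
        # this side of the connection stays in CLOSE_WAIT.
        return conn.recv(1024)
    finally:
        conn.close()  # releases the descriptor; skipping this leaks it


def demo() -> bytes:
    """Simulate a client that connects and immediately disappears."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    client.close()  # stands in for the crashing client
    try:
        return serve_one_client(listener)
    finally:
        listener.close()


if __name__ == "__main__":
    print(repr(demo()))
```

A server that never reaches the `conn.close()` step for a departed client accumulates one open descriptor per lost connection.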

Running "lsof -p" on openhpid shows 1024 socket descriptors stuck in CLOSE_WAIT (1024 is the maximum file descriptor limit for this user on this machine).
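
To track how close the daemon is to the descriptor limit, the lsof output can be counted automatically. The following is a small sketch that assumes the usual lsof layout, where the TCP state appears in parentheses at the end of each line. The `SAMPLE` text is invented for illustration (PIDs, ports, and descriptor numbers are not taken from the affected machine), and `count_close_wait` is a hypothetical helper name.

```python
# Invented sample in the style of "lsof -p <pid>" output; every value is made up.
SAMPLE = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
openhpid 1234 root    3u  IPv4  10001      0t0  TCP *:4743 (LISTEN)
openhpid 1234 root    7u  IPv4  10002      0t0  TCP localhost:4743->localhost:51001 (CLOSE_WAIT)
openhpid 1234 root    8u  IPv4  10003      0t0  TCP localhost:4743->localhost:51002 (ESTABLISHED)
openhpid 1234 root    9u  IPv4  10004      0t0  TCP localhost:4743->localhost:51003 (CLOSE_WAIT)
"""


def count_close_wait(lsof_output: str) -> int:
    """Count the sockets that lsof reports in the CLOSE_WAIT state."""
    return sum(
        1
        for line in lsof_output.splitlines()
        if line.rstrip().endswith("(CLOSE_WAIT)")
    )


if __name__ == "__main__":
    print(count_close_wait(SAMPLE))
```

In practice the input would come from running `lsof -p` against the openhpid process ID and passing its standard output to the function. A count that grows by one each time the rogue client restarts would be consistent with the leak described here.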

When we got into this situation, I sent openhpid an ABRT signal. There are 1024 threads, almost all of which are blocked on:

(gdb) bt
#0  0x00007f13236b41eb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f1323dc74c5 in ?? () from /usr/lib64/libgthread-2.0.so.0
#2  0x00007f13238e0ebf in ?? () from /usr/lib64/libglib-2.0.so.0
#3  0x00007f13238e1711 in g_async_queue_timed_pop () from /usr/lib64/libglib-2.0.so.0
#4  0x0000000000424c0f in oh_dequeue_session_event ()
#5  0x00000000004197a4 in saHpiEventGet ()
#6  0x000000000040b820 in servicethread(void, void_) ()
#7  0x00007f13239342d8 in ?? () from /usr/lib64/libglib-2.0.so.0
#8  0x00007f1323931db6 in ?? () from /usr/lib64/libglib-2.0.so.0
#9  0x00007f13236aff05 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f1322d8210d in clone () from /lib64/libc.so.6
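
The backtrace suggests that each service thread is waiting in a timed queue pop on behalf of a client that may no longer exist. One general way to avoid holding a descriptor during such a wait is to check between short timeouts whether the peer has closed the connection. The Python sketch below illustrates that pattern only. It is not OpenHPI's implementation and not a confirmed fix for this bug, and the names `peer_has_closed` and `wait_for_event` are hypothetical.

```python
import queue
import select
import socket


def peer_has_closed(conn: socket.socket) -> bool:
    """Return True if the peer has closed its end of the connection."""
    readable, _, _ = select.select([conn], [], [], 0)
    if not readable:
        return False  # nothing pending, so no end-of-stream has arrived
    try:
        # MSG_PEEK looks at pending data without consuming it;
        # an empty result means the peer closed the connection.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True  # a reset also means the client is gone


def wait_for_event(events: queue.Queue, conn: socket.socket,
                   poll_seconds: float = 0.1, max_polls: int = 50):
    """Wait for an event, but give up early if the client has disappeared."""
    for _ in range(max_polls):
        if peer_has_closed(conn):
            return None  # caller should close conn and end the thread
        try:
            return events.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # no event yet; re-check the client and wait again
    return None  # overall timeout with the client still connected
```

With this structure, a worker serving a crashed client returns within one polling interval, so the caller can close the socket instead of leaving it in CLOSE_WAIT for the full event timeout.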

Attached to this bug is /var/log/messages captured with "openhpid -v" (attachment: openhpid.bug.txt.gz). The problem starts to happen at 18:43:08.

Reported by: trguitar

openhpi2 commented 9 years ago

Original comment by: dr_mohan