Monitoring handler processes left sleeping

jennifer-richards commented 6 years ago

After handling a monitoring request, subprocesses are being left in the sleeping state instead of being cleaned up.

This is similar to #64 / #4, except the processes are using no CPU.

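Attaching to one with gdb shows that it is stuck in an exit handler: exit() runs a handler in mech_eap.so that tears down the Shibboleth resolver, and that teardown blocks in pthread_cond_destroy(). This may be related to some exit problems with mech_eap.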

(gdb) bt
#0  0x00007f82afa2db51 in futex_wait (private=<optimized out>, expected=12, futex_word=0x7f82980d6fcc)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  0x00007f82afa2db51 in futex_wait_simple (private=<optimized out>, expected=12, futex_word=0x7f82980d6fcc)
    at ../sysdeps/nptl/futex-internal.h:135
#2  0x00007f82afa2db51 in __pthread_cond_destroy (cond=0x7f82980d6fa8) at pthread_cond_destroy.c:54
#3  0x00007f82a7c717ec in xmltooling::CondWaitImpl::~CondWaitImpl() () at /usr/lib/x86_64-linux-gnu/libxmltooling.so.7
#4  0x00007f82a7c68355 in xmltooling::ReloadableXMLFile::shutdown() () at /usr/lib/x86_64-linux-gnu/libxmltooling.so.7
#5  0x00007f82acecb68f in  () at /usr/lib/x86_64-linux-gnu/libshibsp.so.7
#6  0x00007f82ace1450e in shibsp::SPConfig::setServiceProvider(shibsp::ServiceProvider*) ()
    at /usr/lib/x86_64-linux-gnu/libshibsp.so.7
#7  0x00007f82ace15a68 in shibsp::SPConfig::term() () at /usr/lib/x86_64-linux-gnu/libshibsp.so.7
#8  0x00007f82ace16128 in shibsp::SPInternalConfig::term() () at /usr/lib/x86_64-linux-gnu/libshibsp.so.7
#9  0x00007f82ad1e0dee in shibresolver::ShibbolethResolver::term() () at /usr/lib/x86_64-linux-gnu/libshibresolver.so.1
#10 0x00007f82ad63c14c in  () at /usr/lib/x86_64-linux-gnu/gss/mech_eap.so
#11 0x00007f82af67bec0 in __run_exit_handlers (status=status@entry=0, listp=0x7f82afa1a6f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
#12 0x00007f82af67bf1a in __GI_exit (status=status@entry=0) at exit.c:105
#13 0x000055a0cbdcf3d0 in mons_accept (mons=0x55a0cd7d5b80, listen=8) at mon/mons.c:258
#14 0x00007f82b090e6aa in  () at /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#15 0x00007f82b090f227 in event_base_loop () at /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#16 0x000055a0cbdafdee in main (argc=<optimized out>, argv=<optimized out>) at tr/tr_main.c:321
(gdb) frame 13
#13 0x000055a0cbdcf3d0 in mons_accept (mons=0x55a0cd7d5b80, listen=8) at mon/mons.c:258
warning: Source file is more recent than executable.
258     exit(0); /* exit to kill forked child process */

jennifer-richards commented 6 years ago

I believe this is caused by interplay between threading and forking in the main process. Disabling the TRP threads (or keeping them from being started by turning off peering) seems to eliminate this problem. The monitoring processes exit as expected.
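To illustrate the fork/thread interaction suspected above, here is a minimal sketch, not trust-router code. It relies on the POSIX rule that fork() duplicates only the calling thread: a worker parked on a condition variable in the parent simply does not exist in the child, although the condition variable's memory (including any record of waiters) is copied. One plausible reading of the backtrace is that pthread_cond_destroy() in the child then waits for waiters that can never wake, but that is an assumption, not something this sketch proves. The names `worker`, `thread_count` and `threads_in_forked_child` are hypothetical, and reading `/proc/self/status` is Linux-specific.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;
static int stop = 0;

/* Hypothetical stand-in for a TRP thread: parks on a condition variable. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!stop)
        pthread_cond_wait(&wake, &lock);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Linux-specific: parse the "Threads:" line of /proc/self/status.
 * Returns -1 if the file cannot be read or the line is missing. */
static int thread_count(void)
{
    char buf[8192];
    size_t used = 0;
    ssize_t n;
    int fd = open("/proc/self/status", O_RDONLY);

    if (fd < 0)
        return -1;
    while (used < sizeof buf - 1 &&
           (n = read(fd, buf + used, sizeof buf - 1 - used)) > 0)
        used += (size_t)n;
    close(fd);
    buf[used] = '\0';

    const char *p = strstr(buf, "Threads:");
    return p ? atoi(p + strlen("Threads:")) : -1;
}

/* Starts one worker, forks, and returns the number of threads the child
 * sees in itself, or -1 on error. */
int threads_in_forked_child(void)
{
    pthread_t tid;
    int status = 0;
    int result = -1;

    stop = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: report the count through the exit status. _exit() is
         * used so that no exit handlers run here. */
        int count = thread_count();
        _exit(count < 0 || count > 254 ? 255 : count);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) != 255)
        result = WEXITSTATUS(status);

    /* Parent: release and reap the worker. */
    pthread_mutex_lock(&lock);
    stop = 1;
    pthread_cond_signal(&wake);
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
    return result;
}
```

Calling `threads_in_forked_child()` from a process whose only extra thread is the worker should report a single thread in the child, even though the parent has at least two at the moment of the fork.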

jennifer-richards commented 6 years ago

I've reported the underlying problem as #85. As an immediate workaround, replacing our exit() calls with abort() in the TID and monitoring processes seems to prevent sleeping processes from hanging around. Given that, I am lowering the priority, but this is not a pleasant situation so I am leaving this open.
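The workaround works because abort() terminates the process without running the handlers registered with atexit(), so the mech_eap teardown seen in the backtrace is never entered. The sketch below shows that difference using _exit() rather than abort(); this is an alternative not proposed in this thread, named here only because it also skips exit handlers while still reporting a normal exit status, whereas abort() raises SIGABRT and may leave a core dump. The names `handler`, `report_fd` and `handler_ran_in_child` are hypothetical and do not come from trust-router.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;

/* Hypothetical stand-in for a library exit handler: it only records
 * that it ran by writing one byte to a pipe. */
static void handler(void)
{
    ssize_t ignored = write(report_fd, "x", 1);
    (void)ignored;
}

/* Forks a child that registers an exit handler and then terminates with
 * exit(0) when use_exit is non-zero, or with _exit(0) otherwise.
 * Returns 1 if the handler ran in the child, 0 if it did not, and -1 on
 * error. */
int handler_ran_in_child(int use_exit)
{
    int fds[2];
    char c;

    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(handler);
        if (use_exit)
            exit(0);   /* runs atexit handlers, as mons_accept does today */
        _exit(0);      /* terminates immediately, handlers are skipped */
    }

    close(fds[1]);
    /* read() returns 0 (end of file) if the child exited without writing. */
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

Here `handler_ran_in_child(1)` exercises the exit() path and `handler_ran_in_child(0)` the _exit() path. Skipping handlers also skips stdio flushing and any cleanup the libraries expect, so this is a trade-off rather than a fix for the underlying problem.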

painless-security / trust-router

Monitoring handler processes left sleeping #84