Unidata / LDM

The Unidata Local Data Manager (LDM) system includes network client and server programs designed for event-driven data distribution, and is the fundamental component of the Unidata Internet Data Distribution (IDD) system.

Waiting for the LDM server to terminate... #64

Closed mjames-upc closed 5 years ago

mjames-upc commented 6 years ago

This has been a long-standing problem with the LDM in the AWIPS project: attempting to stop the LDM results in this message being printed repeatedly, sometimes for 10 or more minutes.

[root@edextest ~]# ldmadmin stop
Stopping the LDM server...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...
Waiting for the LDM server to terminate...

semmerson commented 6 years ago

Can you duplicate the problem? If so, that should allow us to determine which processes are hung.

mjames-upc commented 6 years ago

Yeah, this is reproducible. I encounter it nearly every time the LDM is stopped after running for some amount of time on an EDEX server.

semmerson commented 6 years ago

What's the minimum amount of time?

What's the output of the command ps -ef | grep ^ldm while this is happening?

mjames-upc commented 6 years ago

Using ps -ef | grep ldm instead, since ldmd is run by user awips:

awips    10649     1  0 17:53 ?        00:00:00 ldmd -I -P 388 -M 256 -m 3600 -o 3600 -q /awips2/ldm/var/queues/ldm.pq /awips2/ldm/etc/ldmd.conf
awips    10652 10649  0 17:53 ?        00:00:00 edexBridge -vxl /awips2/ldm/logs/edexBridge.log -s localhost
root     11217  2915  0 17:56 pts/1    00:00:00 /bin/sh /sbin/service edex_ldm stop
root     11224 11217  0 17:56 pts/1    00:00:00 /bin/bash /etc/init.d/edex_ldm stop
root     11229 11224  0 17:56 pts/1    00:00:00 su awips -lc ldmadmin stop
awips    11230 11229  1 17:56 ?        00:00:00 /usr/bin/perl /awips2/ldm/bin/ldmadmin stop

mjames-upc commented 6 years ago

As for the minimum amount of time, I'd guess 20-30 seconds, but shutdowns of under a minute are the exception: more often I see 5-10 minute shutdown times with the message "Waiting for the LDM server to terminate".

semmerson commented 6 years ago

I think there's a misunderstanding. I would like to know how long the LDM must run before an ldmadmin stop takes a long time to execute -- not how long the ldmadmin stop takes to execute.

The ps(1) output shows the ldmadmin stop process, and the top-level LDM server process. The other processes are associated with the EDEX bridge. I suspect, therefore, that the problem lies with the EDEX bridge: those processes aren't responding quickly enough.
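
One way to check which children of the top-level LDM server are still alive during a slow stop is to filter the ps -ef output on the parent-PID column. The following is only a rough sketch: children_of is a hypothetical helper name, not part of the LDM, and 10649 is simply the ldmd PID from the listing above.

```shell
# children_of PID
# Hypothetical helper (not part of the LDM): reads `ps -ef` output on stdin
# and prints only the rows whose PPID (third column) equals PID.
children_of() {
    awk -v parent="$1" '$3 == parent'
}

# Example, run while `ldmadmin stop` is still waiting.
# 10649 is the ldmd PID from the ps output in this thread; substitute your own.
#   ps -ef | children_of 10649
```

Applied to the listing above, this would leave only the edexBridge row, since that is the only process whose parent is the ldmd server.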

sebenste commented 5 years ago

I can definitively say that this issue has been corrected in LDM 6.13.11. Give it a try.