ChrisHeitkamp closed this issue 8 years ago
Same issue here. Using Ubuntu 15.10, no SCOM management groups connected.
omsagent 3200 0.0 0.1 355952 10560 ? Sl 22:44 0:01 /opt/omi/bin/omiagent 10 13 --destdir / --providerdir /opt/omi/lib --idletimeout 90 --loglevel WARNING
omsagent 3428 0.0 0.0 0 0 ? Z 22:48 0:00 [python]
Reinstalled the OMS agent without success. Five minutes after installation, the first zombie processes show up. Is this related to omsagent being killed every 5 minutes?
Feb 22 22:48:50 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 22:53:52 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 22:58:55 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 23:03:57 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 23:08:59 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 23:14:02 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
Feb 22 23:19:04 donald systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
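To check whether the kills really follow a 5-minute schedule, the gap between consecutive KILL entries can be computed from the log. This is only a rough sketch: the function name kill_intervals is made up for illustration, it assumes syslog-style lines with the time in the third field (as in the excerpt above), and it does not handle a run that crosses midnight.

```shell
#!/bin/sh
# Hypothetical helper: print the seconds elapsed between consecutive
# "status=9/KILL" log entries read from stdin.
# Assumes the third whitespace-separated field is HH:MM:SS.
kill_intervals() {
    awk '/status=9\/KILL/ {
        split($3, t, ":")                      # t[1]=HH, t[2]=MM, t[3]=SS
        now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
        if (seen) print now - prev             # gap to the previous kill
        prev = now
        seen = 1
    }'
}
```

Feeding it the log excerpt above (one entry per line) should give gaps of roughly 300 seconds, which would match a timer firing every five minutes.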
It seems as if the zombie processes result from the client.sh script.
omsagent 4443 0.0 0.0 0 0 ? Z 23:03 0:00 [python]
Any update on this?
We're testing a new kit to resolve this problem. We'll loop back when we know we have the problem fixed and have something to post.
Thanks for your patience.
Hi @sjohner,
We have verified the resolution of this issue with a private build. The fix is included in the next release, and we are happy to distribute a private build to you if needed.
Since this issue is resolved, I'm going to go ahead and close it.
@sjohner: If you need a private build, let Anurag know.
Great, thank you guys!
@agup006 can you make any comment about the timeline of the next release?
Problem not resolved yet. I have installed Python 2.7.15 (the latest) and OMSAgent-1.4.1. Zombie processes owned by omsagent still pile up in the process table (RHEL 7.4).
I saw the same today. I put a script in /etc/cron.daily to restart the agent nightly and kill all of its processes. Stopping the omsagent service does not clean up all processes. My cron script looks like this:
#!/bin/bash
systemctl stop 'omsagent*'
pkill -u omsagent
systemctl start 'omsagent*'
I have Ansible in my environment, and I normally run the ad-hoc command 'kill $(ps -eo stat,ppid|grep -w Z|awk '{print $2}'|tr "\n" " ")' to clean up any dead processes. My concern is to fix this at the root cause, since I have already updated OMSAgent and its dependencies to newer versions.
Symptom: Every 5 minutes, two new python processes are added to the process list and never terminated. They all have the same parent PID.
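Note that the ad-hoc command above signals the parents of the zombies, not the zombies themselves: a zombie is already dead and only goes away once its parent reaps it or exits. Before killing anything, it can help to see which parent is leaking them. A minimal sketch, assuming a ps that accepts the POSIX "-o name=" form; the function name zombies_by_parent is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "STAT PPID" pairs from stdin and print
# "<parent pid> <zombie count>" for every parent with defunct children.
zombies_by_parent() {
    awk '$1 ~ /^Z/ { count[$2]++ }             # state starts with Z: zombie
         END { for (p in count) print p, count[p] }'
}

# Usage on a live system (header lines suppressed with "name="):
#   ps -e -o stat= -o ppid= | zombies_by_parent
```

If a single parent PID accounts for nearly all of the zombies, restarting that one process is a narrower workaround than killing everything owned by the user.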
omsagent 22076 21695 0 14:32 ? 00:00:00 [python]
....
omsagent 31358 21695 0 16:02 ? 00:00:00 [python]
omsagent 31360 21695 0 16:02 ? 00:00:00 [python]
omsagent 31862 21695 0 16:08 ? 00:00:00 [python]
omsagent 31864 21695 0 16:08 ? 00:00:00 [python]
omsagent 32401 21695 0 16:13 ? 00:00:00 [python]
omsagent 32403 21695 0 16:13 ? 00:00:00 [python]
...
[root@srv21 yum.repos.d]# ps -ef |grep omsagent | wc -l
69
[root@srv21 yum.repos.d]#
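The count above includes every line that mentions omsagent, including live processes and the grep itself. To count only the defunct ones, something like the following sketch could be used. The function name count_zombies_for_user is made up for illustration, and it assumes the user name is printed untruncated in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read "USER STAT" pairs from stdin and print how many
# processes of the given user are in the zombie (Z) state.
count_zombies_for_user() {
    # $1: user name, matched exactly against the first column
    awk -v user="$1" '$1 == user && $2 ~ /^Z/ { n++ }
                      END { print n + 0 }'     # n + 0 prints 0 when none match
}

# Usage on a live system:
#   ps -e -o user= -o stat= | count_zombies_for_user omsagent
```

Running this a few times, five minutes apart, would show whether the number grows by two per interval as described.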
Same parent PID: omsagent 21695 21663 0 14:28 ? 00:00:05 /opt/omi/bin/omiagent 11 14 --destdir / --providerdir /opt/omi/lib --idletimeout 90 --loglevel WARNING
I think the issue occurred after omsagent multi-homing with SCOM 2012 R2 was enabled. Restarting omsagent does not remediate this. The omiserver has not been restarted yet.
Error in /var/opt/omi/log/omiserver.log, logged at the same intervals:
2016/02/22 17:33:35 [21663,21663] WARNING: null(0): EventId=30131 Priority=WARNING wsman: authentication failed for user [opsuser]
Even if this is related I would assume it is not desirable to fill up the process list.