baweaver / psutil

Automatically exported from code.google.com/p/psutil

Monitoring Java Servers On Mac OS Lion Causes Monitored Server To Segfault #277

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

Running the following code to monitor a Java-based server on Mac OS X Lion 
causes the monitored server to segfault. I tried this with several 
industrial-grade servers (ActiveMQ, Tomcat, etc.), and each time the server 
crashed after several minutes when the script was run on a five-second loop. 

I want to be clear: it is not the Python script that fails. It is the Java 
server process being monitored that consistently segfaults. I tried with 
non-Java processes (Firefox, etc.) and did not observe the same behavior. 

{{{
#!/usr/bin/env python

import psutil
import sys

proc = None

# Find the server we are looking for
for ps in psutil.process_iter():
    #print ps.name
    try:
        if ps.name == "java":
            for cmd in ps.cmdline:
                if cmd.count("apache-activemq-5.4.2") > 0:
                    proc = ps
                    break
        if proc is not None:
            break
    except Exception:
        # Ignore processes that vanish or cannot be inspected
        pass

if not proc:
    print "SERVER NOT RUNNING..."
    sys.exit(1)

print " CPU:    {0:15.1f}%".format(proc.get_cpu_percent())
print " U Time: {0:15.1f}s".format(proc.get_cpu_times().user)
print " S Time: {0:15.1f}s".format(proc.get_cpu_times().system)
print " Memory: {0:15.1f}%".format(proc.get_memory_percent())
print " Threads:{0:13d}".format( proc.get_num_threads() )
print " Files:  {0:13d}".format( len(proc.get_open_files()) )
print " INET:   {0:13d}".format( len(proc.get_connections()) )

}}}

What is the expected output?

The service being monitored should continue to run.

What do you see instead?

Segmentation fault: 11

What version of psutil are you using? What Python version?

psutil 0.4.1
python 2.7

On what operating system? Is it 32bit or 64bit version?

Mac OS X Lion, 64-bit

Please provide any additional information below.

Original issue reported on code.google.com by shane.c....@gmail.com on 10 Jun 2012 at 3:16

GoogleCodeExporter commented 8 years ago
Hi Shane,

Since you can reproduce the problem reliably on your system, can you try 
narrowing the steps down to the smallest test case? For example, does the 
problem come from one specific call among those below? 

print " CPU:    {0:15.1f}%".format(proc.get_cpu_percent())
print " U Time: {0:15.1f}s".format(proc.get_cpu_times().user)
print " S Time: {0:15.1f}s".format(proc.get_cpu_times().system)
print " Memory: {0:15.1f}%".format(proc.get_memory_percent())
print " Threads:{0:13d}".format( proc.get_num_threads() )
print " Files:  {0:13d}".format( len(proc.get_open_files()) )
print " INET:   {0:13d}".format( len(proc.get_connections()) )

It would be very helpful to determine specifically which feature of psutil 
seems to be causing the problem for the Java process. If you are getting a 
HotSpot crash dump from the JVM, that would also be helpful to include here. 
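
For locating such a dump, here is a minimal sketch. It assumes the JVM wrote 
its fatal error log under the default name `hs_err_pid<pid>.log` in the 
server's working directory (the `-XX:ErrorFile` option can change that 
location). The helper name `find_crash_logs` is hypothetical and not part of 
psutil.

```python
import glob
import os


def find_crash_logs(directory):
    """Return paths of hs_err_pid*.log files in `directory`, newest first."""
    pattern = os.path.join(directory, "hs_err_pid*.log")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
```

Calling `find_crash_logs(".")` from the directory the server was started in 
should list any crash logs, with the most recent one first.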

Thanks

Original comment by jlo...@gmail.com on 10 Jun 2012 at 3:57

GoogleCodeExporter commented 8 years ago
Yes. I had actually been doing this in the background. I ran each of these 
calls individually for ~10 minutes and could not reproduce the segfault. 
Within minutes of running them all together again, the segfault came back. 
So it appears not to be a single call, but some combination of several. I 
will try different combinations and see what I can come up with. 
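
One way to test combinations systematically is sketched below. It assumes 
the psutil 0.4.1 method names used in the script above; `sample` and 
`run_subset` are hypothetical helper names, and `proc` is whatever process 
object the lookup loop found.

```python
import time

# Method names taken from the reproduction script (psutil 0.4.1 API).
ALL_CALLS = [
    "get_cpu_percent",
    "get_cpu_times",
    "get_memory_percent",
    "get_num_threads",
    "get_open_files",
    "get_connections",
]


def sample(proc, call_names):
    """Call only the named methods on proc and return {name: result}."""
    results = {}
    for name in call_names:
        results[name] = getattr(proc, name)()
    return results


def run_subset(proc, call_names, iterations, interval=5.0, sleep=time.sleep):
    """Repeat sample() `iterations` times, pausing `interval` seconds between."""
    history = []
    for i in range(iterations):
        history.append(sample(proc, call_names))
        if i < iterations - 1:
            sleep(interval)
    return history
```

For example, `run_subset(proc, ALL_CALLS[:3], 120)` would exercise only the 
first three calls for about ten minutes on the same five-second loop; halving 
the list after each run narrows down which calls need to be present.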

Original comment by shane.c....@gmail.com on 10 Jun 2012 at 4:20

GoogleCodeExporter commented 8 years ago
OK, I have reproduced it with this combination:

print " S Time: {0:15.1f}s".format(proc.get_cpu_times().system)
print " Memory: {0:15.1f}%".format(proc.get_memory_percent())
print " Threads:{0:13d}".format( proc.get_num_threads() )
print " Files:  {0:13d}".format( len(proc.get_open_files()) )

This was the smallest combination that triggered it. Is it possible that this 
is a timing issue, one that depends not on which calls we make but on how long 
we spend making them (i.e., the longer I spend working with the proc object, 
the greater the chance that the error will occur)? If so, I could reduce the 
amount of time by building the string and then printing it all at once, but I 
don't like the idea that the thing I am using to monitor my applications is 
the one that is murdering them :)

I will turn on debugging in the JVM and see if I can get more information there.

Original comment by shane.c....@gmail.com on 10 Jun 2012 at 5:37

GoogleCodeExporter commented 8 years ago
Any news about this?

Original comment by g.rodola on 24 Feb 2013 at 9:59

GoogleCodeExporter commented 8 years ago
psutil has been migrated from Google Code to Github (see: 
http://grodola.blogspot.com/2014/05/goodbye-google-code-im-moving-to-github.html
).
Please do NOT reply here but use this instead:
https://github.com/giampaolo/psutil/issues/277

Original comment by g.rodola on 26 May 2014 at 3:08