massdosage / citrine-scheduler

Java web application which can be used to configure, manage and monitor the running of various tasks
Apache License 2.0

Somehow accumulates open pipes without closing them. #2

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
We hit the "too many open files" limit running citrine on one of our servers
(ulimit -n = 1024). There seem to be open pipes just hanging around.

eg:

$ lsof | grep pipe | grep java | head
java      20874         mir    0r     FIFO                0,5           73871645 pipe
java      20874         mir    1w     FIFO                0,5           73871646 pipe
java      20874         mir    2w     FIFO                0,5           73871647 pipe
java      27822         mir   41w     FIFO                0,5           71105954 pipe
java      27822         mir   42w     FIFO                0,5           71107391 pipe
java      27822         mir   43w     FIFO                0,5           71103896 pipe
java      27822         mir   44w     FIFO                0,5           71105110 pipe
java      27822         mir   45w     FIFO                0,5           71111993 pipe
java      27822         mir   46w     FIFO                0,5           71113395 pipe
java      27822         mir   47w     FIFO                0,5           71108865 pipe

$ lsof | grep pipe | grep java | wc -l
963

$ ulimit -n
1024
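
To watch the count from inside the JVM instead of through lsof, something like the following might work. It is only a sketch under a few assumptions: the host is Linux with /proc mounted, the JVM is Java 7 or later (the server above runs Java 6, so it would need adapting there), and `PipeCounter` and `countOpenPipes` are made-up names, not part of citrine.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of citrine: counts pipe descriptors of a process on Linux.
class PipeCounter {

    /** Returns how many entries in /proc/<pid>/fd point at a pipe. */
    static long countOpenPipes(String pid) throws IOException {
        Path fdDir = Paths.get("/proc", pid, "fd");
        long pipes = 0;
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    // Pipe descriptors are symlinks whose target looks like "pipe:[inode]".
                    if (Files.readSymbolicLink(fd).toString().startsWith("pipe:")) {
                        pipes++;
                    }
                } catch (IOException e) {
                    // The descriptor was closed between listing and reading; skip it.
                }
            }
        }
        return pipes;
    }

    public static void main(String[] args) throws IOException {
        // Pass a pid as the first argument, or omit it to inspect this JVM ("self").
        String pid = args.length > 0 ? args[0] : "self";
        System.out.println(pid + ": " + countOpenPipes(pid) + " open pipes");
    }
}
```

Logging this number periodically from the web application would show whether the pipe count grows with every task run or only with particular jobs.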

$ ps x | grep java
27822 ?        Sl     4:23 /usr/lib/jvm/java-6-sun/bin/java -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/usr/share/tomcat/conf/logging.properties -Djava.endorsed.dirs=/usr/share/tomcat/endorsed -classpath :/usr/share/tomcat/bin/bootstrap.jar -Dcatalina.base=/usr/share/tomcat -Dcatalina.home=/usr/share/tomcat -Djava.io.tmpdir=/usr/share/tomcat/temp org.apache.catalina.startup.Bootstrap start
27901 ?        Z      0:00 [java] <defunct>
20874 ?        Sl     3:19 java -cp build/classes:lib/* fm.last.dbjobs.StrayTrackCorrector tmp/artist_stray_list tmp/updated_stray_tracks
26738 pts/24   S+     0:00 grep java

The defunct process is probably a red herring: there isn't one on another
server, but that server does seem to have the same problem. Its open file
limit is big enough that we've just never hit it. The stray track corrector
is a long-running job, but I don't think it is the cause. Looking at its
code, the Java side doesn't seem to do anything weird, and its bash start
script just does some standard awking.

Temporary solution: increase the ulimit.

On the other server:

$ lsof | grep pipe | grep java | wc -l
720

$ ulimit -n
65000
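
One possible cause, which nothing in this report confirms: a Java application that launches external commands gets streams for the child's stdin, stdout and stderr, and on Linux these are typically backed by pipes that stay open in the parent until the streams are closed. The sketch below shows the general pattern of closing them explicitly. `CommandRunner` and `runAndRelease` are hypothetical names for illustration, and whether citrine launches its tasks this way is an assumption.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not taken from citrine's source.
class CommandRunner {

    /** Runs a command, discards its output, and closes every stream to the child. */
    static int runAndRelease(String... command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one loop drains both
        Process process = builder.start();
        try {
            // We send no input, so close the child's stdin right away.
            process.getOutputStream().close();
            // Drain the output so the child cannot block on a full pipe buffer.
            InputStream out = process.getInputStream();
            byte[] buffer = new byte[4096];
            while (out.read(buffer) != -1) {
                // Output is discarded in this sketch.
            }
            return process.waitFor();
        } finally {
            // Release all three streams even if reading or waiting failed.
            closeQuietly(process.getInputStream());
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getOutputStream());
            process.destroy();
        }
    }

    private static void closeQuietly(Closeable stream) {
        try {
            stream.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }
}
```

A call such as `CommandRunner.runAndRelease("sh", "-c", "exit 3")` returns the child's exit code. If the pipe count in lsof still climbs with a pattern like this in place, the leak is more likely somewhere else.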

Original issue reported on code.google.com by massdosage on 7 Jan 2010 at 6:40

GoogleCodeExporter commented 9 years ago
Not enough information to reproduce.

Original comment by massdosage on 19 May 2011 at 2:07