alphaville / jaqpot3

A brand new version of jaqpot, fully OpenTox API (1.2) compliant (see http://opentox.ntua.gr), with lots of new features.

Server hangs on request flood #32

Closed: alphaville closed this issue 9 years ago

alphaville commented 9 years ago

The services hang when many requests are issued. I repeated the following and the whole web service hung!

task_uri=`curl -X POST -H subjectid:AQIC5wM2LY4Sfcy5t0Dn3DKUsKQiKF_hWCiVWuTwZa7svaQ.*AAJTSQACMDE.* \
-H Accept:text/uri-list -d dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/R545 \
-d prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/22200 \
http://enanomapper.ntua.gr:8080/algorithm/mlr`; 
curl $task_uri > /dev/null;
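
For reference, a rough way to reproduce the flood in one shot is sketched below; this is not from the original report. It reuses the endpoints above, while $TOKEN (a valid subjectid), the request count and the use of background jobs are illustrative assumptions.

    # Hypothetical flood sketch: fire several training requests concurrently.
    # $TOKEN stands in for a valid subjectid; 50 is an arbitrary count.
    for i in $(seq 1 50); do
      curl -s -X POST -H "subjectid:$TOKEN" -H Accept:text/uri-list \
           -d dataset_uri=http://apps.ideaconsult.net:8080/ambit2/dataset/R545 \
           -d prediction_feature=http://apps.ideaconsult.net:8080/ambit2/feature/22200 \
           http://enanomapper.ntua.gr:8080/algorithm/mlr &
    done
    wait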

After that I could hardly do anything; for example, the following request:

 curl http://enanomapper.ntua.gr:8080/task/ -H Accept:text/uri-list;

returned an empty response body and status code 200.

I had to restart jaqpot to get things working again. I suspect this has to do with the system configuration (e.g., the MySQL configuration is currently the default one). According to javamelody, it does not seem to be a memory management issue.

Additional information: At some point, after restarting jaqpot several times (from the Tomcat admin backend), I got the following error:

type Exception report

message

description The server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException: javax.management.RuntimeErrorException: java.lang.OutOfMemoryError: PermGen space
    org.apache.catalina.manager.StatusManagerServlet.doGet(StatusManagerServlet.java:305)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
    org.apache.catalina.filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:108)

root cause

javax.management.RuntimeErrorException: java.lang.OutOfMemoryError: PermGen space
    com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:841)
    com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:852)
    com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:651)
    com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    org.apache.catalina.manager.StatusTransformer.writeProcessorState(StatusTransformer.java:368)
    org.apache.catalina.manager.StatusTransformer.writeConnectorState(StatusTransformer.java:301)
    org.apache.catalina.manager.StatusManagerServlet.doGet(StatusManagerServlet.java:290)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
    org.apache.catalina.filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:108)

root cause

java.lang.OutOfMemoryError: PermGen space

note The full stack trace of the root cause is available in the Apache Tomcat/7.0.26 logs.

ToxOtis/C3P0/Application Logs: The logs are not very suggestive of what the problem may be. When hell breaks loose, there are no particular messages in the logs!

Java Configuration: I changed the Java configuration inside catalina.sh to the optimal parameters found here; still no success...

alphaville commented 9 years ago

The problem may be solved by modifying a parameter in c3p0.properties (c3p0.maxPoolSize=1000). I have tested it, but let's test some more to make sure...
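
For reference, the change amounts to a one-line entry in c3p0.properties; this is a minimal sketch, and only maxPoolSize was reported as changed.

    # c3p0.properties (sketch): allow up to 1000 pooled MySQL connections
    # instead of the small default pool size.
    c3p0.maxPoolSize=1000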

hampos commented 9 years ago

Consecutive redeployments without restarting the container will eventually lead to a PermGen space error; I don't believe this is related to jaqpot. In any case, by migrating to Java 8 we won't see PermGen space errors again, as PermGen is removed there (replaced by Metaspace, which is unbounded by default).

I am not sure how we handle our pooled connections, but if the system is flooded with requests and threads hold their connections while doing heavy CPU work or network traffic, it makes sense for the system to hang: new threads trying to accommodate new requests will find themselves blocked, waiting for connections to be released.
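
If that is what happens, a couple of standard c3p0 settings can turn the silent hang into something observable; the following is a hedged sketch of possible diagnostics, not something configured in this deployment.

    # Hypothetical c3p0.properties diagnostics (not part of the actual setup):
    # fail a connection checkout after 30 s instead of blocking forever,
    c3p0.checkoutTimeout=30000
    # and reclaim any connection held longer than 5 minutes, logging where
    # it was checked out.
    c3p0.unreturnedConnectionTimeout=300
    c3p0.debugUnreturnedConnectionStackTraces=true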

alphaville commented 9 years ago

I think the problem is now solved; I have tested model training adequately. As I said before, I set c3p0.maxPoolSize=1000 in c3p0.properties, and I also added the following JVM options:

-Xmx2048m
-Xms2048m
-Xmn788m
-Xss256k
-XX:+UseLargePages -XX:LargePageSizeInBytes=1m
-XX:ParallelGCThreads=4
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:SurvivorRatio=8
-XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=15
-XX:+UseBiasedLocking
-XX:+AggressiveOpts
-XX:CompileThreshold=1500
-XX:+UseFastAccessorMethods
-XX:MaxPermSize=128m
-Xverify:none
-Djava.net.preferIPv6Addresses=false
-Djava.net.preferIPv4Stack=true

in catalina.sh (one common way to pass these options is sketched below). We have also slightly increased -XX:MaxPermSize. Anyway, I am now closing the issue.
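
For reference, a sketch of one common way such options are passed to Tomcat, e.g. through CATALINA_OPTS in bin/setenv.sh or near the top of catalina.sh; the exact placement used in this deployment is not shown in the issue.

    # Sketch only: the flags above collected into CATALINA_OPTS (truncated).
    CATALINA_OPTS="$CATALINA_OPTS -Xmx2048m -Xms2048m -XX:MaxPermSize=128m ..."
    export CATALINA_OPTS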