testdriven / psi-probe

Advanced manager and monitor for Apache Tomcat, forked from Lambda Probe
8 stars 1 forks source link

Race condition between main thread and QuartzScheduler worker threads #404

Open padcom opened 9 years ago

padcom commented 9 years ago

From robertos...@gmail.com on April 09, 2014 10:14:14

What steps will reproduce the problem? 1. use JDK to 1.7.0_51 on CentOS6_64

  1. Set Tomcat to use the AJP13 NIO connector
  2. start Tomcat What is the expected result? normal startup What happens instead? Exception in probe.log. See attachment. What version of Probe are you using? 2.3.0 What environment (browser version, Tomcat version, JVM version, server OS)? CentOS 6.5_64 (latest version including the workaround OpenSSL interim package issued last night) Please provide any additional information below. (Attach logs or stack traces as files instead of pasting the contents here.) Tomcat server.xml AJP config:

    <Connector port="8009" protocol="org.apache.coyote.ajp.AjpNioProtocol" maxThreads="10000" connectionTimeout="600000" redirectPort="9443" />

JNDI Datasource definition (using context.xml config in conf/Catalina/localhost for an application):

<Resource name="jdbc/xxx" auth="Container" type="javax.sql.DataSource" maxActive="100" maxIdle="30" maxWait="10000" validationQuery = "select 1" testWhileIdle = "true" timeBetweenEvictionRunsMillis = "60000" removeAbandoned="true" removeAbandonedTimeout="240" logAbandoned="true" driverClassName="com.mysql.jdbc.Driver" username="xxx" password="yyy" url="jdbc:mysql://localhost:3306/xxx"/>

Attachment: probe2.3.0_crash_nio_with_jndi.txt

Original issue: http://code.google.com/p/psi-probe/issues/detail?id=404

padcom commented 9 years ago

From robertos...@gmail.com on April 09, 2014 01:16:39

It seems I reported a similliar issue a while back in #171

padcom commented 9 years ago

From robertos...@gmail.com on April 09, 2014 01:19:20

Updated to 2.3.3, error didn't occur again. Please close.

2.3.3 startup log: 09 Apr 2014 10:22:01 [pool-2-thread-2] INFO com.googlecode.psiprobe.ProbeServlet - FrameworkServlet 'probe': initialization started 09 Apr 2014 10:22:01 [pool-2-thread-2] INFO org.springframework.beans.factory.xml.XmlBeanDefinitionReader - Loading XML bean definitions from ServletContext resource [/WEB-INF/spring-probe-servlet.xml] 09 Apr 2014 10:22:01 [pool-2-thread-2] INFO org.springframework.beans.factory.xml.XmlBeanDefinitionReader - Loading XML bean definitions from ServletContext resource [/WEB-INF/spring-probe-resources.xml] 09 Apr 2014 10:22:01 [pool-2-thread-2] INFO org.springframework.beans.factory.xml.XmlBeanDefinitionReader - Loading XML bean definitions from ServletContext resource [/WEB-INF/spring-probe-stats.xml] 09 Apr 2014 10:22:01 [pool-2-thread-2] INFO org.springframework.beans.factory.xml.XmlBeanDefinitionReader - Loading XML bean definitions from ServletContext resource [/WEB-INF/spring-probe-controllers.xml] 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.springframework.beans.factory.config.PropertyPlaceholderConfigurer - Loading properties file from ServletContext resource [/WEB-INF/stats.properties] 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.springframework.beans.factory.config.PropertiesFactoryBean - Loading properties file from ServletContext resource [/WEB-INF/version.properties] 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.core.QuartzScheduler - Quartz Scheduler v.1.6.0 created. 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.simpl.RAMJobStore - RAMJobStore initialized. 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'scheduler' initialized from an externally provided properties instance. 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 1.6.0 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.core.QuartzScheduler - JobFactory set to: org.springframework.scheduling.quartz.AdaptableJobFactory@5255e17f 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.springframework.scheduling.quartz.SchedulerFactoryBean - Starting Quartz Scheduler now 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO org.quartz.core.QuartzScheduler - Scheduler scheduler_$_NON_CLUSTERED started. 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO com.googlecode.psiprobe.ProbeServlet - FrameworkServlet 'probe': initialization completed in 709 ms 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO com.googlecode.psiprobe.beans.ContainerWrapperBean - Server info: Apache Tomcat/7.0.23 09 Apr 2014 10:22:02 [pool-2-thread-2] INFO com.googlecode.psiprobe.beans.ContainerWrapperBean - Using com.googlecode.psiprobe.Tomcat70ContainerAdaptor 09 Apr 2014 10:22:30 [scheduler_Worker-2] INFO com.googlecode.psiprobe.beans.ContainerWrapperBean - Using DEFAULT resource resolver

padcom commented 9 years ago

From MALfunct...@gmail.com on April 11, 2014 13:32:08

Offending line from issue 171 : https://code.google.com/p/psi-probe/source/browse/tags/2.1.1/core/src/main/java/com/googlecode/psiprobe/beans/ContainerWrapperBean.java#152 Before this was fixed, the ContainerWrapperBean was not initialized immediately on deployment/startup. Issue 196 also reports a similar problem. In that case, the ContainerWrapperBean did not initialize because it could not detect the type of server it was deployed to.

Offending line from this issue: https://code.google.com/p/psi-probe/source/browse/tags/2.3.0/core/src/main/java/com/googlecode/psiprobe/beans/ContainerListenerBean.java#120 The collection tasks running in the scheduler_Worker threads depend on ContainerWrapperBean already being initialized. I think this issue is caused by a race condition between the thread that is initializing ContainerWrapperBean and the scheduler_Worker thread. You can tell when the ContainerWrapperBean finishes initializing by looking for the "Using com.googlecode.psiprobe.Tomcat70ContainerAdaptor" in the logs.

I reviewed changes between 2.3.0 and 2.3.3. No changes made appear to have affected the order these are executed in. Issue 342 may help address this, but I want to ensure it doesn't happen again. Perhaps I can delay the start of the QuartzScheduler until after ContainerWrapperBean has initialized. Both of those occur in the same thread.

Summary: Race condition between main thread and QuartzScheduler worker threads (was: NPE on Startup when using AJP NIO connector and JNDI on Tomcat 7.0.23)
Status: Accepted
Labels: -Milestone-TBD Milestone-2.4.0

padcom commented 9 years ago

From MALfunct...@gmail.com on April 11, 2014 16:34:04

Labels: Stacktrace