spring-projects / spring-loaded

Java agent that enables class reloading in a running JVM
Apache License 2.0
2.72k stars 515 forks source link

Multiple server threads hang in java.util.WeakHashMap.get() #194

Closed jonaswitt closed 6 years ago

jonaswitt commented 7 years ago

I have a Grails project using these depencies:

When running the Grails server in development mode (which has spring-loaded enabled), it often happens that after a few requests to the server, more and more worker threads are hanging at full CPU usage as seen in this jstack trace: spring-loaded-hang-jtrace.txt - notice the threads whose last lines are

"qtp1426638617-60" #60 prio=5 os_prio=31 tid=0x00007fb1851ac000 nid=0xa503 runnable [0x000070000de48000]
   java.lang.Thread.State: RUNNABLE
    at java.util.WeakHashMap.get(WeakHashMap.java:403)
    at org.springsource.loaded.ri.ReflectiveInterceptor.getRType(ReflectiveInterceptor.java:929)
    at org.springsource.loaded.ri.ReflectiveInterceptor.jlClassGetConstructor(ReflectiveInterceptor.java:522)
    at org.codehaus.groovy.reflection.ClassLoaderForClassArtifacts.defineClassAndGetConstructor(ClassLoaderForClassArtifacts.java:83)
    at org.codehaus.groovy.runtime.callsite.CallSiteGenerator.compileStaticMethod(CallSiteGenerator.java:246)
    at org.codehaus.groovy.reflection.CachedMethod.createStaticMetaMethodSite(CachedMethod.java:290)
    at org.codehaus.groovy.runtime.callsite.StaticMetaMethodSite.createStaticMetaMethodSite(StaticMetaMethodSite.java:114)
    at groovy.lang.MetaClassImpl.createStaticSite(MetaClassImpl.java:3383)

(Re-running jstack confirms that each hanging thread is still in the very same location, although I cannot determine at which level of that stack trace looping occurs.)

This happens without changing any source files (i.e. I think it's not related to spring-loaded actually reloading some classes). It also never happens in production mode (where spring-loaded is disabled). In development mode, it consistently happens across 6 dev machines (Mac OS + Linux), in approx. 20% of all dev server starts. Once the dev server makes it beyond some initial requests unharmed (i.e. when all worker threads have been used at least once...?), this problem does not happen.

It looks to me like there is probably at least some interplay between Groovy and spring-loaded that may be causing this. Has anyone else seen this? Can anyone with deeper knowledge of spring-loaded read more from the stack trace than I can? Any help is appreciated - with frequent dev server restarts during my work day, this problem quickly becomes very annoying... :-/

irstevenson commented 7 years ago

I'm seeing the same, but slightly different stack:

Pretty sure I've not seen it in production, but becoming more frequent on my DEV machine. Just now it was in two Quartz Worker Threads that were executing SpringBatch jobs.

zhuravskiy commented 7 years ago

Same problem. on grails 3.2.8 with springloaded-1.2.7.RELEASE

part of jvm args

-Dspringloaded=inclusions=grails.plugins..*;synchronize=true;allowSplitPackages=true;cacheDir=/home/vitaliy/IdeaProjects/project/build/springloaded  -javaagent:/home/vitaliy/.gradle/caches/modules-2/files-2.1/org.springframework/springloaded/1.2.7.RELEASE/83d209e87380fd30a294cefd458ab6170722e80c/springloaded-1.2.7.RELEASE.jar

i disable grails quartz plugin in dev environment with hope to resolve problem

alexmsu75 commented 7 years ago

I'm having the same problem with JDK 1.8.141 and springloaded 1.2.5.RELEASE (or 1.2.7.RELEASE), it happens almost every time when I run web app in tomcat8 from Eclipse 4.7 IDE

davydotcom commented 7 years ago

happens often in threads named http-nio-8080-exec-10 in gsp view rendering tired to same springloaded interceptor

jonaswitt commented 7 years ago

Can you elaborate further? Where in https://github.com/apache/groovy/blob/183b8fbf0248a2dceffbba684e50c3c2c060e46c/src/main/org/codehaus/groovy/util/LockableObject.java do you suspect the problem?

Groovy's issue tracker does not seem to have a related issue yet (https://issues.apache.org/jira/browse/GROOVY-5249?jql=project%20%3D%20GROOVY%20AND%20text%20~%20LockableObject), but with some more details we could file one.

alexmsu75 commented 7 years ago

I don't think that this related to groovy as the application that I'm dealing with does not use it, unless eclipse injects it during startup debugger for tomcat

davydotcom commented 7 years ago

Digging further posting information I find here.

In the ReflectiveInterceptor a call is made to get a value from a WeakHashMap that is not synchronized. This causes an endless loop and full CPU [https://github.com/spring-projects/spring-loaded/blob/master/springloaded/src/main/java/org/springsource/loaded/ri/ReflectiveInterceptor.java#L929](ReflectiveInterceptor Line 929)

We can find information on why that is here: http://www.adam-bien.com/roller/abien/entry/endless_loops_in_unsychronized_weakhashmap

The trick is trying to figure out why this is getting concurrently accessed. My stack trace was a little longer than the one provided above and I noticed it went through the Groovy core LockableObject class which I potentially see a bug in. So it may be possible that this LockableObject is not causing a synchronized lock or its happening farther down the line within the ReflectiveInterceptor. However, this is a Static Method calling a static property implying it probably should be synchronized down within the ReflectiveInterceptor itself.

davydotcom commented 7 years ago

@alexmsu75 I think it is definitely a bug in ReflectiveInterceptor upon further digging. See previous comment and I think it makes sense as to whats going on

davydotcom commented 7 years ago

Fix in PR #202

aclement commented 6 years ago

think this needs closing

davydotcom commented 6 years ago

Yay!!

demon101 commented 6 years ago

Have the same problem with Grails 3.2.11