eclipse-ee4j / glassfish

Eclipse GlassFish
https://eclipse-ee4j.github.io/glassfish/

Glassfish eats all CPU during OSGi Bundle deployment #12938

Closed glassfishrobot closed 14 years ago

glassfishrobot commented 14 years ago

Hi *,

in promoted build 14 I encountered a problem when deploying a bunch of bundles at once to glassfish using the directory .../autodeploy/bundles/.

The problem is that GlassFish begins to deploy but then apparently gets stuck, saturating all four cores of my AMD64 machine:

Tasks: 279 total,   1 running, 275 sleeping,   0 stopped,   3 zombie
Cpu(s): 99.3%us,  0.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8061556k total,  6738908k used,  1322648k free,    20936k buffers
Swap: 11880032k total,   204344k used, 11675688k free,  1467324k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26972 chaoslay  20   0 2747m 1.1g 5144 S  392 14.5  2223:33  java

This happened at server startup but also occurred in another form (eating "just" 2 cores during hot-deployment).

I'll attach a server.log containing the complete startup (taken from the NetBeans server log view, not the original file, as that hasn't been written by GlassFish). Additionally, I'll attach a short strace -f of the GlassFish java process taken during its hang.

Environment

Operating System: Linux
Platform: Linux

Affected Versions

[3.1]

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4662) strace -f from glassfish java process

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4663) Glassfish log formatted by NetBeans

glassfishrobot commented 14 years ago

@glassfishrobot Commented @honghzzhang said: assign to sahoo for initial evaluation

glassfishrobot commented 14 years ago

@glassfishrobot Commented ss141213 said: Can you please supply a test case? Thanks ahead.

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Oh well, I'll try to reproduce that with minimal effort, but currently I have 33 bundles in there, and the WAB being deployed depends on almost all of the others. Additionally, this is apparently not an open source project, so I cannot make all of those things available. Altogether they are more than 59 MiB in size.

But on the other hand, I just ran some tests: I raised the PermGen size from my initial value of 256m to 512m and, guess what, the deployment problem and the high CPU load were completely gone. As a cross-check I also reduced the size to 128m, expecting a hang much sooner or a hard OutOfMemoryError, but nothing happened. It deployed just fine.
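
Since the PermGen setting did not correlate with the hang, it can help to check how full the non-heap pools actually are before changing sizes again. The following is a minimal sketch using the standard java.lang.management API; the class name NonHeapReport is hypothetical, and the pool names it prints are JVM-specific (on JVMs of that era one of them typically contains "Perm Gen", on Java 8 and later "Metaspace"). It reports on the JVM it runs in, so it would have to be executed inside the server process or adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of GlassFish: lists non-heap memory pools of the current JVM.
class NonHeapReport {

    /** Maps each non-heap pool name to {used, max} in bytes; max is -1 when undefined. */
    static Map<String, long[]> nonHeapPools() {
        Map<String, long[]> pools = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // heap pools are not relevant to a PermGen/Metaspace check
            }
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                pools.put(pool.getName(), new long[] {usage.getUsed(), usage.getMax()});
            }
        }
        return pools;
    }

    public static void main(String[] args) {
        nonHeapPools().forEach((name, u) ->
            System.out.printf("%-30s used=%d KiB max=%s%n",
                name, u[0] / 1024, u[1] < 0 ? "undefined" : (u[1] / 1024) + " KiB"));
    }
}
```

If the class-metadata pool stays well below its maximum while the CPU is saturated, memory pressure is an unlikely cause and a thread dump is the more useful next step.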

So now I have no clue what really triggered this issue. But I guess it has something to do with the deployment order, because now the affected WAB gets deployed much earlier in the server startup.

So I just issued a redeploy (touch) of the WAB and there we have it again. There is no real OutOfMemoryError, but it takes around 30 minutes to deploy a web application that usually deploys in 2-3 seconds. In addition, the CPU load at first saturates only one core but then grows over time until all my cores are eaten up.

In the end I think this could be just an issue on my side not paying much attention to the OSGi classloader principles yet (we really need some guidelines/best-practices here with a highly modular/dynamic enterprise application).

Btw. I'll see if I can build a much smaller WAB that triggers this issue and can be used as a test case.

glassfishrobot commented 14 years ago

@glassfishrobot Commented ss141213 said: I think you should change the GlassFish log level from FINE/FINER/FINEST to INFO, since you now suspect an OSGi issue here. Moreover, instead of providing strace.log, send a Java thread dump (jstack) taken when you see all your CPUs occupied. I'm including Richard, who may have better suggestions.
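
A thread dump shows stacks but not which threads are burning CPU. As a complement, here is a minimal sketch that ranks the threads of the current JVM by consumed CPU time using the standard ThreadMXBean API. The class name BusyThreads and its Row type are hypothetical, and unlike jstack this only inspects the JVM it runs in, so it would need to run inside the server or be adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: ranks live threads of the current JVM by CPU time consumed so far.
class BusyThreads {

    static final class Row {
        final String name;
        final Thread.State state;
        final long cpuNanos;

        Row(String name, Thread.State state, long cpuNanos) {
            this.name = name;
            this.state = state;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to {@code limit} threads, highest CPU time first; empty if unsupported. */
    static List<Row> topByCpu(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows;
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has already died
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread has already died
            if (info != null && cpu >= 0) {
                rows.add(new Row(info.getThreadName(), info.getThreadState(), cpu));
            }
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.cpuNanos).reversed());
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    public static void main(String[] args) {
        for (Row r : topByCpu(5)) {
            System.out.printf("%-40s %-13s %d ms%n", r.name, r.state, r.cpuNanos / 1_000_000);
        }
    }
}
```

Threads that stay RUNNABLE and keep accumulating CPU time between two samples are the ones whose stacks are worth reading first in the dump.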

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4669) Thread dump from glassfish (only 1 core at 100%)

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4670) Thread dump from glassfish (2 cores at 100%)

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Changing the glassfish log level didn't change anything in behavior.

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4671) Thread dump from glassfish (3 cores at 100%)

glassfishrobot commented 14 years ago

@glassfishrobot Commented ss141213 said: Jerome,

It seems the annotation parser in hk2 has a bug that leads to an infinite loop. Please look into this issue. Is it possible to configure this scanning to be single-threaded, or to make use of synchronized collections?

Thanks, Sahoo
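
To illustrate the "synchronized collections" suggestion: the thread does not show the hk2 parser code, so the following is only a hypothetical sketch of the general pattern, with made-up names (SharedIndexDemo, scan, the "jar-N/ClassM" keys). Several worker threads record entries into one shared map. With a plain HashMap this kind of concurrent mutation is unsafe and, on JVMs of that era, was a known way to end up with threads spinning forever inside the map; wrapping the map with Collections.synchronizedMap (or using ConcurrentHashMap) makes each put safe.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not the hk2 annotation parser.
class SharedIndexDemo {

    /** Simulates scanning {@code jars} archives in parallel into one shared index. */
    static Map<String, String> scan(int jars, int classesPerJar) {
        // The wrapper serializes every put(); a bare HashMap here would be a data race.
        Map<String, String> index = Collections.synchronizedMap(new HashMap<>());
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int j = 0; j < jars; j++) {
            final String jar = "jar-" + j;
            pool.submit(() -> {
                for (int c = 0; c < classesPerJar; c++) {
                    index.put(jar + "/Class" + c, jar);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("scan did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scan interrupted", e);
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println("indexed entries: " + scan(8, 1000).size());
    }
}
```

The single-threaded alternative from the same comment corresponds to replacing the pool with Executors.newSingleThreadExecutor(), which trades scan speed for not needing any synchronization on the index.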

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4675) Thread dump from glassfish b15 (2 cores at 100%)

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: ^^ Also verified the same behavior with b15.

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Created an attachment (id=4682) Web Application that produces such a CPU load

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: ^^ Attached a test case WAB (~ 6.0 MiB). Sometimes the deployment gets stuck when glassfish comes up and sometimes only after one or two "touches" of that bundle.

glassfishrobot commented 14 years ago

@glassfishrobot Commented ss141213 said: Re-categorising since this is a generic deployment issue. Marking the issue started as Jerome has started to investigate this.

glassfishrobot commented 14 years ago

@glassfishrobot Commented hwellmann said: I have the same kind of problem on 3.1-b15 with a plain old WAR (no OSGi headers), using asadmin deploy.

glassfishrobot commented 14 years ago

@glassfishrobot Commented dochez said: Some threading issues were uncovered when parsing multiple jar files simultaneously. Fixed all discovered issues; the fix will be available in b16.

glassfishrobot commented 14 years ago

@glassfishrobot Commented chaoslayer@java.net said: Thanks for fixing this issue.

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: gf-strace.log Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: jstack.txt Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: jstack_2.txt Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: jstack_2_b15.txt Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: jstack_3.txt Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: server.netbeans.log Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: test.webapp-0.0.1-SNAPSHOT.war Attached By: chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented Was assigned to dochez

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA GLASSFISH-12938

glassfishrobot commented 14 years ago

@glassfishrobot Commented Reported by chaoslayer@java.net

glassfishrobot commented 14 years ago

@glassfishrobot Commented Marked as fixed on Wednesday, August 18th 2010, 10:01:46 am