Closed timja closed 2 years ago
With Jenkins 2.204.6, Oracle JDK 8 and the workflow-cps-plugin version 2.83 the problem does not occur.
I tested this with both official docker images: jenkins/jenkins:jdk11 and jenkins/jenkins:2.259-slim (jdk 8).
While the problem of shoelzle also occures on jenkins:jdk11 (with nothing else installed than workflow-job and workflow-cps), it does not occure on the jdk 8 image (jenkins:2.259-slim).
Same behavior with AdoptOpenJDK13 and Oracle JDK 11. Also changing the GC via -XX:+UseG1GC or -XX:+UseParallelGC didn't solve the problem.
Can some take care of this issue? Seems to be a serious bug when not using java 8.
Due to this issue we need to restart our jenksin master instance every couple of days.
Just quick update. The root cause of memory leak was :
The issue was fixed by
CC jglick markewaite. This could potentially be a blocker.
Tried to reproduce using MemoryAssert.assertGC without success. If reproducible, someone would need to look at a heap dump and see what is actually being held, and why https://github.com/jenkinsci/workflow-cps-plugin/blob/52d3de6044d0d3373ce513a5ff2f1de15d572d5d/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java#L1281-L1436 does not take care of it.
I did a clean installation of Jenkins with nothing but the suggested plugins and ran the Pipeline script provided by the reporter. I confirmed (by increasing logging) that https://github.com/jenkinsci/workflow-cps-plugin/blob/52d3de6044d0d3373ce513a5ff2f1de15d572d5d/src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java#L1281-L1436 found the asdf class as expected. I then took a heap dump and searched for the asdf class in the heap dump. I found asdf in the list of objects, with a count of 0 and a retained size of 0 bytes. Based on the above, I conclude the original issue as submitted by the reporter is no longer valid. I am closing this bug as Cannot Reproduce.
basil,
Please note that the Metaspace is not part of the Java Heap Memory.
And It's not the Java Heap that grows from that Memory leak, but the Metaspace, more specifically the CompressedClassSpace part of the Metaspace.
So it's plausible for me that you don't see any space blocked in the Heap Memory.
The issue ist still present in Jenkins 2.344 run in a Java 11 JVM.
Thank you shoelzle for the clarification. I can now reproduce the problem. I created a new Pipeline job with
class asdf { void run () {} } new asdf().run()
When running the above Pipeline under Java 8 with -verbose:class:
[Loaded asdf from file:/groovy/shell] [Loaded WorkflowScript from file:/groovy/shell] […] // jcmd ${PID} GC.run [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor33 0x00000007c0878428] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor32 0x00000007c0878028] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor31 0x00000007c0877c28] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor30 0x00000007c0877828] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor29 0x00000007c0877428] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor28 0x00000007c0877028] [Unloading class WorkflowScript 0x00000007c084b290] [Unloading class asdf 0x00000007c084b028] [Unloading class org.eclipse.jetty.server.session.DefaultSessionCache$$Lambda$302/1809148827 0x00000007c07ec828] [Loaded sun.reflect.GeneratedConstructorAccessor84 from __JVM_DefineClass__]
When running the same Pipeline under Java 11 with -verbose:class:
[10.332s][info][class,load ] asdf source: file:/groovy/shell [10.337s][info][class,load ] WorkflowScript source: file:/groovy/shell […] // jcmd ${PID} GC.run [33.509s][info][class,load ] jdk.internal.reflect.GeneratedConstructorAccessor81 source: __JVM_DefineClass__ [38.551s][info][class,load ] jdk.internal.reflect.GeneratedConstructorAccessor82 source: __JVM_DefineClass__ [38.552s][info][class,load ] jdk.internal.reflect.GeneratedMethodAccessor27 source: __JVM_DefineClass__ [38.553s][info][class,load ] jdk.internal.reflect.GeneratedMethodAccessor28 source: __JVM_DefineClass__
It can clearly be seen that the asdf class is loaded in both cases but is only unloaded from the Metaspace in Java 8.
The problem appears to be resolved when running under Java 17:
[38.700s][info][class,load] asdf source: file:/groovy/shell [38.702s][info][class,load] WorkflowScript source: file:/groovy/shell […] // jcmd ${PID} GC.run [63.316s][info][class,load] jdk.internal.reflect.GeneratedMethodAccessor76 source: __ClassDefiner__ [63.317s][info][class,load] jdk.internal.reflect.GeneratedMethodAccessor77 source: __ClassDefiner__ [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor33 0x00000008013b4400 [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor32 0x00000008013b4000 [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor31 0x00000008013b3c00 [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor30 0x00000008013b3800 [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor29 0x00000008013b3400 [64.378s][info][class,unload] unloading class jdk.internal.reflect.GeneratedSerializationConstructorAccessor28 0x00000008013b3000 [64.378s][info][class,unload] unloading class WorkflowScript 0x0000000801394280 [64.378s][info][class,unload] unloading class asdf 0x0000000801394000
I ran the above test 10 times (all on Linux x86_64) and found the following results:
Temurin 1.8.0_332-b09 - PASS
AdoptOpenJDK 9.0.4+11 - FAIL
AdoptOpenJDK 10.0.2+13 - FAIL
Temurin 11.0.15+10 - FAIL
AdoptOpenJDK 12.0.2+10 - FAIL
AdoptOpenJDK 13.0.2+8 - FAIL
AdoptOpenJDK 14.0.2+12 - FAIL
AdoptOpenJDK 15.0.2+7 - FAIL
Temurin 16.0.2+7 - PASS
Temurin 17.0.3+7 - PASS
So Java versions 9 through 15 are affected.
The last failing commit is openjdk/jdk@42a6eadb853. The first passing commit is openjdk/jdk@2ee2b4ae19d. I confirmed that backporting openjdk/jdk@2ee2b4ae19d to jdk11u-dev resolves the problem on Java 11.
The upstream bug is JDK-8231454. As described in the upstream bug, a memory memory leak can occur when java.beans.Introspector#getBeanInfo is invoked against a class, even if java.beans.Introspector#flushFromCaches is later called on that class.
In our case, java.beans.Introspector#getBeanInfo is invoked from the following stack trace:
getBeanInfo:196, Introspector (java.beans) run:3328, MetaClassImpl$15 (groovy.lang) doPrivileged:-1, AccessController (java.security) addProperties:3326, MetaClassImpl (groovy.lang) initialize:3303, MetaClassImpl (groovy.lang) getMetaClassUnderLock:289, ClassInfo (org.codehaus.groovy.reflection) getMetaClass:331, ClassInfo (org.codehaus.groovy.reflection) getMetaClass:277, MetaClassRegistryImpl (org.codehaus.groovy.runtime.metaclass) getMetaClass:905, InvokerHelper (org.codehaus.groovy.runtime) createCallConstructorSite:86, CallSiteArray (org.codehaus.groovy.runtime.callsite) defaultCallConstructor:59, CallSiteArray (org.codehaus.groovy.runtime.callsite) callConstructor:238, AbstractCallSite (org.codehaus.groovy.runtime.callsite) call:208, Checker$3 (org.kohsuke.groovy.sandbox.impl) onNewInstance:42, GroovyInterceptor (org.kohsuke.groovy.sandbox) onNewInstance:173, SandboxInterceptor (org.jenkinsci.plugins.scriptsecurity.sandbox.groovy) call:205, Checker$3 (org.kohsuke.groovy.sandbox.impl) checkedConstructor:210, Checker (org.kohsuke.groovy.sandbox.impl) constructorCall:21, SandboxInvoker (com.cloudbees.groovy.cps.sandbox) dispatchOrArg:97, FunctionCallBlock$ContinuationImpl (com.cloudbees.groovy.cps.impl) fixName:78, FunctionCallBlock$ContinuationImpl (com.cloudbees.groovy.cps.impl) invoke0:-1, NativeMethodAccessorImpl (jdk.internal.reflect) invoke:62, NativeMethodAccessorImpl (jdk.internal.reflect) invoke:43, DelegatingMethodAccessorImpl (jdk.internal.reflect) invoke:566, Method (java.lang.reflect) receive:72, ContinuationPtr$ContinuationImpl (com.cloudbees.groovy.cps.impl) eval:21, ConstantBlock (com.cloudbees.groovy.cps.impl) step:83, Next (com.cloudbees.groovy.cps) call:174, Continuable$1 (com.cloudbees.groovy.cps) call:163, Continuable$1 (com.cloudbees.groovy.cps) use:136, GroovyCategorySupport$ThreadCategoryInfo (org.codehaus.groovy.runtime) use:275, GroovyCategorySupport (org.codehaus.groovy.runtime) run0:163, Continuable (com.cloudbees.groovy.cps) access$001:18, SandboxContinuable (org.jenkinsci.plugins.workflow.cps) run0:51, SandboxContinuable (org.jenkinsci.plugins.workflow.cps) runNextChunk:187, CpsThread (org.jenkinsci.plugins.workflow.cps) run:420, CpsThreadGroup (org.jenkinsci.plugins.workflow.cps) access$400:95, CpsThreadGroup (org.jenkinsci.plugins.workflow.cps) call:330, CpsThreadGroup$2 (org.jenkinsci.plugins.workflow.cps) call:294, CpsThreadGroup$2 (org.jenkinsci.plugins.workflow.cps) call:67, CpsVmExecutorService$2 (org.jenkinsci.plugins.workflow.cps) run:264, FutureTask (java.util.concurrent) run:139, SingleLaneExecutorService$1 (hudson.remoting) run:28, ContextResettingExecutorService$1 (jenkins.util) run:68, ImpersonatingExecutorService$1 (jenkins.security) call:515, Executors$RunnableAdapter (java.util.concurrent) run:264, FutureTask (java.util.concurrent) runWorker:1128, ThreadPoolExecutor (java.util.concurrent) run:628, ThreadPoolExecutor$Worker (java.util.concurrent) run:829, Thread (java.lang)
java.beans.Introspector#flushFromCaches is later called on that class, but to no avail:
cleanUpLoader:1300, CpsFlowExecution (org.jenkinsci.plugins.workflow.cps) cleanUpHeap:1269, CpsFlowExecution (org.jenkinsci.plugins.workflow.cps) run:464, CpsThreadGroup (org.jenkinsci.plugins.workflow.cps) access$400:95, CpsThreadGroup (org.jenkinsci.plugins.workflow.cps) call:330, CpsThreadGroup$2 (org.jenkinsci.plugins.workflow.cps) call:294, CpsThreadGroup$2 (org.jenkinsci.plugins.workflow.cps) call:67, CpsVmExecutorService$2 (org.jenkinsci.plugins.workflow.cps) run:264, FutureTask (java.util.concurrent) run:139, SingleLaneExecutorService$1 (hudson.remoting) run:28, ContextResettingExecutorService$1 (jenkins.util) run:68, ImpersonatingExecutorService$1 (jenkins.security) call:515, Executors$RunnableAdapter (java.util.concurrent) run:264, FutureTask (java.util.concurrent) runWorker:1128, ThreadPoolExecutor (java.util.concurrent) run:628, ThreadPoolExecutor$Worker (java.util.concurrent) run:829, Thread (java.lang)
jenkinsci/workflow-cps-plugin#543 has been released in 2705.v0449852ee36f and jenkinsci/script-security-plugin#420 has been released in 1175.v4b_d517d6db_f0, but upgrading to these releases is not sufficient: you also need jenkinsci/jenkins#6597 (planned for eventual release in weeklies and backport to LTS) or to start Jenkins with --add-opens java.desktop/com.sun.beans.introspect=ALL-UNNAMED in your Java options.
jochenafuerbacher nsewardep phwolf shoelzle Can you please test the short-term solution and confirm that the issue has been resolved? Thanks!
openjdk/jdk11u-dev#1106 and openjdk/jdk11u-dev#1103 have been proposed to backport the relevant upstream commits from OpenJDK 17 to OpenJDK 11.
[Originally duplicated by: JENKINS-66491]
[Epic: JENKINS-67688]
Hello,
If the following pipeline script is run as a pipeline job in Jenkins, The instantiated class is not collected by the garbage collector:
Even after a garbage collection is run using jcmd, the instantiated class is still there:
With one class or a few classes, this issue might not lead to any problems. But our pipelines have a lot of classes and we're encountering a steady increase in the MetaSpace and OutOfMemoryExceptions (CompressedClassSpace) after every 2 to 3 weeks. Additionally the Metaspace increases by several GBs per week to about 10 GB in 2 to 3 weeks.
Some more information about the environment:
Originally reported by shoelzle, imported from: Metaspace memory leak in Pipeline when running on Java 11 (but not 8 or 17)