jenkinsci / jenkins-test-harness

Unit test framework for Jenkins core and its plugins
https://www.jenkins.io/doc/developer/testing/
MIT License
109 stars 126 forks source link

`doSafeExit` from `RealJenkinsRule` could hang #846

Closed jglick closed 3 weeks ago

jglick commented 3 weeks ago

I noticed some flakes in a CloudBees CI test which just showed Jenkins failing to terminate within 60s, for no clear reason. I think what was happening was that Jenkins.doQuietDown was hanging, presumably because something was still running and would not quit; stopJenkins was probably being called in response to some test failure, but whatever that was, it was lost. I think stopJenkins ought to consistently shut down the controller cleanly and cap the amount of time it will wait for “quiet down” to complete. Note that this quiet down behavior seems to have been introduced in #281 and was then only temporarily suppressed in #544.

Test before fix:

…
2024-09-26 18:20:04.696+0000 [id=164]   WARNING o.j.r.u.AnonymousClassWarnings#warn: Attempt to (de-)serialize anonymous class org.jvnet.hudson.test.RealJenkinsRuleTest$3 in file:…/target/test-classes/; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
2024-09-26 18:20:04.751+0000 [id=168]   INFO    hudson.lifecycle.Lifecycle#onStop: Stopping Jenkins as requested by SYSTEM
Jenkins failed to stop within 60 seconds, attempting to kill the Jenkins process
Sep 26, 2024 2:21:04 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/webroot29664975752984931
Sep 26, 2024 2:21:04 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/pluginroot3963307286416352727
Sep 26, 2024 2:21:04 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/RealJenkinsRule5508716196234467648
Sep 26, 2024 2:21:04 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/j h2525127732088122978
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 64.03 s <<< FAILURE! -- in org.jvnet.hudson.test.RealJenkinsRuleTest
org.jvnet.hudson.test.RealJenkinsRuleTest.safeExit -- Time elapsed: 64.02 s <<< FAILURE!
java.lang.AssertionError: Jenkins failed to terminate within 60 seconds
    at org.jvnet.hudson.test.RealJenkinsRule.stopJenkins(RealJenkinsRule.java:978)
    at org.jvnet.hudson.test.RealJenkinsRule.then(RealJenkinsRule.java:720)
    at org.jvnet.hudson.test.RealJenkinsRule.then(RealJenkinsRule.java:709)
    at org.jvnet.hudson.test.RealJenkinsRuleTest.safeExit(RealJenkinsRuleTest.java:377)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at org.jvnet.hudson.test.RealJenkinsRule$1.evaluate(RealJenkinsRule.java:527)

After:

…
2024-09-26 18:29:26.503+0000 [id=164]   WARNING o.j.r.u.AnonymousClassWarnings#warn: Attempt to (de-)serialize anonymous class org.jvnet.hudson.test.RealJenkinsRuleTest$3 in file:…/target/test-classes/; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
2024-09-26 18:29:56.571+0000 [id=74]    INFO    hudson.lifecycle.Lifecycle#onStop: Stopping Jenkins as requested by RealJenkinsRule
2024-09-26 18:29:56.572+0000 [id=74]    INFO    hudson.lifecycle.Lifecycle#onStatusUpdate: Stopping Jenkins
2024-09-26 18:29:56.583+0000 [id=150]   INFO    hudson.model.Run#execute: test0 #1 aborted
java.lang.InterruptedException: sleep interrupted
    at java.base/java.lang.Thread.sleep0(Native Method)
    at java.base/java.lang.Thread.sleep(Thread.java:509)
    at org.jvnet.hudson.test.RealJenkinsRuleTest$3.perform(RealJenkinsRuleTest.java:385)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:818)
    at hudson.model.Build$BuildExecution.build(Build.java:199)
    at hudson.model.Build$BuildExecution.doRun(Build.java:164)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:527)
    at hudson.model.Run.execute(Run.java:1908)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
    at hudson.model.ResourceController.execute(ResourceController.java:101)
    at hudson.model.Executor.run(Executor.java:445)
2024-09-26 18:29:56.598+0000 [id=74]    INFO    hudson.lifecycle.Lifecycle#onStatusUpdate: Jenkins stopped
2024-09-26 18:29:56.651+0000 [id=66]    INFO    winstone.Logger#logInternal: JVM is terminating. Shutting down Jetty
2024-09-26 18:29:56.652+0000 [id=66]    INFO    org.eclipse.jetty.server.Server#doStop: Stopped oejs.Server@62010f5c{STOPPING}[12.0.13,sto=0]
2024-09-26 18:29:56.656+0000 [id=150]   WARNING jenkins.util.Listeners#lambda$notify$0
java.lang.IllegalStateException: Jenkins.instance is missing. Read the documentation of Jenkins.getInstanceOrNull to see what you are doing wrong.
    at jenkins.model.Jenkins.get(Jenkins.java:843)
    at jenkins.triggers.ReverseBuildTrigger$RunListenerImpl.calculateCache(ReverseBuildTrigger.java:250)
    at jenkins.triggers.ReverseBuildTrigger$RunListenerImpl.onCompleted(ReverseBuildTrigger.java:276)
    at hudson.model.listeners.RunListener.lambda$fireCompleted$0(RunListener.java:223)
    at jenkins.util.Listeners.lambda$notify$0(Listeners.java:59)
    at jenkins.util.Listeners.notify(Listeners.java:67)
    at hudson.model.listeners.RunListener.fireCompleted(RunListener.java:221)
    at hudson.model.Run.execute(Run.java:1949)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
    at hudson.model.ResourceController.execute(ResourceController.java:101)
    at hudson.model.Executor.run(Executor.java:445)
2024-09-26 18:29:56.661+0000 [id=66]    INFO    o.e.j.server.AbstractConnector#doStop: Stopped ServerConnector@238acd0b{HTTP/1.1, (http/1.1)}{127.0.0.1:0}
2024-09-26 18:29:56.670+0000 [id=66]    INFO    winstone.Logger#logInternal: Jetty shutdown successfully
Sep 26, 2024 2:29:57 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/RealJenkinsRule11363000916226251019
Sep 26, 2024 2:29:57 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/pluginroot4193062836092557992
Sep 26, 2024 2:29:57 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/webroot3169670065018125573
Sep 26, 2024 2:29:57 PM org.jvnet.hudson.test.TemporaryDirectoryAllocator dispose
INFO: deleting /tmp/j h17078518161949828534
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.58 s -- in org.jvnet.hudson.test.RealJenkinsRuleTest