timja / jenkins-gh-issues-poc-06-18

[JENKINS-66091] Parameterized Trigger intermittently hangs on wait() call from AsyncFutureImpl.java:79 #1584

Open timja opened 3 years ago

timja commented 3 years ago

We use the Parameterized Trigger plugin to launch a downstream Pipeline job (child) from an upstream Freestyle job (parent). The parent is configured to launch the child on the same single-executor node as the parent. Intermittently, the parent will hang waiting for the child to return control to it after the child completes.

If we access /threadDump for our Jenkins server while a hang is in progress, we see a stack trace like the following for the hanging executor thread:

"Executor #0 for iibdev-x-rhel-hur-fyre-discard-87671 : executing ib000_Linux_PPCLE_Packager #2489" Id=101251 Group=main WAITING on hudson.model.queue.FutureImpl@278f4262
 at java.lang.Object.wait(Native Method)
 - waiting on hudson.model.queue.FutureImpl@278f4262
 at java.lang.Object.wait(Object.java:502)
 at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:79)
 at hudson.plugins.parameterizedtrigger.TriggerBuilder.perform(TriggerBuilder.java:155)
 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:803)
 at hudson.model.Build$BuildExecution.build(Build.java:197)
 at hudson.model.Build$BuildExecution.doRun(Build.java:163)
 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:513)
 at hudson.model.Run.execute(Run.java:1907)
 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
 at hudson.model.ResourceController.execute(ResourceController.java:97)
 at hudson.model.Executor.run(Executor.java:429)

The problem seems to be that the untimed wait() call at AsyncFutureImpl.java:79 (hudson.remoting.AsyncFutureImpl, shipped with Jenkins) never returns, so control never gets back to TriggerBuilder.java:155 (Parameterized Trigger plugin).
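To illustrate the difference between an untimed and a bounded wait on a future, here is a minimal sketch using only java.util.concurrent. It is not code from Jenkins or from the Parameterized Trigger plugin; the class name BoundedWaitSketch and the helper awaitResult are hypothetical, and a CompletableFuture that is never completed stands in for the child build's future. It only shows how a timed get() surfaces a future that never completes instead of parking the thread forever. It does not explain why the future is never completed.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: plain JDK futures, not Jenkins classes.
final class BoundedWaitSketch {

    // Hypothetical helper: waits at most timeoutMillis for the future.
    // Returns Optional.empty() on timeout (or if the result is null)
    // rather than blocking indefinitely like an untimed get().
    static <T> Optional<T> awaitResult(Future<T> future, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a downstream build whose completion is never signalled.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        System.out.println(awaitResult(neverCompleted, 100).isPresent());

        // Stand-in for a downstream build that finished normally.
        CompletableFuture<String> completed = CompletableFuture.completedFuture("SUCCESS");
        System.out.println(awaitResult(completed, 100).get());
    }
}
```

Calling get() with no timeout on the first future would block the calling thread in the same WAITING state shown in the thread dump above. With the timeout, the caller can at least log the condition and decide whether to keep waiting.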

Bizarrely, the hang persists even after the node running the parent has gone offline.

The problem has been reproduced on various Linux and Windows nodes: Red Hat Enterprise Linux 7.9 (x86_64, s390x, ppc64le) and Windows 2012 R2 (x86_64).

The problem is highly intermittent and difficult to reproduce: it has taken several weeks' worth of jobs running every minute of the day to reproduce the issue a handful of times. The problem has been reproduced with the following job pairing (assume all configuration options except those mentioned below are left at their defaults):


Originally reported by nan, imported from: Parameterized Trigger intermittently hangs on wait() call from AsyncFutureImpl.java:79
  • status: Open
  • priority: Major
  • resolution: Unresolved
  • imported: 2022/01/10
timja commented 3 years ago

nan:

Still happening with Jenkins v2.289.2.