timja / jenkins-gh-issues-poc-06-18


[JENKINS-29888] Jenkins stuck when concurrent builds of a same job due to log rotator #7422

Open timja opened 9 years ago

timja commented 9 years ago

Hello,

I use Jenkins with parallel builds to check out many packages.

Example:

parallel (
    { build("CheckOut-Package", PACKAGE_NAME: "Package1") },
    { build("CheckOut-Package", PACKAGE_NAME: "Package2") },
    ...
    { build("CheckOut-Package", PACKAGE_NAME: "Package10") }
)

Each of these jobs calls the log rotator when it finishes.
Sometimes the following log appears:

SEVERE: Executor threw an exception
java.util.NoSuchElementException
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:76)
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:63)
at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
at hudson.util.RunList.subList(RunList.java:139)
at hudson.tasks.LogRotator.perform(LogRotator.java:125)
at hudson.model.Job.logRotate(Job.java:467)
at hudson.model.Run.execute(Run.java:1808)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:374)

In this case, Jenkins does not detect that the job has finished, even though it has, and stays stuck.
Since the log rotator should not be a blocking task, I added an exception catch in this PR:
https://github.com/jenkinsci/jenkins/pull/1790

That was enough on my side to work around this frequently recurring problem.
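
For readers who want the idea without opening the PR, here is a minimal sketch of the approach, not the actual change made in Run.java. The class LogRotateGuard, the Cleanup interface, and the method runNonBlocking are hypothetical names used only for illustration. The point is that a failure in a post-build cleanup step gets logged instead of propagating up to the executor.

```java
import java.util.NoSuchElementException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only: none of these names exist in Jenkins core.
class LogRotateGuard {
    private static final Logger LOGGER = Logger.getLogger(LogRotateGuard.class.getName());

    /** Hypothetical stand-in for a post-build cleanup step such as log rotation. */
    interface Cleanup {
        void run() throws Exception;
    }

    /**
     * Runs the cleanup step and never rethrows.
     * Returns true if the step completed, false if it failed (the failure is logged).
     */
    static boolean runNonBlocking(Cleanup cleanup) {
        try {
            cleanup.run();
            return true;
        } catch (Exception e) {
            // Log and continue so the caller can still mark the build as finished.
            LOGGER.log(Level.SEVERE, "Failed to rotate log", e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate the failure from the stack trace above.
        boolean rotated = runNonBlocking(() -> { throw new NoSuchElementException(); });
        System.out.println("rotation completed: " + rotated);
    }
}
```

Catching the exception only hides the symptom: the rotation that failed is skipped for that build, and whatever makes the run list iteration fail is still there.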


Originally reported by totoman, imported from: Jenkins stuck when concurrent builds of a same job due to log rotator
  • status: Reopened
  • priority: Critical
  • resolution: Unresolved
  • imported: 2022/01/10
timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: tfennelly
Path:
.gitignore
changelog.html
cli/pom.xml
core/pom.xml
core/src/main/java/hudson/model/AbstractProject.java
core/src/main/java/hudson/model/Run.java
core/src/main/resources/hudson/widgets/HistoryWidget/index.jelly
core/src/main/resources/lib/form/confirm.js
core/src/main/resources/lib/form/select/select.js
core/src/main/resources/lib/layout/layout.jelly
plugins/pom.xml
pom.xml
test/pom.xml
test/src/test/groovy/hudson/model/AbstractProjectTest.groovy
war/pom.xml
http://jenkins-ci.org/commit/jenkins/95ca3da67d217c90d31819ec92e521e2072acd5a
Log:
Merge branch 'master' into plugin-manager-dependants

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Otmane TAZI
Path:
core/src/main/java/hudson/model/Run.java
http://jenkins-ci.org/commit/jenkins/1a5d57fdda3dba92543b8f486d3dce9cfb9c3811
Log:
[FIXED JENKINS-29888] Handling all exceptions returned by logRotator

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Oleg Nenashev
Path:
core/src/main/java/hudson/model/Run.java
http://jenkins-ci.org/commit/jenkins/876e8e20505b705d07b561b34863938872240369
Log:
Merge pull request #1790 from tototoman/master

[FIXED JENKINS-29888] - Handling all LogRotator exceptions

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Oleg Nenashev
Path:
changelog.html
http://jenkins-ci.org/commit/jenkins/a79e2f144a932fe0a97b524f79133d1ea5d0fef0
Log:
Update the changelog by new merges:

timja commented 8 years ago

thors:

The same issue occurred twice just today on our Jenkins server.
Version: 1.625.3 (LTS)

From jenkins log:
Jan 15, 2016 1:33:38 PM hudson.model.Executor finish1
SEVERE: Executor threw an exception
java.util.NoSuchElementException
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:76)
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:63)
at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
at hudson.util.RunList.subList(RunList.java:139)
at hudson.tasks.LogRotator.perform(LogRotator.java:125)
at hudson.model.Job.logRotate(Job.java:467)
at hudson.model.Run.execute(Run.java:1805)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:408)

Jan 15, 2016 11:27:15 AM hudson.model.Executor finish1
SEVERE: Executor threw an exception
java.util.NoSuchElementException
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:76)
at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:63)
at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
at hudson.util.RunList.subList(RunList.java:139)
at hudson.tasks.LogRotator.perform(LogRotator.java:125)
at hudson.model.Job.logRotate(Job.java:467)
at hudson.model.Run.execute(Run.java:1805)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:408)

timja commented 8 years ago

tib:

Happens somewhat frequently for us as well. Very annoying.
(Jenkins ver. 1.625.2)

timja commented 8 years ago

lcary:

For a long time, this issue made our CI cluster less reliable. We were on Jenkins 1.625.2, and our build flow jobs would often hang when running in parallel. This was a bad user experience that made our CI pipelines flaky, since it either caused timeouts or required us to abort the build.

This was fixed for my team by upgrading to the latest stable version of Jenkins core (1.651.1). I believe any version of Jenkins from before https://github.com/jenkinsci/jenkins/pull/1790/files was merged will have this bug, but upgrading to a newer Jenkins version (any version after 1.633) should make the exception non-fatal and thus address the issue.

timja commented 7 years ago

ptha:

Thanks lcary, upgrading from 1.587 to 1.643 resolved this for me.