timja opened 4 years ago
Hi, thanks for the report!
> A possible cause of this might be `AsyncResourceDisposer.worker` threads that run into errors due to insufficient file permissions for the user (`jenkins`) that runs the Jenkins daemon. It seems that these workers are unable to recover and skip the offending files and directories.
It is plausible, but it needs more investigation and analysis. One particular stack trace might be a red herring. If you have a test instance where you can reproduce the behavior, connecting a profiler (e.g. Java Flight Recorder) would be a good next step to collect details. Tools like jvmtop or the Monitoring plugin can also help collect per-thread diagnostics from the instance.
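If attaching a full profiler is not an option right away, the same kind of per-thread CPU accounting that jvmtop and the Monitoring plugin expose is also available from the JVM's `ThreadMXBean`. A minimal sketch, illustrative only and not tied to any Jenkins API; it has to run inside the JVM you want to inspect (e.g. adapted for the Script Console):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch: print the threads that have accumulated the most CPU time.
public class BusyThreads {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Arrays.stream(mx.getAllThreadIds())
              .boxed()
              // sort by accumulated CPU time (nanoseconds), busiest first
              .sorted(Comparator.comparingLong((Long id) -> mx.getThreadCpuTime(id)).reversed())
              .limit(10)
              .forEach(id -> {
                  ThreadInfo info = mx.getThreadInfo(id);
                  if (info != null) {
                      System.out.printf("%-60s %10d ms%n",
                              info.getThreadName(),
                              mx.getThreadCpuTime(id) / 1_000_000L);
                  }
              });
    }
}
```

If threads whose names point at `AsyncResourceDisposer` show up near the top of such a listing, that would support the theory above.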
> As a sidenote, I did not find the error in Jenkins's log files, which made it difficult to track down possible reasons
It is possible to go through the code referenced in the stack trace and discover what exactly suppresses the error.
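One low-effort way to make swallowed errors visible is to raise the `java.util.logging` level for the disposer's package, either via "Manage Jenkins → System Log" or programmatically. A minimal sketch, assuming the logger name `org.jenkinsci.plugins.resourcedisposer` matches the class in your stack trace (adjust it to whatever package actually appears there):

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch: capture FINE/FINER/FINEST records for the disposer so
// otherwise-suppressed errors show up in the log.
public class VerboseDisposerLogging {
    public static void main(String[] args) {
        // Logger name is an assumption; use the package from your stack trace.
        Logger logger = Logger.getLogger("org.jenkinsci.plugins.resourcedisposer");
        logger.setLevel(Level.FINEST);

        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINEST); // the handler must also accept low-level records
        logger.addHandler(handler);
    }
}
```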
Jenkins consumes vast amounts of CPU resources. E.g. on one machine, htop reports 7387 CPU hours over 103 days of uptime (103 days × 24 h ≈ 2472 wall-clock hours, so on average roughly 3 out of 8 cores are 100% utilized). Actually it seems to be worse:
A possible cause of this might be `AsyncResourceDisposer.worker` threads that run into errors due to insufficient file permissions for the user (`jenkins`) that runs the Jenkins daemon. It seems that these workers are unable to recover and skip the offending files and directories.
In our specific case the files were created by Docker containers, but it may have other causes as well, so it is probably a fairly common source of problems.
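To illustrate what I suspect is happening: a root-owned file left behind by a container makes the recursive delete throw, and if nothing catches that per entry, the whole disposal attempt fails and is retried. A hypothetical sketch of a deletion walk that skips unremovable entries instead of aborting (illustrative only, not the plugin's actual cleanup code):

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch: delete a workspace tree, logging and skipping entries the
// current user cannot remove (e.g. root-owned files from Docker containers)
// instead of letting one AccessDeniedException abort the whole disposal.
public class TolerantDelete {

    public static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                tryDelete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // e.g. unreadable entries: record and keep walking
                System.err.println("Cannot access " + file + ": " + exc);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
                tryDelete(dir);
                return FileVisitResult.CONTINUE;
            }

            private void tryDelete(Path p) {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    System.err.println("Skipping " + p + ": " + e);
                }
            }
        });
    }

    public static void main(String[] args) throws IOException {
        deleteRecursively(Paths.get(args[0]));
    }
}
```

In practice, chown-ing the leftover files or running the container as the `jenkins` user works around the symptom, but it does not explain the missing log output mentioned below.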
As a sidenote, I did not find the error in Jenkins's log files, which made it difficult to track down possible reasons for the poor Jenkins performance and the jobs that failed for no good reason.
Originally reported by alexanderv, imported from: "Extreme CPU utilization leading to bad performance and failing jobs"