This has been happening for a long time, across several versions of Jenkins and combinations of plug-ins. In this case: Jenkins 2.303.1 and version 1.11 of the swarm plug-in.
What Operating System are you using (both controller, and any agents involved in the problem)?
The controller runs Ubuntu 16; the agents run a variety of Ubuntu versions (16/18/20).
Reproduction steps
1. Set up a job which uses a swarm template.
2. Put Jenkins in shutdown mode.
3. Trigger the job. The job will be held, and the agent will start.
4. After 2-3 minutes, Jenkins takes the node down because it is idle.
5. Cancel shutdown mode.
Results
Expected result:
Build runs on agent.
Actual result:
Build never happens, and new builds can't run until it is manually cancelled.
Possible Solution
I made the following change to prevent calling done(c) while Jenkins is quieting down. With it, the agent is taken down and a new one is created; this repeats until Jenkins shutdown mode is cancelled:
```diff
diff --git a/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java b/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java
index 31e4878..41cc23d 100644
--- a/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java
+++ b/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java
@@ -19,6 +19,7 @@ import hudson.model.Executor;
 import hudson.model.ExecutorListener;
 import hudson.model.Queue;
 import hudson.slaves.RetentionStrategy;
+import jenkins.model.Jenkins;
 
 public class DockerSwarmAgentRetentionStrategy extends RetentionStrategy<DockerSwarmComputer>
         implements ExecutorListener {
@@ -45,7 +46,7 @@ public class DockerSwarmAgentRetentionStrategy extends RetentionStrategy<DockerS
             final long connectTime = System.currentTimeMillis() - c.getConnectTime();
             final long idleTime = System.currentTimeMillis() - c.getIdleStartMilliseconds();
             final boolean isTimeout = connectTime > timeout && idleTime > timeout;
-            if (isTimeout && (!isTaskAccepted || isTaskCompleted)) {
+            if (isTimeout && (!isTaskAccepted || isTaskCompleted) && !Jenkins.getInstance().isQuietingDown()) {
                 LOGGER.log(Level.INFO, "Disconnecting due to idle {0}", c.getName());
                 done(c);
             }
```
I don't know enough about the interactions with the caller, so this may not be the best solution.
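
To make the intent of the guard easier to review, here is a minimal standalone sketch of the same decision as a pure function. The class name `IdleDisconnectPolicy`, the method `shouldDisconnect`, and the 60-second timeout are hypothetical and exist only for illustration; they are not part of the plugin, and the real check lives in `DockerSwarmAgentRetentionStrategy` and reads its inputs from the computer object and from `Jenkins.isQuietingDown()`.

```java
/**
 * Standalone sketch of the idle-disconnect decision with a quiet-down guard.
 * All names here are hypothetical; this is not the plugin's actual code.
 */
final class IdleDisconnectPolicy {

    private IdleDisconnectPolicy() {
    }

    /**
     * Returns true only when the agent has been connected and idle for longer
     * than the timeout, has no pending or running task, and the controller is
     * not quieting down (shutdown mode).
     */
    static boolean shouldDisconnect(long connectedMillis, long idleMillis, long timeoutMillis,
                                    boolean taskAccepted, boolean taskCompleted,
                                    boolean quietingDown) {
        // Same timeout condition as in the patch above.
        final boolean isTimeout = connectedMillis > timeoutMillis && idleMillis > timeoutMillis;
        // Either no task was ever accepted, or the accepted task has finished.
        final boolean noPendingWork = !taskAccepted || taskCompleted;
        // The added guard: never disconnect while the controller is quieting down.
        return isTimeout && noPendingWork && !quietingDown;
    }

    public static void main(String[] args) {
        final long timeout = 60_000L;  // assumed timeout, for illustration only
        final long elapsed = 180_000L; // roughly the 2-3 minutes from the reproduction steps

        System.out.println("quieting down: "
                + shouldDisconnect(elapsed, elapsed, timeout, false, false, true));
        System.out.println("normal mode:   "
                + shouldDisconnect(elapsed, elapsed, timeout, false, false, false));
    }
}
```

In this sketch an idle agent that has not yet accepted a task is left alone while shutdown mode is active, and the usual idle timeout applies again once shutdown mode is cancelled.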