jenkinsci / docker-swarm-plugin

Jenkins plugin that allows adding a Docker Swarm as a cloud agent provider
https://plugins.jenkins.io/docker-swarm/
MIT License

Stuck setting up #48

Open rlogiacco opened 5 years ago

rlogiacco commented 5 years ago

I managed to install and (hopefully) configure the plugin: I can now see the Swarm Dashboard and all my nodes are listed, but that is where I'm stuck.

I tried using the label defined in the template (docker-slave) as the value for Pipeline Model Definition > Docker Label, but this puts all my builds on hold, waiting for an agt-shared_master_6-2 slave that never gets started:

Started by user admin
 > git rev-parse --is-inside-work-tree # timeout=10
Setting origin to http://vcs.example.com/Docker/shared.git
 > git config remote.origin.url http://vcs.example.com/Docker/shared.git # timeout=10
Fetching origin...
Fetching upstream changes from origin
 > git --version # timeout=10
 > git config --get remote.origin.url # timeout=10
using GIT_ASKPASS to set credentials Git username/password for http://vcs.example.com/Applications/eliza.git
Setting http proxy: 10.235.34.235:3129
 > git fetch --tags --force --progress -- origin +refs/heads/*:refs/remotes/origin/*
Seen branch in repository origin/master
Seen 1 remote branch
Obtained Jenkinsfile from 43d370e49e0cdc972972f6603a92d076a0c235fb
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] Start of Pipeline
[Pipeline] node
Still waiting to schedule task
‘agt-hared_master_10-6’ is offline

I can't find any useful information in the Jenkins logs:

2019-09-16T11:36:55.819080338Z Sep 16, 2019 11:36:55 AM org.jenkinsci.plugins.docker.swarm.docker.api.request.ApiRequest handleFailure
2019-09-16T11:36:55.819149887Z WARNING: API Request response status 500. Message: Internal Server Error
2019-09-16T11:38:37.735158878Z Sep 16, 2019 11:38:37 AM hudson.model.AsyncPeriodicWork$1 run
2019-09-16T11:38:37.735225191Z INFO: Started DockerContainerWatchdog Asynchronous Periodic Work
2019-09-16T11:38:37.736831167Z Sep 16, 2019 11:38:37 AM com.nirima.jenkins.plugins.docker.DockerContainerWatchdog execute
2019-09-16T11:38:37.736849963Z INFO: Docker Container Watchdog has been triggered
2019-09-16T11:38:37.741127249Z Sep 16, 2019 11:38:37 AM com.nirima.jenkins.plugins.docker.DockerContainerWatchdog$Statistics writeStatisticsToLog
2019-09-16T11:38:37.741170485Z INFO: Watchdog Statistics: Number of overall executions: 0, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
2019-09-16T11:38:37.742026929Z Sep 16, 2019 11:38:37 AM com.nirima.jenkins.plugins.docker.DockerContainerWatchdog loadNodeMap
2019-09-16T11:38:37.742058801Z INFO: We currently have 5 nodes assigned to this Jenkins instance, which we will check
2019-09-16T11:38:37.742968419Z Sep 16, 2019 11:38:37 AM com.nirima.jenkins.plugins.docker.DockerContainerWatchdog execute
2019-09-16T11:38:37.742990195Z INFO: Docker Container Watchdog check has been completed

The same applies to the Jenkins web interface, which hangs indefinitely at:

[11:36:55 AM] Creating Service with Name : agt-hared_master_10-6

I've double-checked that the TCP connection is open and responding:

Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:28:55 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:27:34 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Either there's something I haven't understood or I'm using this plugin the wrong way: in both cases, I'm looking for help...

Roemer commented 5 years ago

Did you check the service log in Docker? If, for example, the agent is unable to pull the requested image, the error shows up only in the Docker service log, and only for a short time before the failed service disappears. Jenkins currently keeps waiting indefinitely for an agent that never starts.
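
As a rough illustration of that check, here is a minimal Python sketch that wraps the two Docker CLI commands one would typically run on a swarm manager: `docker service ps --no-trunc` for the task history and its full error messages, and `docker service logs` for the container output. The helper names `service_diagnostic_commands` and `run_diagnostics` are hypothetical, and the sketch assumes the `docker` CLI is on the PATH and pointed at a manager node.

```python
import subprocess


def service_diagnostic_commands(service_name):
    """Return the docker CLI invocations for inspecting one swarm service.

    Hypothetical helper: it only builds the argument lists, it runs nothing.
    """
    return [
        # Task history with untruncated error messages (e.g. image pull failures).
        ["docker", "service", "ps", "--no-trunc", service_name],
        # Output of the service's containers, if any of them started.
        ["docker", "service", "logs", "--timestamps", service_name],
    ]


def run_diagnostics(service_name):
    """Run each diagnostic command and let its output go to the terminal."""
    for command in service_diagnostic_commands(service_name):
        print("$ " + " ".join(command))
        # check=False: a service that has already disappeared makes docker
        # exit non-zero, which is itself useful information here.
        subprocess.run(command, check=False)
```

Calling `run_diagnostics("agt-hared_master_10-6")` right after Jenkins prints the "Creating Service with Name" line would be one way to use it, since the failed service may be removed shortly afterwards.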