KostyaSha / yet-another-docker-plugin

Jenkins Yet Another Docker Plugin
https://plugins.jenkins.io/yet-another-docker-plugin
MIT License

"Failed to run container" errors when cloud host is running Docker 20.10.6 #295

Open wmorrell opened 3 years ago

wmorrell commented 3 years ago

Primary Jenkins node is running:

- Jenkins 2.277.3 (current LTS)
- YAD 0.2.0
- Clouds configured to launch on another host, and connect to container with Docker SSH Computer Launcher

Cloud hosts are running: Docker 20.10.6

Provisioning with prior Docker versions, up to and including 20.10.5, works fine. As soon as the cloud node is upgraded to 20.10.6, provisioning attempts fail. There is a work-around: re-configure the cloud container launch settings, editing Create Container Settings to include 0.0.0.0::22 under "Port bindings".

When this error triggers, provisioning attempts will start, then repeatedly fail. They will show the following error in the Cloud Statistics listing:

java.lang.IllegalStateException: Failed to run container.
    at com.github.kostyasha.yad.DockerCloud.provisionWithWait(DockerCloud.java:257)
    at com.github.kostyasha.yad.DockerCloud.lambda$provision$0(DockerCloud.java:135)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

In the Jenkins logs, there's also this:

Apr 23, 2021 5:06:21 AM FINE com.github.kostyasha.yad.launcher.DockerComputerSSHLauncher waitUp
TCP connection attempt failed 60 retries with 2 second interval for [::]:49163

Basically what I think is happening, per moby/moby#42313, is that Docker now returns both an IPv4 and an IPv6 binding when inspecting port bindings on a container. In Docker 20.10.5 and earlier, only the IPv4 binding was listed. When YAD inspects the ports to build the SSH URL, it ends up grabbing the IPv6 address (the [::] in the log line above). If IPv6 is not configured on the host, or IPv6 traffic is otherwise blocked, the DockerComputerSSHLauncher eventually hits a connection timeout and fails the provisioning attempt. The work-around listed above forces the container to launch with only an IPv4 binding, so YAD never sees the IPv6 address.
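
To make the suspected failure mode concrete, here is a minimal Java sketch of how a client could choose among several host bindings for one container port, preferring IPv4 and falling back to whatever is available. This is not YAD's code: PortBindingSelector, HostBinding, isIpv6 and preferIpv4 are hypothetical names invented for illustration, and the sample values only mirror the 0.0.0.0 and :: entries discussed above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not taken from the YAD plugin source.
public class PortBindingSelector {

    /** One host-side binding for a container port (host IP plus host port). */
    static final class HostBinding {
        final String hostIp;
        final int hostPort;

        HostBinding(String hostIp, int hostPort) {
            this.hostIp = hostIp;
            this.hostPort = hostPort;
        }
    }

    /** Treats any address containing ':' (for example "::") as IPv6. */
    static boolean isIpv6(String hostIp) {
        return hostIp != null && hostIp.contains(":");
    }

    /** Returns the first IPv4 binding if there is one, else the first binding of any kind. */
    static Optional<HostBinding> preferIpv4(List<HostBinding> bindings) {
        Optional<HostBinding> ipv4 = bindings.stream()
                .filter(b -> !isIpv6(b.hostIp))
                .findFirst();
        return ipv4.isPresent() ? ipv4 : bindings.stream().findFirst();
    }

    public static void main(String[] args) {
        // Assumed sample data: one IPv6 and one IPv4 binding for the same port.
        List<HostBinding> bindings = Arrays.asList(
                new HostBinding("::", 49163),
                new HostBinding("0.0.0.0", 49163));

        HostBinding chosen = preferIpv4(bindings).get();
        System.out.println(chosen.hostIp + ":" + chosen.hostPort);
    }
}
```

The selection does not depend on the order of the entries, so it behaves the same whether the daemon lists the IPv4 or the IPv6 binding first. Whether a change like this belongs in the plugin itself is a separate question; the port-binding work-around above avoids the issue without any code change.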

tghastings commented 3 years ago

Thank you for this. I spent a few hours trying to troubleshoot.