testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License

[Bug]: Testcontainers forces Ryuk to use Unix sockets as Docker host, breaking environments which use a TCP socket #9137

Open OHermesJunior opened 3 weeks ago

OHermesJunior commented 3 weeks ago

Module

Core

Testcontainers version

1.20.1

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host Arch

amd64

Docker version

Client: Docker Engine - Community
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:33:29 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:29 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.32
  GitCommit:        8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

What happened?

Testcontainers works correctly with a TCP socket as the Docker host, which is what we use in our test environment. Ryuk never did, so we have kept it disabled for a few years, assuming TCP sockets were not supported.

Reinvestigating the issue now and checking Ryuk's code, it seems like it should support it: https://github.com/testcontainers/moby-ryuk/blob/main/main.go#L120 The Docker client gets its configuration from the environment and respects DOCKER_HOST.

On the Testcontainers side, it looks like it simply mounts the Unix socket, regardless of what the framework itself is using: https://github.com/testcontainers/testcontainers-java/blob/251bca51f4d047d8275b04477273f36de1da7326/core/src/main/java/org/testcontainers/utility/RyukContainer.java#L21-L25

I believe it should be possible to map it to use the TCP socket, and the mismatch between the configuration used by the framework and Ryuk can be considered a bug.

I don't have a POC though, so any ideas on the best approach to do this would be great.
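
As a starting point for discussion rather than a working POC, the idea can be sketched in plain Java: derive what the Ryuk container needs from the Docker host the framework already resolved. Everything here is hypothetical (`RyukDockerHostSketch`, `Plan`, and `planFor` are illustrative names, not Testcontainers APIs), and the in-container socket path `/var/run/docker.sock` is an assumption. It also assumes Ryuk honours a `DOCKER_HOST` environment variable, as its code appears to.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Testcontainers. */
final class RyukDockerHostSketch {

    /** What the Ryuk container would need: a bind mount, extra environment, or both. */
    static final class Plan {
        final String bindMount;        // "hostPath:containerPath:rw", or null if no mount is needed
        final Map<String, String> env; // extra environment variables for the Ryuk container

        Plan(String bindMount, Map<String, String> env) {
            this.bindMount = bindMount;
            this.env = env;
        }
    }

    /** Chooses between mounting a Unix socket and passing DOCKER_HOST through. */
    static Plan planFor(String dockerHost) {
        URI uri = URI.create(dockerHost);
        Map<String, String> env = new LinkedHashMap<>();
        String scheme = uri.getScheme();
        if ("unix".equals(scheme)) {
            // Unix socket: mount the socket file into the container (assumed target path).
            return new Plan(uri.getPath() + ":/var/run/docker.sock:rw", env);
        }
        if ("tcp".equals(scheme)) {
            // TCP socket: nothing to mount; hand the same endpoint to Ryuk instead.
            // Caveat: the address must be reachable from inside the container.
            // A loopback address such as 127.0.0.1 refers to the container itself
            // unless the container shares the host's network namespace.
            env.put("DOCKER_HOST", dockerHost);
            return new Plan(null, env);
        }
        throw new IllegalArgumentException("Unsupported Docker host scheme: " + dockerHost);
    }

    public static void main(String[] args) {
        for (String host : new String[] {"unix:///var/run/docker.sock", "tcp://127.0.0.1:2375"}) {
            Plan plan = planFor(host);
            System.out.println(host + " -> bind=" + plan.bindMount + ", env=" + plan.env);
        }
    }
}
```

The sketch only covers the decision itself. Making a loopback TCP endpoint reachable from the Ryuk container is a separate problem that it does not solve.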

Relevant log output

o.testcontainers.DockerClientFactory - Testcontainers version: 1.20.1
o.t.d.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved dockerHost=tcp://127.0.0.1:2375
o.testcontainers.DockerClientFactory - Docker host IP address is 127.0.0.1
o.testcontainers.DockerClientFactory - Connected to docker: 
  Server Version: 26.1.3
  API Version: 1.45
  Operating System: Ubuntu 22.04.4 LTS
  Total Memory: 39893 MB
tc.testcontainers/ryuk:0.8.1 - Creating container for image: testcontainers/ryuk:0.8.1
o.t.utility.RegistryAuthLocator - Failure when attempting to lookup auth config. Please ignore if you don't have images in an authenticated registry. Details: (dockerImageName: testcontainers/ryuk:0.8.1, configFile: /home/hmontei00@cloud.corp.im/.docker/config.json, configEnv: DOCKER_AUTH_CONFIG). Falling back to docker-java default behaviour. Exception message: Status 404: No config supplied. Checked in order: /home/hmontei00@cloud.corp.im/.docker/config.json (file not found), DOCKER_AUTH_CONFIG (not set)
tc.testcontainers/ryuk:0.8.1 - Container testcontainers/ryuk:0.8.1 is starting: cce423cf40435e6f2d58574dac21e0580b62e170513ff2acaa31baa336af8e68
tc.testcontainers/ryuk:0.8.1 - Could not start container org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*Started.*'
    at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:47)
    at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)
    at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:909)
    at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:500)
    at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:354)
    at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
    at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:344)
    at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:330)
    at org.testcontainers.utility.RyukResourceReaper.maybeStart(RyukResourceReaper.java:78)
    at org.testcontainers.utility.RyukResourceReaper.init(RyukResourceReaper.java:42)
    at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:232)
    at org.testcontainers.DockerClientFactory$1.getDockerClient(DockerClientFactory.java:106)
    at com.github.dockerjava.api.DockerClientDelegate.authConfig(DockerClientDelegate.java:109)
    at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:329)

tc.testcontainers/ryuk:0.8.1 - There are no stdout/stderr logs available for the failed container

Additional Information

No response

eddumelendez commented 3 weeks ago

Hi, have you tried using TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE and TESTCONTAINERS_HOST_OVERRIDE? See the docs on these environment variables. They show examples of using them with other container runtimes, but they should help with your setup too.

OHermesJunior commented 3 weeks ago

Hi, thanks for the response!

With TESTCONTAINERS_HOST_OVERRIDE I get the same problem.

I am not sure what I should pass to TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE. As I understand it, it expects a Unix socket file path, and I am not sure I can get one with a TCP socket.

Tested with 127.0.0.1, tcp://127.0.0.1, tcp://127.0.0.1:2375, and got errors like:

tc.testcontainers/ryuk:0.8.1 - Could not start container com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: {"message":"invalid volume specification: 'tcp://127.0.0.1:2375:/var/run/docker.sock:rw'"}

Any idea of what I could use?
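
The error message suggests the override value is used verbatim as the source of a bind mount onto `/var/run/docker.sock`. A minimal sketch of that composition, in plain Java, shows why a TCP URL cannot work there. `SocketOverrideSketch`, `bindSpec`, and `looksLikeSocketPath` are hypothetical names for illustration, and the way the string is built is inferred from the error message, not taken from the Testcontainers source.

```java
/** Hypothetical illustration; the composition is inferred from the error message above. */
final class SocketOverrideSketch {

    static final String CONTAINER_SOCKET = "/var/run/docker.sock";

    /** Builds a Docker volume specification of the form source:target:mode. */
    static String bindSpec(String socketOverride) {
        return socketOverride + ":" + CONTAINER_SOCKET + ":rw";
    }

    /**
     * Rough heuristic, not Docker's actual validation: a usable bind-mount source
     * here is an absolute path, and it must not contain ':' because ':' separates
     * the fields of the volume specification.
     */
    static boolean looksLikeSocketPath(String socketOverride) {
        return socketOverride.startsWith("/") && !socketOverride.contains(":");
    }

    public static void main(String[] args) {
        String[] candidates = {"/var/run/docker.sock", "127.0.0.1", "tcp://127.0.0.1:2375"};
        for (String candidate : candidates) {
            System.out.println(bindSpec(candidate) + "  plausible=" + looksLikeSocketPath(candidate));
        }
    }
}
```

Under that assumption, the variable can only name a socket file visible to the Docker daemon. A TCP endpoint would have to reach Ryuk some other way.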

OHermesJunior commented 3 weeks ago

To run containers with access to the Docker server, we usually run them with --net=host --cap-add=NET_ADMIN --device=/dev/net/tun

I tried modifying RyukContainer with withAccessToHost(true) and Testcontainers.exposeHostPorts(2375), and also with withNetworkMode("host"), but neither worked. I am not very familiar with those APIs, so I might be doing something wrong.

OHermesJunior commented 6 days ago

An update on this: I found #5151, which explains a little why Ryuk cannot connect to the host. I believe this is a strong use case to support, but maybe the decision has already been made.