testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License
8.04k stars 1.65k forks source link

[Bug]: Intermittent Startup Failure Using Docker Compose (java.net.SocketException: Unexpected end of file from server) #9212

Closed danielmustafa closed 2 months ago

danielmustafa commented 2 months ago

Module

Core

Testcontainers version

1.20.1

Using the latest Testcontainers version?

Yes

Host OS

MacOS 14.6.1 (23G93)

Host Arch

arm64

Docker version

Client:
 Cloud integration: v1.0.35+desktop.11
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:13:26 2024
 OS/Arch:           darwin/arm64
 Context:           default

Server:
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.22.5
  Git commit:       8e96db1c328d0467b015768e42a62c0f834970bb
  Built:            Sun Jul  7 17:34:20 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          v1.7.17
  GitCommit:        3a4de459a68952ffb703bbe7f2290861a75b6b67
 runc:
  Version:          1.1.12
  GitCommit:        51d5e94601ceffbbd85688df1c928ecccbfa4685
 docker-init:
  Version:          0.19.0
  GitCommit:

What happened?

Hi all. Using the most basic docker-compose file in my Spring Boot (3.2.1) (jdk17) integration test, I am able to successfully run my test successfully about 80% of the time. But every so often, I get an initialization exception due to an apparent socket timeout. Very similar to this issue

docker-compose.yml:

version: "3.8"
services:
  wiremock:
    environment:
      BASE_WORKING_DIR: $PWD
      WIREMOCK_ROOT_DIR: /base/src/test/resources
    ports:
      - "8099:8080"
    image: wiremock:3.0.4
    volumes:
      - type: bind
        source: ${BASE_WORKING_DIR}
        target: /base
    command: [ "/docker-entrypoint.sh","--verbose", "--root-dir", "${WIREMOCK_ROOT_DIR}" ]

ContainerConfiguration:

public abstract class ContainerConfiguration {

   protected static ComposeContainer COMPOSE_CONTAINER;
   protected static final String WIREMOCK_SERVICE = "wiremock";
   protected static final int WIREMOCK_PORT = 8099;
   static {
      boolean useLocalCompose = Boolean.parseBoolean(System.getProperty("useLocalCompose", "true"));
      COMPOSE_CONTAINER =
            new ComposeContainer(new File("src/test/resources/docker-compose.yml"))
                  .withServices(WIREMOCK_SERVICE)
                  .withLogConsumer(WIREMOCK_SERVICE, new Slf4jLogConsumer(log))
                  .withExposedService(WIREMOCK_SERVICE, WIREMOCK_PORT,
                        Wait.forHttp("/__admin/mappings").forStatusCode(200).withReadTimeout(Duration.ofSeconds(30)))
                  .withEnv("WIREMOCK_ROOT_DIR", "/base/src/test/resources")
                  .withLocalCompose(useLocalCompose);
      COMPOSE_CONTAINER.start();
}

The container spins up just fine and I'm able to access the wiremock endpoints through curl. In the test logs, it seems to get stuck immediately on the wait strategy (see log output). I've tried using a wait strategy on the listening port, on an HTTP endpoint, and all seem to end up having the same problem.

I inspected the socat logs and observed that it repeatedly had connection refused errors, so it looks like for whatever reason it's never able to establish the proxy in the socat container:

docker logs testcontainers-socat-KfsxaUqC -f
2024/09/09 22:27:09 socat[14] E connect(5, AF=2 192.168.64.2:8099, 16): Connection refused
2024/09/09 22:27:09 socat[15] E connect(5, AF=2 192.168.64.2:8099, 16): Connection refused
2024/09/09 22:27:09 socat[16] E connect(5, AF=2 192.168.64.2:8099, 16): Connection refused
...

This intermittent failure is happening on my machine, my workers' machines, and in our CI environment. It's very frustrating as it's causing flakiness in our builds, when we know the tests work just fine (when the containers spin up successfully). We have other containers in the compose but it only seems to be happening to this wiremock container.

Relevant log output

15:28:55.657 [ducttape-0] INFO  o.t.c.wait.strategy.HttpWaitStrategy - /1wrynzawugyi-wiremock-1: Waiting for 60 seconds for URL: http://localhost:32929/__admin/mappings (where port 32929 maps to container port 8099)
15:28:55.683 [ducttape-1] TRACE o.r.ducttape.unreliables.Unreliables - Retrying lambda call on attempt 0
15:28:56.703 [ducttape-1] TRACE o.r.ducttape.unreliables.Unreliables - Retrying lambda call on attempt 1
15:28:57.725 [ducttape-1] TRACE o.r.ducttape.unreliables.Unreliables - Retrying lambda call on attempt 2
15:28:58.737 [ducttape-1] TRACE o.r.ducttape.unreliables.Unreliables - Retrying lambda call on attempt 3
...
java.lang.ExceptionInInitializerError
    at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
    at java.base/java.util.Optional.orElseGet(Optional.java:364)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: java.lang.RuntimeException: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://localhost:32929/__admin/mappings should return HTTP [200])
    at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:68)
    at org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60)
    at org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:54)
    at org.testcontainers.containers.ComposeDelegate.waitUntilServiceStarted(ComposeDelegate.java:210)
    at java.base/java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1603)
    at org.testcontainers.containers.ComposeDelegate.waitUntilServiceStarted(ComposeDelegate.java:184)
    at org.testcontainers.containers.ComposeContainer.start(ComposeContainer.java:143)
    at com.walmart.aex.integration.ContainerConfiguration.<clinit>(ContainerConfiguration.java:40)
    ... 5 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://localhost:32929/__admin/mappings should return HTTP [200])
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:320)
    at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)
    at org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilNestedStrategiesAreReady(WaitAllStrategy.java:66)
    at org.testcontainers.containers.wait.strategy.WaitAllStrategy.lambda$waitUntilReady$0(WaitAllStrategy.java:58)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.rnorth.ducttape.TimeoutException: Timeout waiting for result with exception
    at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:54)
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:252)
    ... 9 more
Caused by: java.lang.RuntimeException: java.net.SocketException: Unexpected end of file from server
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.lambda$null$6(HttpWaitStrategy.java:312)
    at org.rnorth.ducttape.ratelimits.RateLimiter.doWhenReady(RateLimiter.java:27)
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.lambda$waitUntilReady$7(HttpWaitStrategy.java:257)
    at org.rnorth.ducttape.unreliables.Unreliables.lambda$retryUntilSuccess$0(Unreliables.java:43)
    ... 5 more
Caused by: java.net.SocketException: Unexpected end of file from server
    at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:954)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:761)
    at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:951)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:761)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1688)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
    at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:529)
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.lambda$null$6(HttpWaitStrategy.java:276)
    ... 8 more

Additional Information

I also tried forcing to use the latest version of socat (1.8.0.0) but it didn't make any difference

EDIT: I created a small project on a Windows machine and am able to replicate the same behavior. This project can reproduce the issue 100% of the time (at least for me). Simply run the one test case which repeats the Compose start/stop process: https://github.com/danielmustafa/tc-test

eddumelendez commented 2 months ago

Hi, the docker-compose.yml file contains ports "8099:8080" which will expose 8099 on the host machine. ComposeContainer doesn't take into account the ports section declared in the compose file because of definitions where a fixed port is used. Looking at the logs Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://localhost:32929/__admin/mappings should return HTTP [200]) you can confirm a random port is used instead to start Wiremock.

Also, looking at the code, WIREMOCK_PORT should be 8080 instead, current value is the fixed port 8099 on the host machine

.withExposedService(WIREMOCK_SERVICE, WIREMOCK_PORT,
                        Wait.forHttp("/__admin/mappings").forStatusCode(200).withReadTimeout(Duration.ofSeconds(30)))

You can see the log java.lang.IllegalStateException: Target port 8099 is not exposed in the project that you shared.

This is working as designed, if you apply the changes I mentioned then it should work.