testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License
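For context, a minimal sketch of the usage pattern the description refers to, using the JUnit 4 @Rule integration (the Redis image, tag, and port here are illustrative, not from this issue):

    import org.junit.Rule;
    import org.junit.Test;
    import org.testcontainers.containers.GenericContainer;

    public class RedisBackedCacheTest {

        // Start a throwaway Redis container for the duration of each test
        @Rule
        public GenericContainer<?> redis = new GenericContainer<>("redis:5.0.3-alpine")
                .withExposedPorts(6379);

        @Test
        public void testWithRedis() {
            // Connect via the container's host address and the randomly mapped port
            String address = redis.getContainerIpAddress() + ":" + redis.getMappedPort(6379);
            // ... exercise code that talks to Redis at `address`
        }
    }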

Testcontainers stalls forever with Docker in Docker #1473

Closed ABeltramo closed 5 years ago

ABeltramo commented 5 years ago

Running testcontainers v1.11.2 on a TeamCity worker over Docker. It was working until we moved everything to a new machine (we also changed the OS from Ubuntu to Debian). Now it runs forever without logging anything:

Running TestDockerDBs
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Loaded org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy from ~/.testcontainers.properties, will try it first
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Will use 'okhttp' transport
[main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved:
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='root'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=root,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfigPath=/root/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION},dockerConfig=<null>]'

[main] WARN org.testcontainers.utility.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: alpine:3.5, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)

This was running for 14 hours without any response.

It seems that the alpine container is stopping right after starting, logging only the IP (which is in the same subnet as the worker).
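One way to narrow down whether a hang like this is in Testcontainers' startup checks or in the Docker connection itself would be to ping the daemon through the same client Testcontainers resolved; a sketch, assuming docker-java's standard PingCmd (the class name DockerPing is hypothetical):

    import com.github.dockerjava.api.DockerClient;
    import org.testcontainers.DockerClientFactory;

    public class DockerPing {
        public static void main(String[] args) {
            // Reuse the docker-java client Testcontainers resolved from the
            // environment (per the log above: unix:///var/run/docker.sock)
            DockerClient client = DockerClientFactory.instance().client();
            client.pingCmd().exec();
            System.out.println("Docker daemon responded to ping");
        }
    }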

Docker info:

Containers: 16
 Running: 15
 Paused: 0
 Stopped: 1
Images: 16
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-9-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 72
Total Memory: 250.6GiB
Name: mighty
ID: IBKW:IZW5:C26P:EYCR:7MEL:FZEO:NT75:NFWE:SU3V:XAF4:VRDT:YL6T
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support
bsideup commented 5 years ago

@radistao and others:

I think I figured out why it gets stuck while working on #1886. In some cases, OkHttp will attempt to read fully from the socket to discard the remaining bytes and, in the case of a Unix socket, it hangs forever. I applied a hack that detects that we're closing the response and returns immediately. It helped in that PR, and the symptoms looked very similar to what you observed.
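A minimal sketch of that idea (not the actual patch in the PR; the class name and field are illustrative): wrap the response body's source so that, once close() has begun, reads report exhaustion instead of blocking on the Unix socket.

    import java.io.IOException;

    import okio.Buffer;
    import okio.ForwardingSource;
    import okio.Source;

    class CloseImmediatelySource extends ForwardingSource {

        private volatile boolean closing = false;

        CloseImmediatelySource(Source delegate) {
            super(delegate);
        }

        @Override
        public long read(Buffer sink, long byteCount) throws IOException {
            // Once we are closing, report exhaustion so OkHttp's
            // "discard remaining bytes" step returns instead of blocking
            if (closing) {
                return -1;
            }
            return super.read(sink, byteCount);
        }

        @Override
        public void close() throws IOException {
            closing = true;
            super.close();
        }
    }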

Once the PR is merged, it would be great if you could try the latest master build (via JitPack) and verify it.
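For anyone wanting to try that, a sketch of pulling a branch snapshot via JitPack in Gradle; the coordinates follow JitPack's com.github.<org>.<repo>:<module>:<branch>-SNAPSHOT convention for multi-module projects and may need adjusting:

    repositories {
        maven { url 'https://jitpack.io' }
    }

    dependencies {
        // Branch snapshot of the testcontainers core module via JitPack
        testImplementation 'com.github.testcontainers.testcontainers-java:testcontainers:master-SNAPSHOT'
    }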

radistao commented 5 years ago

@bsideup Unfortunately I don't work in that environment anymore and can't test the hack.

radistao commented 4 years ago

Not sure this is really the same case, but it may be related, so if someone stumbles upon this issue again:

try decreasing the MTU in the Docker/Kubernetes networking configuration:

Docker MTU issues and solutions
How we spent a full day figuring out an MTU issue with Docker
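For reference, one common way to lower the Docker daemon's default bridge MTU is via /etc/docker/daemon.json, followed by a daemon restart; the value 1400 is illustrative and should match the path MTU of the underlying network:

    {
        "mtu": 1400
    }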