testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License
7.97k stars 1.64k forks source link

Test containers stale forever with docker in docker #1473

Closed ABeltramo closed 4 years ago

ABeltramo commented 5 years ago

Running testcontainers v1.11.2 on a TeamCity worker over docker. It was working until we moved everything to a new machine (we also changed OS from Ubuntu to Debian). Now it run forever without logging anything:

Running TestDockerDBs
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Loaded org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy from ~/.testcontainers.properties, will try it first
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Will use 'okhttp' transport
[main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved:
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='root'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=root,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfigPath=/root/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION},dockerConfig=<null>]'

[main] WARN org.testcontainers.utility.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: alpine:3.5, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)

This was running for 14 hours without any response.

It seems that the alpine container it's stopping right after starting, logging only the ip (which is in the same subnet as the worker).

Docker info:

Containers: 16
 Running: 15
 Paused: 0
 Stopped: 1
Images: 16
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-9-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 72
Total Memory: 250.6GiB
Name: mighty
ID: IBKW:IZW5:C26P:EYCR:7MEL:FZEO:NT75:NFWE:SU3V:XAF4:VRDT:YL6T
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support
ABeltramo commented 5 years ago

Edit: enabled debug level for slf4j:

[main] DEBUG org.testcontainers.utility.TestcontainersConfiguration - Testcontainers configuration overrides will be loaded from file:/root/.testcontainers.properties
[main] DEBUG org.testcontainers.utility.TestcontainersConfiguration - Testcontainers configuration overrides loaded from TestcontainersConfiguration(properties={docker.client.strategy=org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy})
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Loaded org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy from ~/.testcontainers.properties, will try it first
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Will use 'okhttp' transport
[ducttape-0] DEBUG org.testcontainers.dockerclient.DockerClientProviderStrategy - Pinging docker daemon...
[ducttape-0] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: org.testcontainers.dockerclient.transport.okhttp.OkHttpDockerCmdExecFactory$1@7fa648c2
[main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved:
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='root'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=root,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfigPath=/root/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION},dockerConfig=<null>]'

[main] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: ListImagesCmdImpl[imageNameFilter=alpine:3.5,showAll=false,filters=com.github.dockerjava.core.util.FiltersBuilder@0,execution=com.github.dockerjava.core.exec.ListImagesCmdExec@3f4faf53]
[main] DEBUG org.testcontainers.utility.RegistryAuthLocator - Looking up auth config for image: alpine:3.5
[main] DEBUG org.testcontainers.utility.RegistryAuthLocator - RegistryAuthLocator has configFile: /root/.docker/config.json (does not exist) and commandPathPrefix:
[main] WARN org.testcontainers.utility.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: alpine:3.5, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)
[main] DEBUG org.testcontainers.dockerclient.auth.AuthDelegatingDockerClientConfig - Effective auth config [null]
[main] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: com.github.dockerjava.core.command.CreateContainerCmdImpl@30c8681[name=<null>,hostName=<null>,domainName=<null>,user=<null>,attachStdin=<null>,attachStdout=<null>,attachStderr=<null>,portSpecs=<null>,tty=<null>,stdinOpen=<null>,stdInOnce=<null>,env=<null>,cmd={sh,-c,ip route|awk '/default/ { print $3 }'},entrypoint=<null>,image=alpine:3.5,volumes=com.github.dockerjava.api.model.Volumes@78aa1f72,workingDir=<null>,macAddress=<null>,networkDisabled=<null>,exposedPorts=com.github.dockerjava.api.model.ExposedPorts@1f75a668,stopSignal=<null>,hostConfig=com.github.dockerjava.api.model.HostConfig@35399441[binds=<null>,blkioWeight=<null>,blkioWeightDevice=<null>,blkioDeviceReadBps=<null>,blkioDeviceReadIOps=<null>,blkioDeviceWriteBps=<null>,blkioDeviceWriteIOps=<null>,memorySwappiness=<null>,capAdd=<null>,capDrop=<null>,containerIDFile=<null>,cpuPeriod=<null>,cpuShares=<null>,cpuQuota=<null>,cpusetCpus=<null>,cpusetMems=<null>,devices=<null>,diskQuota=<null>,dns=<null>,dnsSearch=<null>,extraHosts=<null>,links=<null>,logConfig=<null>,lxcConf=<null>,memory=<null>,memorySwap=<null>,memoryReservation=<null>,kernelMemory=<null>,networkMode=<null>,oomKillDisable=<null>,autoRemove=<null>,oomScoreAdj=<null>,portBindings=<null>,privileged=<null>,publishAllPorts=<null>,readonlyRootfs=<null>,restartPolicy=<null>,ulimits=<null>,volumesFrom=<null>,pidMode=<null>,securityOpts=<null>,cgroupParent=<null>,volumeDriver=<null>,shmSize=<null>,pidsLimit=<null>,runtime=<null>,tmpFs=<null>],labels={org.testcontainers=true, org.testcontainers.sessionId=5be1ec99-c430-4a4a-9471-918e13de1d12},networkingConfig=<null>,ipv4Address=<null>,ipv6Address=<null>,aliases=<null>,authConfig=<null>,execution=com.github.dockerjava.core.exec.CreateContainerCmdExec@662706a7]
[main] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: 7e5b95ae497aa7fa3e73e9c9ef9c59a30c7497c5e821e6489f8fc91fe31cdb2e,com.github.dockerjava.core.exec.StartContainerCmdExec@27216cd
[tc-okhttp-stream-238564722] DEBUG com.github.dockerjava.core.command.LogContainerResultCallback - STDOUT: 172.17.0.1
gerloh commented 5 years ago

I have exactly the same within a alpine+maven docker container
We have Jenkins agent running on Red Hat which spans this docker container for the maven build. @ABeltramo Do you know which base image this Teamcity worker is using ?

ABeltramo commented 5 years ago

@gerloh They are using Ubuntu as a base image (Dockerfile). The alpine container I'm referring to is the one that testcontainers is running in order to get the internal docker IP.

rnorth commented 5 years ago

I wonder if there is an issue interpreting the log frames that are coming back from the docker container.

Which version of Java is this running on?

There's not much to go on, but I wonder what we can find that's common between your environments.

ABeltramo commented 5 years ago

Java version is:

Picked up JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions
openjdk 11.0.2 2019-01-15
OpenJDK Runtime Environment (build 11.0.2+9-Ubuntu-3ubuntu118.04.3)
OpenJDK 64-Bit Server VM (build 11.0.2+9-Ubuntu-3ubuntu118.04.3, mixed mode, sharing)

@rnorth let me know if there is anything that I can do to further debug this

gerloh commented 5 years ago

We use the zulu-openjdk-alpine:8u181-8.31.0.1 docker image with maven added. I also tried the centos variant ( because RHEL is also centos) , but with the same result .

rnorth commented 5 years ago

Hmm. The title of the ticket mentions 'docker in docker' - is that accurate? Or are you both connecting to a docker daemon socket on the host machine?

ABeltramo commented 5 years ago

I just realise the difference, I'm mounting the docker host socket inside the container (-v /var/run/docker.sock:/var/run/docker.sock), sorry about the confusion. Feel free to edit the title.

gerloh commented 5 years ago

I am also using the socket

jannis-baratheon commented 5 years ago

Hi. I came here from #1602. Did anyone manage to trace this? Maybe some reliable reproduction scenario?

jannis-baratheon commented 5 years ago

@ABeltramo @gerloh Did you manage to solve this issue?

ABeltramo commented 5 years ago

Unfortunately I stopped using testcontainers because of this issue. No solution was found on my end.

jannis-baratheon commented 5 years ago

Ouch :) Seems like a high priority issue.

ABeltramo commented 5 years ago

Doesn't seems so, I tried reaching out https://testcontainers.slack.com/ asking about possible ways to better debug this but nobody even responded to me, I guess they just don't care.

bsideup commented 5 years ago

We have a lot of questions in many places (including Slack) and may miss things. It does not mean that we don't care, just the project is maintained by a few folks that are working on it on their own time, and do a "best effort" maintaining.

Sorry that the question was not answered and you stopped using the library.

1602 seems to have a lot of info about the problem, and really nice debugging attempts.

It is still hard to say why it fails, but at least, thanks to @radistao, we know how it fails and where.

The deadlock looks more like a bug in OkHttp, we will need to work with them to figure out why it is happening. Their migration to Kotlin in version 4 will most probably make it impossible for us to update to the fixed version, but at least maybe we will get a workaround...

jannis-baratheon commented 5 years ago

@bsideup Will you open a ticket for this in their BT?

bsideup commented 5 years ago

@jannis-baratheon once we confirm that this is not on our side - yes

jannis-baratheon commented 5 years ago

I wonder how are TeamCity and Jenkins related in the way they spawn docker-container agents. Maybe they use the same common library for this? Because in Travis it seems to be working fine, as you mentioned in #1602

bsideup commented 5 years ago

I don't think there is anything about the way container is started. Also, @ABeltramo's problem started appearing after moving the executors to another machine (Different Docker? Distributive? Kernel? Firewall?...)

radistao commented 5 years ago

As i mentioned in https://github.com/testcontainers/testcontainers-java/issues/1602#issuecomment-511382051 - it seems the problem not i docker itself, but Unix socket library, because if i run the same container using ProcessBuilder (https://github.com/testcontainers/testcontainers-java/issues/1602#issuecomment-511380963) - it works fine and finishes successfully.

jannis-baratheon commented 5 years ago

It would be great if the guys experiencing this (@ABeltramo @gerloh @radistao) described their CI setup as accurate as possible. Distro, distro version, Jenkins/TeamCity version + config, docker version, network configuration etc. anything that would help reproducing the problem.

Thanks.

bsideup commented 5 years ago

Somehow similar issue, mentions Unix sockets and OkHttp (but FTR not the library we're using): https://github.com/jnr/jnr-unixsocket/issues/60

I will try to recreate the provided test case with SBT's Unix Socket implementation that we're using

@radistao if you have time, you could try replacing UnixDomainSocket with some other implementation (e.g. with JNR UnixSocket): https://github.com/testcontainers/testcontainers-java/blob/d127fd799bccbb4ee4d006dc2edd0f56c0c908c2/core/src/main/java/org/testcontainers/dockerclient/transport/okhttp/UnixSocketFactory.java#L27

jannis-baratheon commented 5 years ago

About the configuration @radistao shared a bit in https://github.com/testcontainers/testcontainers-java/issues/1602#issuecomment-512845353. From now on let's try to keep the discussion here, in one place.

ABeltramo commented 5 years ago

Posting all the informations about our server again in a single message, if something is missing just let me know.

Docker info:

Containers: 16
 Running: 16
 Paused: 0
 Stopped: 0
Images: 13
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-9-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 72
Total Memory: 250.6GiB
Name: mighty
ID: IBKW:IZW5:C26P:EYCR:7MEL:FZEO:NT75:NFWE:SU3V:XAF4:VRDT:YL6T
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support
uname -a
Linux mighty 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1 (2019-04-12) x86_64 GNU/Linux

cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Java version:

Picked up JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions
openjdk 11.0.2 2019-01-15
OpenJDK Runtime Environment (build 11.0.2+9-Ubuntu-3ubuntu118.04.3)
OpenJDK 64-Bit Server VM (build 11.0.2+9-Ubuntu-3ubuntu118.04.3, mixed mode, sharing)

TeamCity version 2019.1.1.

All the docker containers run on the same machine and inside the same docker network interface. As I already said the alpine container is able to correctly fetch the local docker ip, log it and then stale forever so I suppose it's not a network issue.

bsideup commented 5 years ago

I just pushed a change that may affect it: https://github.com/testcontainers/testcontainers-java/tree/close_stream_after_shutdown

Can some of you try running this on your environment?

git clone https://github.com/testcontainers/testcontainers-java.git
cd testcontainers-java
git checkout close_stream_after_shutdown
./gradlew :testcontainers:test

?

ABeltramo commented 5 years ago

@bsideup I tried and seems to be stuck during the test, I manually stopped after 3 minutes.

I may have found something: the Alpine container is logging the default Docker bridge network subnet (172.17.0.1) while TeamCity and the workers are on a separate Docker network (172.19.0.2) can this be the issue here?
Can someone try to spin up a docker container in a network different than bridge and see if from there test-containers will still running?

Full output:

Downloading https://services.gradle.org/distributions/gradle-4.10.2-all.zip
...............................................................................................................

Welcome to Gradle 4.10.2!

Here are the highlights of this release:
 - Incremental Java compilation by default
 - Periodic Gradle caches cleanup
 - Gradle Kotlin DSL 1.0-RC6
 - Nested included builds
 - SNAPSHOT plugin versions in the `plugins {}` block

For more details see https://docs.gradle.org/4.10.2/release-notes.html

Starting a Gradle Daemon (subsequent builds will be faster)

> Task :testcontainers:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :testcontainers:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :testcontainers:test
Picked up JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions

Gradle Test Executor 1 > com.github.dockerjava.core.util.FiltersEncoderTest > filtersShouldWork STARTED
<===========--> 85% EXECUTING [2m 58s]
> IDLE
> :testcontainers:test > Executing test com.github.dockerjava.core.util.FiltersEncoderTest
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> :testcontainers:test > 0 tests completed
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
jannis-baratheon commented 5 years ago

It seems that both @radistao and @ABeltramo have their agents running on Debian 9. A possible clue?

bsideup commented 5 years ago

@jannis-baratheon maybe. Would be nice if somebody can verify it with Vagrant and a Debian-based VM, to isolate the problem

radistao commented 5 years ago

Sorry, i was off. This is an info about my env: Docker versions are 19.03.0 for client and 18.09.3 for server. Gradle version 5.5 JDK version 11.0.3 (Azul Systems, Inc. 11.0.3+7-LTS) OS (container): Debian 9 (stretch)

Jenkins version and host OS i can't fetch, unfortunately.

radistao commented 5 years ago

@bsideup

I just pushed a change that may affect it: https://github.com/testcontainers/testcontainers-java/tree/close_stream_after_shutdown Can some of you try running this on your environment?

Tried (explicit stream closing) - still blocks.

bsideup commented 5 years ago

Okay, I see the "Debian 9" pattern here as pointed out in https://github.com/testcontainers/testcontainers-java/issues/1473#issuecomment-513460735

We're busy with the release, but if someone can reproduce it with Vagrant, that would definitely help

radistao commented 5 years ago

unfortunately i can't run it in Vagrant. If it helps - i tried to use oraclelinux build docker image - the result is absolutely the same.

jannis-baratheon commented 5 years ago

I was talking about the host system (the system that runs on your Jenkins agent). I thought you wrote that it's Debian Stretch. Can you check if it's true in case this was only my imagination.

radistao commented 5 years ago

unfortunately i can't check which host system is running. Very likely it is Debian, but not sure and can't verify it.

radistao commented 5 years ago

you could try replacing UnixDomainSocket with some other implementation (e.g. with JNR UnixSocket):

tried jnr.unixsocket.UnixSocket from com.github.jnr:jnr-unixsocket:0.23 - now goes to infinite loop with exception:

[testcontainers-ryuk] WARN o.t.u.ResourceReaper - Can not connect to Ryuk at localhost:32805 java.net.ConnectException: Connection refused (Connection refused) at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399) at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242) at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224) at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403) at java.base/java.net.Socket.connect(Socket.java:591) at java.base/java.net.Socket.connect(Socket.java:540) at java.base/java.net.Socket.(Socket.java:436) at java.base/java.net.Socket.(Socket.java:213) at org.testcontainers.utility.ResourceReaper.lambda$start$1(ResourceReaper.java:112) at java.base/java.lang.Thread.run(Thread.java:834)

jannis-baratheon commented 5 years ago

Ooooh that's an interesting one.

bsideup commented 5 years ago

maybe related to https://github.com/testcontainers/testcontainers-java/issues/1277#issuecomment-468306052

jannis-baratheon commented 5 years ago

@bsideup Do you think trying that with Ryuk disabled to confirm would be a good idea? OFC if @radistao is ok with leaving the containers hanging.

radistao commented 5 years ago

i tried to disable Ryuk (TESTCONTAINERS_RYUK_DISABLED env variable) at the beginning of the problem investigation - this didn't help

radistao commented 5 years ago

could the problem be related to https://github.com/testcontainers/testcontainers-java/issues/1014 ?

samcarpentier commented 5 years ago

I'm having the same issue. I just spoke with @bsideup on TC's Slack about that. Here are some information about my systems to help with the investigation.

Docker version: 18.09.6, build 481bc77156 Build system: GitLab CI GitLab runner version: 12.0.2, build d0b76032 OS (container): maven:3.3-jdk-8 --> openjdk:8-jdk --> debian:buster-slim OS (host): CentOS 7 3.10.0-957.21.2.el7.x86_64 GNU/Linux JDK version: 1.8.0_181 Maven version: 3.5.4

Log output of a stalled build (with TRACE logging enabled)

[main] DEBUG o.t.u.TestcontainersConfiguration - Testcontainers configuration overrides will be loaded from file:/builds/java-commons/kafka/kafka-test-docker/target/classes/testcontainers.properties 
[main] DEBUG o.t.u.TestcontainersConfiguration - Testcontainers configuration overrides will be loaded from file:/builds/java-commons/kafka/kafka-test-docker/target/classes/testcontainers.properties 
[main] DEBUG o.t.u.TestcontainersConfiguration - Testcontainers configuration overrides will be loaded from file:/root/.testcontainers.properties 
[main] TRACE o.t.u.TestcontainersConfiguration - Testcontainers config override was found on file:/root/.testcontainers.properties but the file was not found 
java.io.FileNotFoundException: /root/.testcontainers.properties (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
    at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
    at java.net.URL.openStream(URL.java:1045)
    at org.testcontainers.utility.TestcontainersConfiguration.lambda$loadConfiguration$0(TestcontainersConfiguration.java:127)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:474)
    at org.testcontainers.utility.TestcontainersConfiguration.loadConfiguration(TestcontainersConfiguration.java:136)
    at org.testcontainers.utility.TestcontainersConfiguration.getInstance(TestcontainersConfiguration.java:24)
    at org.testcontainers.containers.SocatContainer.<init>(SocatContainer.java:19)
    at org.testcontainers.containers.DockerComposeContainer.<init>(DockerComposeContainer.java:73)
    at org.testcontainers.containers.DockerComposeContainer.<init>(DockerComposeContainer.java:96)
    at org.testcontainers.containers.DockerComposeContainer.<init>(DockerComposeContainer.java:92)
    at com.bell.cts.commons.kafka.docker.DockerKafkaCluster.<init>(DockerKafkaCluster.java:38)
    at com.bell.cts.commons.kafka.docker.BootstrappedDockerKafkaCluster.<clinit>(BootstrappedDockerKafkaCluster.java:10)
    at com.bell.cts.commons.kafka.docker.DockerKafkaTest.<clinit>(DockerKafkaTest.java:19)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
[main] DEBUG o.t.u.TestcontainersConfiguration - Testcontainers configuration overrides loaded from TestcontainersConfiguration(properties={ryuk.container.privileged=true, checks.disable=true}) 
[main] INFO  o.t.d.DockerClientProviderStrategy - Will use 'okhttp' transport 
[ducttape-0] DEBUG o.t.d.DockerClientProviderStrategy - Pinging docker daemon... 
[ducttape-0] DEBUG c.g.d.core.command.AbstrDockerCmd - Cmd: org.testcontainers.dockerclient.transport.okhttp.OkHttpDockerCmdExecFactory$1@304a9859 
[main] INFO  o.t.d.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment 
[main] INFO  o.t.d.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved: 
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='root'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=root,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfigPath=/root/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION},dockerConfig=<null>]'

[main] DEBUG c.g.d.core.command.AbstrDockerCmd - Cmd: ListImagesCmdImpl[imageNameFilter=alpine:3.5,showAll=false,filters=com.github.dockerjava.core.util.FiltersBuilder@0,execution=com.github.dockerjava.core.exec.ListImagesCmdExec@3380ca3d] 
[main] TRACE c.g.d.core.exec.ListImagesCmdExec - GET: OkHttpWebTarget(okHttpClient=org.testcontainers.shaded.okhttp3.OkHttpClient@7aaad0, baseUrl=http://docker.socket/, path=[/images/json], queryParams={filter=[alpine:3.5]}) 
[main] TRACE c.g.d.core.exec.ListImagesCmdExec - Response: [com.github.dockerjava.api.model.Image@7f1f60a0[created=1548886840,id=sha256:f80194ae2e0ccf0f098baa6b981396dfbffb16e6476164af72158577a7de2dd9,parentId=,repoTags={alpine:3.5},size=3995534,virtualSize=3995534]] 
[main] DEBUG o.t.utility.RegistryAuthLocator - Looking up auth config for image: alpine:3.5 
[main] DEBUG o.t.utility.RegistryAuthLocator - RegistryAuthLocator has configFile: /root/.docker/config.json (does not exist) and commandPathPrefix:  
[main] WARN  o.t.utility.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: alpine:3.5, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory) 
[main] DEBUG o.t.d.a.AuthDelegatingDockerClientConfig - Effective auth config [null] 
[main] DEBUG c.g.d.core.command.AbstrDockerCmd - Cmd: com.github.dockerjava.core.command.CreateContainerCmdImpl@19dac2d6[name=<null>,hostName=<null>,domainName=<null>,user=<null>,attachStdin=<null>,attachStdout=<null>,attachStderr=<null>,portSpecs=<null>,tty=<null>,stdinOpen=<null>,stdInOnce=<null>,env=<null>,cmd={sh,-c,ip route|awk '/default/ { print $3 }'},entrypoint=<null>,image=alpine:3.5,volumes=com.github.dockerjava.api.model.Volumes@39ace1a7,workingDir=<null>,macAddress=<null>,networkDisabled=<null>,exposedPorts=com.github.dockerjava.api.model.ExposedPorts@1930a804,stopSignal=<null>,hostConfig=com.github.dockerjava.api.model.HostConfig@bd4ee01[binds=<null>,blkioWeight=<null>,blkioWeightDevice=<null>,blkioDeviceReadBps=<null>,blkioDeviceReadIOps=<null>,blkioDeviceWriteBps=<null>,blkioDeviceWriteIOps=<null>,memorySwappiness=<null>,capAdd=<null>,capDrop=<null>,containerIDFile=<null>,cpuPeriod=<null>,cpuShares=<null>,cpuQuota=<null>,cpusetCpus=<null>,cpusetMems=<null>,devices=<null>,diskQuota=<null>,dns=<null>,dnsSearch=<null>,extraHosts=<null>,links=<null>,logConfig=<null>,lxcConf=<null>,memory=<null>,memorySwap=<null>,memoryReservation=<null>,kernelMemory=<null>,networkMode=<null>,oomKillDisable=<null>,autoRemove=<null>,oomScoreAdj=<null>,portBindings=<null>,privileged=<null>,publishAllPorts=<null>,readonlyRootfs=<null>,restartPolicy=<null>,ulimits=<null>,volumesFrom=<null>,pidMode=<null>,securityOpts=<null>,cgroupParent=<null>,volumeDriver=<null>,shmSize=<null>,pidsLimit=<null>,runtime=<null>,tmpFs=<null>],labels={org.testcontainers=true, org.testcontainers.sessionId=bd044437-0543-43c8-9ba6-979185431a6c},networkingConfig=<null>,ipv4Address=<null>,ipv6Address=<null>,aliases=<null>,authConfig=<null>,execution=com.github.dockerjava.core.exec.CreateContainerCmdExec@2a19a0fe] 
[main] TRACE c.g.d.c.exec.CreateContainerCmdExec - POST: OkHttpWebTarget(okHttpClient=org.testcontainers.shaded.okhttp3.OkHttpClient@7aaad0, baseUrl=http://docker.socket/, path=[/containers/create], queryParams={})  
[main] DEBUG c.g.d.core.command.AbstrDockerCmd - Cmd: 81f1c19753f4052f4b2d2122d2c3ab4928b1c4bd10ecfddb872126434f9270e2,com.github.dockerjava.core.exec.StartContainerCmdExec@44aed6a4 
[main] TRACE c.g.d.c.exec.StartContainerCmdExec - POST: OkHttpWebTarget(okHttpClient=org.testcontainers.shaded.okhttp3.OkHttpClient@7aaad0, baseUrl=http://docker.socket/, path=[/containers/81f1c19753f4052f4b2d2122d2c3ab4928b1c4bd10ecfddb872126434f9270e2/start], queryParams={}) 
[main] TRACE o.t.utility.AuditLogger - START action with image: null, containerId: 81f1c19753f4052f4b2d2122d2c3ab4928b1c4bd10ecfddb872126434f9270e2 
[main] TRACE c.g.d.core.exec.LogContainerCmdExec - GET: OkHttpWebTarget(okHttpClient=org.testcontainers.shaded.okhttp3.OkHttpClient@7aaad0, baseUrl=http://docker.socket/, path=[/containers/81f1c19753f4052f4b2d2122d2c3ab4928b1c4bd10ecfddb872126434f9270e2/logs], queryParams={stdout=[true], follow=[true]}) 
[tc-okhttp-stream-1275269089] DEBUG c.g.d.c.c.LogContainerResultCallback - STDOUT: 172.17.0.1 

Here is the function where TC gets stuck:

2019-07-26 20:38:39
Full thread dump OpenJDK 64-Bit Server VM (25.181-b13 mixed mode):
"tc-okhttp-stream-236275784" #15 daemon prio=5 os_prio=0 tid=0x00007feef8bf9000 nid=0x73 runnable [0x00007feeb4ae3000]
   java.lang.Thread.State: RUNNABLE
    at org.testcontainers.shaded.org.scalasbt.ipcsocket.UnixDomainSocketLibrary.read(Native Method)
    at org.testcontainers.shaded.org.scalasbt.ipcsocket.UnixDomainSocket$UnixDomainSocketInputStream.doRead(UnixDomainSocket.java:149)
    at org.testcontainers.shaded.org.scalasbt.ipcsocket.UnixDomainSocket$UnixDomainSocketInputStream.read(UnixDomainSocket.java:136)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.testcontainers.shaded.okio.Okio$2.read(Okio.java:140)
    at org.testcontainers.shaded.okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
    at org.testcontainers.shaded.okio.RealBufferedSource.request(RealBufferedSource.java:68)
    at org.testcontainers.shaded.okio.RealBufferedSource.require(RealBufferedSource.java:61)
    at org.testcontainers.shaded.okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:303)
    at org.testcontainers.shaded.okhttp3.internal.http1.Http1Codec$ChunkedSource.readChunkSize(Http1Codec.java:469)
    at org.testcontainers.shaded.okhttp3.internal.http1.Http1Codec$ChunkedSource.read(Http1Codec.java:449)
    at org.testcontainers.shaded.okio.RealBufferedSource.exhausted(RealBufferedSource.java:57)
    at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder$FramedSink.accept(OkHttpInvocationBuilder.java:345)
    at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder$FramedSink.accept(OkHttpInvocationBuilder.java:334)
    at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder.lambda$executeAndStream$1(OkHttpInvocationBuilder.java:296)
    at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder$$Lambda$37/2045895214.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)
"OkHttp ConnectionPool" #13 daemon prio=5 os_prio=0 tid=0x00007fee94182000 nid=0x71 in Object.wait() [0x00007feec41de000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000006c7ce95a8> (a org.testcontainers.shaded.okhttp3.ConnectionPool)
    at java.lang.Object.wait(Object.java:460)
    at org.testcontainers.shaded.okhttp3.ConnectionPool$1.run(ConnectionPool.java:67)
    - locked <0x00000006c7ce95a8> (a org.testcontainers.shaded.okhttp3.ConnectionPool)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Docker system information:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 19.03.1
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-957.21.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 15.51GiB
Name: glr1.mtl42.vlab.f0nz.local
ID: F7FV:4SIG:DRGK:BBQS:V3EC:JIVV:3Y33:6DWD:VUA5:SNRB:VVYE:DYSX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 41
 Goroutines: 59
 System Time: 2019-07-29T09:24:48.735455022-04:00
 EventsListeners: 0
HTTP Proxy: http://proxy:3128
HTTPS Proxy: http://proxy:3128
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
radistao commented 5 years ago

Update from my side:

  1. I tried in Vagrant + debian/stretch64 + Jenkins (from https://pkg.jenkins.io/debian-stable) - the bug is not reproducible, the containers started and finished successfully.

  2. Our dev Jenkins (where build fails) is run from Docker image jenkins/jenkins:2.182 (however the very root host system i still don't know, it is somewhere in the "clouds"). I tried it in the same Vagrant config as above with this Jenkins docker - the build now fails with another error:

java.lang.reflect.UndeclaredThrowableException at com.sun.proxy.$Proxy26.exec(Unknown Source) at ort.testcontainers.did.DockerRunTest.testRunDockerInDocker(DockerRunTest.java:53) Caused by: java.lang.reflect.InvocationTargetException at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.testcontainers.dockerclient.AuditLoggingDockerClient.lambda$wrappedCommand$14(AuditLoggingDockerClient.java:98) ... 2 more Caused by: com.github.dockerjava.api.exception.InternalServerErrorException: {"message":"OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/3f5b775bec9b7d0e565908fcb9ef4041cbf77bf196a53c91e0b01ddf3fbf97af/log.json: no such file or directory): runc did not terminate sucessfully: unknown"} at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder.execute(OkHttpInvocationBuilder.java:276) at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder.execute(OkHttpInvocationBuilder.java:254) at org.testcontainers.dockerclient.transport.okhttp.OkHttpInvocationBuilder.post(OkHttpInvocationBuilder.java:115) at com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:28) at com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:11) at com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21) at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35) at com.github.dockerjava.core.command.StartContainerCmdImpl.exec(StartContainerCmdImpl.java:46)

https://github.com/radistao/testcontainers-did-bug/blob/8654a13a17d2df6349482cd345625fbb53e7e7db/src/test/java/ort/testcontainers/did/DockerRunTest.java#L50 (container is created, but not started, exception is caught by the catch block).

bsideup commented 5 years ago

@radistao the error you got sounds like a problem with your Docker and/or logging driver. Please note that Testcontainers reliers on the output from containers and the logging driver should support it (e.g. the json driver, or a few others, check the docs)

radistao commented 5 years ago

not sure how to check/configure this. Do you have some doc references? thank you.

bsideup commented 5 years ago

Sure: https://docs.docker.com/config/containers/logging/configure/

radistao commented 5 years ago
docker info --format '{{.LoggingDriver}}'
json-file

shall it be syslog?

bsideup commented 5 years ago

json-file should be okay. Since the error comes from Docker:

Caused by:
com.github.dockerjava.api.exception.InternalServerErrorException: {"message":"OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/3f5b775bec9b7d0e565908fcb9ef4041cbf77bf196a53c91e0b01ddf3fbf97af/log.json: no such file or directory): runc did not terminate sucessfully: unknown"}

I would suggest asking about it in the Docker/Moby community

samcarpentier commented 5 years ago

On my end, I waas using log driver journald and it was problematic. I just changed it to json-file and the issue is now resolved!

Here is a sample of my Docker daemon config:

{
  ...
  "log-driver": "json-file",
  "log-opts": {"max-size": "5m", "max-file": "1"},
  ...
}

Thanks for your help!

radistao commented 5 years ago

Minor update from my side: tried on Vagrant(debian/stretch64) + docker 19.03.1 + jenkins/jenkins image + BlueOcean plugin on this simple test project (which still fails in our working environment)

docker run -u root --privileged -ti -p 8080:8080 -v /usr/bin/docker:/usr/bin/docker -v /var/run/docker.sock:/var/run/docker.sock -v ${PWD}/jenkins_home:/var/jenkins_home:z -v /testcontainers-did/:/testcontainers-did --rm --name jenkins jenkins/jenkins

The project is built successfully with this configuration, so unfortunately i can't reproduce it in this virtual environment

vitalca commented 5 years ago

@samcarpentier, Thank you so much, your suggestion worked out!