testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License

[lima] Containers haven't been removed after tests finished #5101

Open Jasstkn opened 2 years ago

Jasstkn commented 2 years ago

Hi!

I'm trying to debug an issue with container cleanup after tests finish. This container is still running after the test run:

CONTAINER ID   IMAGE                   COMMAND                  CREATED              STATUS              PORTS                                                                                         NAMES
b25b97adbc02   rabbitmq:3.8.9-alpine   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:49156->5672/tcp, :::49156->5672/tcp   condescending_mayer

The problem reproduces with Lima but not with Colima.

I found that there are some calls to the Docker daemon:

Feb 21 10:36:42 lima-default dockerd[2618]: time="2022-02-21T10:36:42.187038150Z" level=debug msg="Calling GET /_ping"
Feb 21 10:37:15 lima-default dockerd[2618]: time="2022-02-21T10:37:15.151272519Z" level=debug msg="Calling GET /v1.29/containers/json?all=1&filters=%7B%22label%22%3A%7B%22org.testcontainers.sessionId%3D8935962b-acef-4214-8923-f10e0491eb50%22%3Atrue%2C%22org.testcontainers%3Dtrue%22%3Atrue%7D%7D&limit=0"
Feb 21 10:37:15 lima-default dockerd[2618]: time="2022-02-21T10:37:15.153219019Z" level=debug msg="Calling POST /v1.29/networks/prune?filters=%7B%22label%22%3A%7B%22org.testcontainers.sessionId%3D8935962b-acef-4214-8923-f10e0491eb50%22%3Atrue%2C%22org.testcontainers%3Dtrue%22%3Atrue%7D%7D"
Feb 21 10:37:15 lima-default dockerd[2618]: time="2022-02-21T10:37:15.154972968Z" level=debug msg="Calling POST /v1.29/volumes/prune?filters=%7B%22label%22%3A%7B%22org.testcontainers.sessionId%3D8935962b-acef-4214-8923-f10e0491eb50%22%3Atrue%2C%22org.testcontainers%3Dtrue%22%3Atrue%7D%7D"
Feb 21 10:37:15 lima-default dockerd[2618]: time="2022-02-21T10:37:15.155117411Z" level=debug msg=VolumeStore.Find ByType=service.andCombinator ByValue="[[local] false 0x55e010461980 0x55e010463080]"
Feb 21 10:37:15 lima-default dockerd[2618]: time="2022-02-21T10:37:15.155951293Z" level=debug msg="Calling POST /v1.29/images/prune?filters=%7B%22dangling%22%3A%7B%22false%22%3Atrue%7D%2C%22label%22%3A%7B%22org.testcontainers.sessionId%3D8935962b-acef-4214-8923-f10e0491eb50%22%3Atrue%2C%22org.testcontainers%3Dtrue%22%3Atrue%7D%7D"
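
The `filters` query parameter in the log lines above is URL-encoded JSON, which makes it hard to see what is being matched. The following is a minimal sketch (the class and method names are made up for illustration) that decodes it using only the Java standard library; it assumes Java 10 or later for `URLDecoder.decode(String, Charset)`.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Illustrative helper: decodes the "filters" query parameter from a logged Docker API request. */
final class DockerFilterDecoder {

    /** Returns the decoded "filters" value, or null if the request has no such parameter. */
    static String decodeFilters(String requestLine) {
        int queryStart = requestLine.indexOf('?');
        if (queryStart < 0) {
            return null; // no query string at all, e.g. "/_ping"
        }
        // The encoded JSON contains no raw '&', so splitting on '&' is safe here.
        for (String pair : requestLine.substring(queryStart + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("filters")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Request path copied from the networks/prune log line above.
        String logged = "/v1.29/networks/prune?filters=%7B%22label%22%3A%7B"
                + "%22org.testcontainers.sessionId%3D8935962b-acef-4214-8923-f10e0491eb50%22%3Atrue%2C"
                + "%22org.testcontainers%3Dtrue%22%3Atrue%7D%7D";
        System.out.println(decodeFilters(logged));
    }
}
```

The decoded value has the same shape as the filter that Ryuk prints in its "Adding" and "Deleting" lines. Note that the session ID in these dockerd lines differs from the one in the Ryuk log further down, so the two excerpts appear to come from different test runs.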

There are no errors in the Ryuk container, just a message that nothing was deleted:

2022/02/21 10:39:25 Pinging Docker...
2022/02/21 10:39:25 Docker daemon is available!
2022/02/21 10:39:25 Starting on port 8080...
2022/02/21 10:39:25 Started!
2022/02/21 10:39:26 New client connected: 172.17.0.1:52086
2022/02/21 10:39:26 Received the first connection
2022/02/21 10:39:26 Adding {"label":{"org.testcontainers.sessionId=3caaa426-0f78-48d4-85d1-1d30c6314374":true,"org.testcontainers=true":true}}
2022/02/21 10:39:50 EOF
2022/02/21 10:39:50 Client disconnected: 172.17.0.1:52086
2022/02/21 10:40:00 Timed out waiting for re-connection
2022/02/21 10:40:00 Deleting {"label":{"org.testcontainers.sessionId=3caaa426-0f78-48d4-85d1-1d30c6314374":true,"org.testcontainers=true":true}}
2022/02/21 10:40:00 Removed 0 container(s), 0 network(s), 0 volume(s) 0 image(s)

Docker info:

Client:
 Context:    rootless
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)
  scan: Docker Scan (Docker Inc., v0.12.0)

Server:
 Containers: 16
  Running: 2
  Paused: 0
  Stopped: 14
 Images: 17
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  rootless
  cgroupns
 Kernel Version: 5.13.0-28-generic
 Operating System: Ubuntu 21.10
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.827GiB
 Name: lima-default
 ID: SNWC:ON44:TP6A:CBG5:JN5Q:32KS:5EKF:3DUJ:QY22:O74Q:SVIG:DRCQ
 Docker Root Dir: /home/username.linux/.local/share/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Any help is appreciated.

kiview commented 2 years ago

Hi @Jasstkn, neither Lima nor Colima is currently officially supported by Testcontainers.

A good start would be to check whether deletion works as expected when you interact directly with Ryuk (see the instructions in the README): https://github.com/testcontainers/moby-ryuk

Jasstkn commented 2 years ago

Interacting with Ryuk directly doesn't work either:

2022/02/21 12:22:59 Starting on port 8080...
2022/02/21 12:23:01 Connected
2022/02/21 12:23:13 Adding {"label":{"org.testcontainers=true":true}}
2022/02/21 12:23:21 EOF
2022/02/21 12:23:21 Disconnected
2022/02/21 12:23:31 Timed out waiting for connection
2022/02/21 12:23:31 Deleting {"label":{"org.testcontainers=true":true}}
2022/02/21 12:23:31 Removed 0 container(s), 0 network(s), 0 volume(s)
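
For anyone who wants to repeat this manual test from code rather than from `nc`, here is a rough sketch. It assumes the line-based protocol described in the moby-ryuk README (send a filter line such as `label=org.testcontainers=true`, receive `ACK`) and that Ryuk was started by hand with port 8080 published on localhost. The class and method names are illustrative only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Illustrative client for talking to a manually started Ryuk container. */
final class RyukManualClient {

    /** Sends one filter line and returns the reply line (assumed to be "ACK" on success). */
    static String register(String host, int port, String filterLine) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5_000);
            socket.setSoTimeout(5_000); // do not wait forever for the reply
            OutputStream out = socket.getOutputStream();
            out.write((filterLine + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } // closing the socket is the disconnect that Ryuk logs
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Ryuk is listening on localhost:8080.
        System.out.println(register("localhost", 8080, "label=org.testcontainers=true"));
    }
}
```

Going by the log above, Ryuk waits about ten seconds after the disconnect and then deletes whatever matches the registered filter, so a container started with the label `org.testcontainers=true` should be gone shortly after this program exits if cleanup works.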

I know that these alternatives aren't officially supported, but I think you may have more insight into where the issue might be.

dnatic09 commented 2 years ago

Is there an issue ticket, or some other way we can promote the request for official Lima/Colima support? I just switched to Colima and am having the usual issues reported by others.

Are there features or known broken components that the open source community can work on? I would be happy to contribute. I just do not know if the issues are in TestContainers-Java or specifically in the Ryuk containers.

kiview commented 2 years ago

Thanks for trying this out @Jasstkn, this is very helpful. In this case, I'd say the issue might be unrelated to Testcontainers itself and has to be somewhere in the interaction between Ryuk and Lima.

@dnatic09 In what way does Colima not work for you? As I understood @Jasstkn, it works fine for her with Colima and the problems occur when using Lima directly. For this issue, some more investigation into Ryuk and Lima compatibility would be appreciated. It might be that Lima is lacking parts of a Docker compatibility layer that Colima provides, but I don't have any personal experience with either technology yet, so I can't comment on this further.

dnatic09 commented 2 years ago

Unfortunately I do not have much time to troubleshoot, but TestContainers cannot connect to the Ryuk container when running unit tests.

If I run Ryuk manually and port forward, I can netcat the container. If JUnit 5 starts the Ryuk container, I am unable to netcat the container's exposed port.

dnatic09 commented 2 years ago

Okay. Did some testing:

brew install colima
brew install docker

Ryuk Test That Works

docker run -p 8080 -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock testcontainers/ryuk:0.3.3
nc -v localhost 8080

SUCCESS

Ryuk Test That Does NOT Work

docker run -p 8080 -v /var/run/docker.sock:/var/run/docker.sock testcontainers/ryuk:0.3.3
docker inspect [new container id]
nc -v localhost [new port]

CONNECTION REFUSED

It seems like Colima and the Docker engine are not properly forwarding the default exposed port from Ryuk back out to the host.

EDIT: So, the problem is definitely with Colima/Lima and Ryuk
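
The `nc -v localhost [port]` check used in both tests above can also be done from Java, which is handy when the check needs to run from inside a test suite. This is a minimal sketch with made-up names; it only tests whether a TCP connection can be opened, and some port forwarders may accept a connection even when nothing is listening behind them, so a positive result is a hint rather than proof.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Illustrative TCP reachability check, roughly equivalent to `nc -v host port`. */
final class PortProbe {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers both "connection refused" and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the host port reported by `docker inspect`; 8080 is only a default for the fixed mapping.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        System.out.println(port + (isReachable("localhost", port, 2_000) ? " is reachable" : " is NOT reachable"));
    }
}
```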

dnatic09 commented 2 years ago

Just proved again that it is Colima/Lima:

[screenshot]

bsideup commented 2 years ago

Sounds like a bug in colima, as I am sure it is reproducible with at least some other images too.

I think we have seen a similar bug in Docker Desktop that got fixed (@rnorth perhaps you remember the link?)

Jasstkn commented 2 years ago

@dnatic09 try configuring these variables: https://github.com/testcontainers/testcontainers-java/issues/5043#issuecomment-1038898766

dnatic09 commented 2 years ago

I'll try again tomorrow, but I symlinked the 'docker.sock' in my Colima home directory to '/var/run/docker.sock'.

That should take care of all cases.

kiview commented 2 years ago

@dnatic09 Please try the approach outlined in the comment first, I know of a couple of different Testcontainers users that got a working setup like this.

Also, your issue seems unrelated to what @Jasstkn reported, so please open a dedicated issue if it persists. Your problem seems to be related to port publishing (rather than port mapping, which you proved to work).

@bsideup We had a similar issue in Docker for Desktop on Windows. It was caused by the firewall config for the WSL VM and led to published ports being opened in a blocked range, but only when --publish-all was used.

dnatic09 commented 2 years ago

I just removed the symlink to /var/run/docker.sock and used the recommended environment variable settings. Same result. TestContainers can start Ryuk, but the port publishing is not working on the "host unspecified" Docker mapping.

nadworny commented 2 years ago

I'm getting the same exception while using Rancher Desktop 1.1.1 on macOS. It seems to work for the first couple of unit tests, but after some time I keep getting:

[testcontainers-ryuk] WARN org.testcontainers.utility.ResourceReaper - Can not connect to Ryuk at localhost:49167

dnatic09 commented 2 years ago

^^^ same. The issue randomly went away and then returned. Just inconsistent behavior with Lima/Colima and the port publishing. Let's just keep updating our docker environment and hope it goes away :(

nadworny commented 2 years ago

Well, I have never had a fully successful unit test run, so it's difficult to work like that ;) It always seems to get stuck after 3-4 unit tests.

dnatic09 commented 2 years ago

The strategy for how you build tests is VERY important. I try to start only one container per test fixture and reuse that container for ALL tests within that fixture. There are many ways to accomplish this, and it requires a degree of creativity. Do not allow each micro-test to run its own instance of the container. That is not stable even with Docker Desktop.
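
One way to picture the "one container per fixture" idea is a lazily started, shared holder. The sketch below is not Testcontainers code: `SharedFixture` is a hypothetical stand-in that uses a plain `Supplier` where a real test would start a container, so that it runs without Docker. With the Testcontainers JUnit 5 extension, declaring the container as a static field has a similar effect, since static containers are shared between the test methods of a class.

```java
import java.util.function.Supplier;

/** Hypothetical holder that starts an expensive resource once and reuses it for every test. */
final class SharedFixture<T> {
    private final Supplier<T> starter;
    private T instance;
    private int startCount;

    SharedFixture(Supplier<T> starter) {
        this.starter = starter;
    }

    /** Starts the resource on first use and returns the same instance afterwards. */
    synchronized T get() {
        if (instance == null) {
            instance = starter.get();
            startCount++;
        }
        return instance;
    }

    /** Number of times the underlying resource was actually started. */
    synchronized int startCount() {
        return startCount;
    }

    public static void main(String[] args) {
        // The string stands in for a started container.
        SharedFixture<String> broker = new SharedFixture<>(() -> "fake-rabbitmq-container");
        broker.get(); // first test starts it
        broker.get(); // later tests reuse it
        broker.get();
        System.out.println("starts: " + broker.startCount()); // prints "starts: 1"
    }
}
```

The trade-off is that tests sharing a container must clean up their own state (queues, tables, and so on) instead of relying on a fresh instance each time.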