testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License

[Bug]: ryuk no such container / 404 with --selinux-enabled option #7177

Closed rocketraman closed 1 year ago

rocketraman commented 1 year ago

Module

Core

Testcontainers version

1.18.3

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host Arch

x86_64

Docker version

Client:
 Version:           20.10.23
 API version:       1.41
 Go version:        go1.20rc3
 Git commit:        %{shortcommit_cli}
 Built:             Sun Jan 29 17:25:05 2023
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.23
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.20rc3
  Git commit:       %{shortcommit_moby}
  Built:            Sun Jan 29 17:25:05 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.19
  GitCommit:
 runc:
  Version:          1.1.7
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:

What happened?

On my distro (Fedora CoreOS 38), Docker runs with --selinux-enabled by default.

When running an integration test, I get this error: tc.testcontainers/ryuk:0.5.1 ERROR Could not start container java.lang.IllegalStateException: Wait strategy failed. Container is removed.

It took some time to figure out why this was happening. Using TESTCONTAINERS_RYUK_DISABLED=true worked around the issue. Removing --selinux-enabled from the docker daemon also seems to work around the issue.

Relevant log output

Full output:

[INFO] Running my.ITFoo
tc.testcontainers/ryuk:0.5.1 ERROR Could not start container
java.lang.IllegalStateException: Wait strategy failed. Container is removed
    at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:501)
    at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344)
Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*Started.*'
    at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:47)
    at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)

tc.testcontainers/ryuk:0.5.1 ERROR There are no stdout/stderr logs available for the failed container
xy.devhaus.com/library/redis:6 ERROR Could not start container
org.testcontainers.containers.ContainerLaunchException: Container startup failed for image testcontainers/ryuk:0.5.1
    at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:349)
    at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322)
Caused by: org.rnorth.ducttape.RetryCountExceededException: Retry limit hit with exception
    at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:88)
    at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:334)
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
    at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:553)
    at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344)
Caused by: java.lang.IllegalStateException: Wait strategy failed. Container is removed
    at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:501)
    at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344)
Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*Started.*'
    at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:47)
    at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)


Additional Information

No response

eddumelendez commented 1 year ago

Thanks for raising the issue. Can you try adding ryuk.container.privileged=true to ~/.testcontainers.properties, please?
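
For readers who want to apply this setting from a script, here is a minimal Java sketch. The class `RyukPrivilegedConfig` and its method `enablePrivilegedRyuk` are hypothetical names made up for illustration and are not part of Testcontainers; only the property key and the file location come from the suggestion above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Testcontainers.
class RyukPrivilegedConfig {

    // Sets ryuk.container.privileged=true in the given properties file,
    // keeping the values of any other entries that are already there.
    static void enablePrivilegedRyuk(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        props.setProperty("ryuk.container.privileged", "true");
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "Testcontainers configuration");
        }
    }

    public static void main(String[] args) throws IOException {
        // File location as given in the comment above: ~/.testcontainers.properties
        Path file = Paths.get(System.getProperty("user.home"), ".testcontainers.properties");
        enablePrivilegedRyuk(file);
        System.out.println("Updated " + file);
    }
}
```

Note that `Properties.store` rewrites the whole file, so existing comments and entry order are not preserved. Adding the line by hand in a text editor is equivalent.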

joanbm commented 1 year ago

If you have SELinux enabled in both the OS and Docker, Ryuk does not work because it is unable to connect to the bind-mounted Docker UNIX socket. See https://github.com/mviereck/x11docker/wiki/SELinux for more details. The same issue applies to Podman.

For example, you should also be able to see that this command does not work on SELinux-enabled systems:

docker run --rm -i -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps

Running Ryuk as a privileged container works around the issue because privileged containers don't have SELinux isolation.
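
The docker run check above could also be scripted, for example as a preflight step before a test run. The following is only a sketch: `DockerSocketCheck` and `run` are hypothetical names, and it assumes the `docker` CLI is on the PATH and that the `docker:cli` image can be pulled.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Testcontainers.
class DockerSocketCheck {

    // Runs the command, forwards its output to this process,
    // and returns the command's exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Same command as in the comment above.
        int exitCode = run(Arrays.asList(
                "docker", "run", "--rm", "-i",
                "-v", "/var/run/docker.sock:/var/run/docker.sock",
                "docker:cli", "docker", "ps"));
        // A non-zero exit code is only a hint: it can also mean the image
        // could not be pulled or the daemon is not running.
        System.out.println(exitCode == 0
                ? "The container could reach the Docker socket."
                : "docker ps failed inside the container (exit code " + exitCode + ").");
    }
}
```

If the exit code is non-zero on an SELinux-enabled host but zero once SELinux isolation is removed for the container, that is consistent with the explanation above.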

I think Testcontainers should ship with a more fine-grained switch to just disable SELinux for Ryuk (instead of the current one to run Ryuk as a privileged container), or just always unconditionally disable SELinux for Ryuk since AFAICT this is currently the only way to work around this issue (in a sane way).

eddumelendez commented 1 year ago

This has been fixed in cc60cd2de6896721ac7d449d0277ec76ed65545a and it will be part of the next release.

rocketraman commented 1 year ago

> Thanks for raising the issue. Can you try adding ryuk.container.privileged=true to ~/.testcontainers.properties, please?

This works, but the Ryuk container never shuts down, and subsequent tests block until the previous Ryuk containers are stopped manually.

eddumelendez commented 1 year ago

Which container runtime are you using? I tested it with Docker Desktop for Mac and it works as expected.

rocketraman commented 1 year ago

Podman on Fedora 38

eddumelendez commented 1 year ago

I wonder if there is an issue in Podman about it. Are you using Podman 4.5.x or the latest version?

rocketraman commented 1 year ago

Yes

Name        : podman
Epoch       : 5
Version     : 4.5.1
Release     : 1.fc38
Architecture: x86_64

joanbm commented 1 year ago

@rocketraman Can you make sure that you have the latest update of systemd installed (currently: systemd-253.7-1.fc38) and reboot your system to make sure that you're using it?

There was a bug affecting recent systemd versions such as systemd 253.5 (link, another link) that caused this same behaviour you are observing.

rocketraman commented 1 year ago

> @rocketraman Can you make sure that you have the latest update of systemd installed (currently: systemd-253.7-1.fc38) and reboot your system to make sure that you're using it?
>
> There was a bug affecting recent systemd versions such as systemd 253.5 (link, another link) that caused this same behaviour you are observing.

Thanks for this. I am on 253.5 right now. Will give 253.7 a shot.

rocketraman commented 1 year ago

I can confirm that pulling systemd out of the mix by running podman system service -t 0 instead of using the systemd podman.socket solves the problem.

eddumelendez commented 1 year ago

Great, @rocketraman! And thanks for sharing, @joanbm!

joanbm commented 1 year ago

Nice! That problem gave me a bit of a headache recently, so I hope you didn't waste too much time on it :)

rocketraman commented 1 year ago

> Nice! That problem gave me a bit of a headache recently, so I hope you didn't waste too much time on it :)

Given your timely post, @joanbm, I did not! Thanks again.

sp1rs commented 1 year ago

Disabling Ryuk worked for me as mentioned here.