Open awagner-iq opened 1 year ago
I am seeing this as well on 0.17.0 with the mysql image: the test sometimes fails, more often on CI. My configuration is the same as presented, except that I am using normal Docker and don't skip the reaper. time.Sleep(5 * time.Second) didn't add stability, but time.Sleep(30 * time.Second) does seem to have helped. Presumably this is the startup time of the relatively slow MySQL container itself, and the wait condition is getting missed for some reason.
For context, the build starts and terminates a database container about three times. Perhaps having multiple start/stop cycles in the same binary run (in this case go test ./...) causes a race condition?
@anuraaga @awagner-iq thanks for opening this issue, and sorry for the radio silence. I went on paternity leave at the beginning of December and probably missed this ticket.
I'm investigating why it fails for your use case although, with the above snippet, I'm not able to reproduce it yet. Debugging...
I'm noticing this as well while running on Docker 20.10 on Linux - it's resulting in a fair bit of flakiness, especially in our CI pipeline. It doesn't look like this is strictly related to Podman.
It seems to be a bug in the Endpoint function. I'm using Docker on Linux and I can reproduce this easily on my machine and in CI. I now create the URL myself for a specific port (not the first one, as Endpoint does), and I have not seen this issue since.
Could you please add a repro snippet, including which version of the library you are using? 🙏 If so many users are seeing it, we could be facing a real bug that we should fix 🐞
Yes, it would be great if this one could be prioritized, because it is still occurring and it is breaking a lot of CI pipelines. This is the code snippet:
github.com/testcontainers/testcontainers-go v0.19.0
req := testcontainers.ContainerRequest{
	Image:        "nats:latest",
	Name:         natsContainerName,
	ExposedPorts: []string{"4222/tcp"},
	Networks:     networks,
	WaitingFor:   wait.ForAll(wait.ForLog("Server is ready")),
}
container, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
	ContainerRequest: req,
	Started:          true,
})
if err != nil {
	return nil, err
}
natsMappedPort, err := container.MappedPort(ctx, "4222/tcp")
if err != nil {
	log.Errorf("Failed to get nats port: %v", err)
	return initFailure
}
natsHost, err := container.Host(ctx)
if err != nil {
	log.Errorf("Failed to get nats host: %v", err)
	return initFailure
}
natsEndpoint, err := container.Endpoint(ctx, "")
if err != nil {
	log.Errorf("Failed to get nats endpoint: %v", err) // the error occurs here
	return initFailure
}
Hi, I did not have the time yet to create a repro snippet. The failing code I have is something I cannot share.
I switched all my Endpoint(...) calls to PortEndpoint(...) and explicitly specified the port. I have not had any issues since then. So I think the reason is that the first port that gets detected somehow does not work.
What I'd do given some time:
Endpoint(...). In my case I get the endpoint immediately after creating the container. I used a GenericContainer with mongo:4.0 and testcontainers 0.19.0, but also switched to the latest version for a quick test, and that did not fix the issue.
Maybe this already helps and gives someone the opportunity to reproduce and analyze this bug.
I'm able to reproduce this exact error in a VM using podman. Will take a look after my parental leave
In the meantime, while checking the state of the Docker types, I saw this difference when inspecting a container with Podman vs. Docker:
Steps to reproduce
 func (c *DockerContainer) Ports(ctx context.Context) (nat.PortMap, error) {
 	inspect, err := c.inspectContainer(ctx)
 	if err != nil {
 		return nil, err
 	}
+	fmt.Printf(">>> network.settings.ports: %+v\n", inspect.NetworkSettings.Ports)
+	fmt.Printf(">>> config.exposed.ports: %+v\n", inspect.Config.ExposedPorts)
 	return inspect.NetworkSettings.Ports, nil
 }
go run gotest.tools/gotestsum --format short-verbose --packages="./..." -- -run "^TestContainerWithHostNetworkOptions_UseExposePortsFromImageConfigs" -timeout 600s -count=1 -v
With Podman (Ubuntu VM)
>>> network.settings.ports: map[80/tcp:[]]
>>> config.exposed.ports: map[80/tcp:{}]
With Docker (on Mac)
>>> network.settings.ports: map[80/tcp:[{HostIP:0.0.0.0 HostPort:57087}]]
>>> config.exposed.ports: map[80/tcp:{}]
So for some reason, the Docker types reported by Podman are inconsistent with what Docker itself returns 🤷 . Pinging @kiview @cristianrgreco @eddumelendez @HofmeisterAn for awareness while I'm out
Could be related to https://github.com/containers/podman/issues/17780 🤔
I'm seeing an intermittent port not found error using Docker for Mac. I tried upgrading and am currently on Docker version 26.0.0, build 2ae903e. I'm exposing it as a CustomizeRequest, e.g.:
func WithPort(port int) testcontainers.CustomizeRequestOption {
	return func(req *testcontainers.GenericContainerRequest) {
		...
		req.ExposedPorts = append(req.ExposedPorts, fmt.Sprintf("%d/tcp", port))
	}
}
and a wait strategy that passes:
wait.ForHTTP(endpoint).WithPort(containerPortWithProtocol).WithStartupTimeout(timeout),
If folks have any workarounds or tips that would be helpful.
EDIT: After investigating this issue further, it looks like docker inspect returns results that exclude the exposed port from the network settings. It might be related to https://github.com/moby/moby/issues/42860. A possible workaround would be to retry MappedPort in Docker.go a few times internally 🤷♂️
Retrying MappedPort didn't work for me, so I ended up calling docker inspect from the code. Ugly workaround, but it works:
func getMappedPorts(containerID string) (nat.PortMap, error) {
	cmd := exec.Command("docker", "inspect", "--format", "{{json .NetworkSettings.Ports}}", containerID)
	output, err := cmd.CombinedOutput()
	if err != nil {
		return nil, fmt.Errorf("failed to execute docker inspect: %s, %v", output, err)
	}
	var ports nat.PortMap
	if err = json.Unmarshal(output, &ports); err != nil {
		return nil, fmt.Errorf("failed to parse output: %v", err)
	}
	return ports, nil
}
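To make the failure mode easier to see, here is a minimal, standard-library-only sketch of decoding that JSON and telling apart "port not listed" from "port listed but without a host binding" (the shape shown in the Podman output earlier in the thread). The portBinding type and hostPort helper are hypothetical; they only mirror the HostIp/HostPort fields of the inspect output and are not the nat package types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// portBinding is a hypothetical local type mirroring the fields that
// docker inspect prints for each binding.
type portBinding struct {
	HostIP   string `json:"HostIp"`
	HostPort string `json:"HostPort"`
}

// hostPort decodes the JSON form of .NetworkSettings.Ports and returns the
// first host port bound to containerPort (for example "80/tcp").
func hostPort(inspectJSON []byte, containerPort string) (string, error) {
	var ports map[string][]portBinding
	if err := json.Unmarshal(inspectJSON, &ports); err != nil {
		return "", fmt.Errorf("parse port map: %w", err)
	}
	bindings, ok := ports[containerPort]
	if !ok {
		return "", fmt.Errorf("port %s is not listed", containerPort)
	}
	if len(bindings) == 0 {
		// The key exists but nothing is bound on the host side.
		return "", fmt.Errorf("port %s is listed but has no host binding", containerPort)
	}
	return bindings[0].HostPort, nil
}

func main() {
	// Sample inputs modeled on the two outputs quoted in this thread.
	bound := []byte(`{"80/tcp":[{"HostIp":"0.0.0.0","HostPort":"57087"}]}`)
	unbound := []byte(`{"80/tcp":[]}`)

	fmt.Println(hostPort(bound, "80/tcp"))
	fmt.Println(hostPort(unbound, "80/tcp"))
}
```

Distinguishing the two error cases in logs may help narrow down whether the port was never exposed or was exposed but not yet published.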
This issue started occurring for me only after installing minikube on the local Docker installation. Since then, however, it has been failing consistently.
However adding a manual delay and watching that the port is open seems to resolve this. I assume there is a timing issue here...
This occurs on Docker as well as podman
Also seeing this error with the postgres container:
postgresContainer, err := postgres.RunContainer(ctx, testcontainers.WithImage("postgres:14-alpine"),
postgres.WithUsername("username"), postgres.WithPassword("password"),
testcontainers.WithLogConsumers(logConsumer))
mappedPort, err := postgresContainer.MappedPort(ctx, nat.Port("5432/tcp"))
Adding time.Sleep(10 * time.Second) helps resolve the issue.
This also seems to have helped
testcontainers.WithWaitStrategy(
wait.ForLog("database system is ready to accept connections").
WithOccurrence(2).
WithStartupTimeout(5*time.Second))
Testcontainers version
0.15.0
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host Arch
x86_64
Go Version
1.19
Docker version
Docker info
What happened?
We are seeing test flakes with the error message mysql.Endpoint() = port not found. The code we use to create the container is attached below. In particular, we use both wait.ForListeningPort("3306/tcp") and wait.ForLog("port: 3306") to make sure we wait until the container is running and available. However, it seems that even with that, GenericContainer will occasionally return without being ready, as evidenced by the fact that Endpoint returns an error about the port not being found.
Relevant log output
No response
Additional Information