testcontainers / testcontainers-go

Testcontainers for Go is a Go package that makes it simple to create and clean up container-based dependencies for automated integration/smoke tests. The clean, easy-to-use API enables developers to programmatically define containers that should be run as part of a test and clean up those resources when the test is done.
https://golang.testcontainers.org
MIT License

[Bug]: intermittent "port not found" error preventing Kafka and generic containers from starting #2670

Open aakso opened 1 month ago

aakso commented 1 month ago

Testcontainers version

0.32.0

Using the latest Testcontainers version?

Yes

Host OS

MacOS

Host arch

ARM

Go version

1.21.12

Docker version

Client:
 Version:           27.1.1
 API version:       1.46
 Go version:        go1.21.12
 Git commit:        6312585
 Built:             Tue Jul 23 19:54:12 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.33.0 (160616)
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:57:14 2024
  OS/Arch:          linux/arm64
  Experimental:     true
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

Client:
 Version:    27.1.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.16.1-desktop.1
    Path:     /Users/aakso/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.1-desktop.1
    Path:     /Users/aakso/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.34
    Path:     /Users/aakso/.docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Alpha) (Docker Inc.)
    Version:  v0.0.14
    Path:     /Users/aakso/.docker/cli-plugins/docker-desktop
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /Users/aakso/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.25
    Path:     /Users/aakso/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.5
    Path:     /Users/aakso/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.3.0
    Path:     /Users/aakso/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/aakso/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.11.0
    Path:     /Users/aakso/.docker/cli-plugins/docker-scout

Server:
 Containers: 63
  Running: 28
  Paused: 0
  Stopped: 35
 Images: 267
 Server Version: 27.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.10.0-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 10
 Total Memory: 31.3GiB
 Name: docker-desktop
 ID: 02318c78-1ce1-45a9-85fe-81f705b23f70
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 289
  Goroutines: 274
  System Time: 2024-07-26T11:25:07.547170422Z
  EventsListeners: 16
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=unix:///Users/aakso/Library/Containers/com.docker.docker/Data/docker-cli.sock
 Experimental: true
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 192.168.128.0/17, Size: 24

WARNING: daemon is not using the default seccomp profile

What happened?

Related to #2543.

Unfortunately, the "port not found" error still seems to be present with v0.32.0.

I see it mainly with the Kafka module but also occasionally with testcontainers.GenericContainer too.

The code snippet I'm using to start the Kafka test container:

    container, err := tckafka.Run(context.Background(), "confluentinc/confluent-local:7.5.0",
        tckafka.WithClusterID("test-kafka"),
        testcontainers.WithLogger(testcontainers.TestLogger(t)),
    )
    require.NoError(t, err, "RunContainer")

So pretty standard.

Relevant log output

## Using Kafka module

2024/07/26 13:34:14 github.com/testcontainers/testcontainers-go - Connected to docker: 
  Server Version: 27.1.1
  API Version: 1.46
  Operating System: Docker Desktop
  Total Memory: 32046 MB
  Testcontainers for Go Version: v0.32.0
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: fdf2809a5158c9725548a7f2151f954e32b0a25616f182e596f60b7426bfae62
  Test ProcessID: 3704bc3c-2bdf-4232-a3fb-e968a0995440
    lifecycle.go:62: 🐳 Creating container for image testcontainers/ryuk:0.7.0
    lifecycle.go:68: ✅ Container created: bffb711625a3
    lifecycle.go:74: 🐳 Starting container: bffb711625a3
    lifecycle.go:80: ✅ Container started: bffb711625a3
    lifecycle.go:263: ⏳ Waiting for container id bffb711625a3 image: testcontainers/ryuk:0.7.0. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms}
    lifecycle.go:86: 🔔 Container is ready: bffb711625a3
    lifecycle.go:62: 🐳 Creating container for image confluentinc/confluent-local:7.5.0
    lifecycle.go:68: ✅ Container created: a87a46618629
    lifecycle.go:74: 🐳 Starting container: a87a46618629
    lifecycle.go:80: ✅ Container started: a87a46618629
    lifecycle.go:333: container logs (port not found
        context deadline exceeded):

    kafka_test.go:22: 
                Error Trace:    /Users/aakso/git/upcloud/REDACTED/kafka_test.go:22
                Error:          Received unexpected error:
                                failed to start container: port not found
                                context deadline exceeded
                Test:           REDACTED
                Messages:       RunContainer

## Using GenericContainer
    daemon_internal_test.go:57: 
            Error Trace:    /Users/aakso/git/REDACTED/daemon_internal_test.go:57
            Error:          Received unexpected error:
                            port not found: creating reaper failed: failed to create container
            Test:           REDACTED
            Messages:       testcontainers.GenericContainer

Additional information

I'm able to mitigate the error for the Kafka module by wrapping the PostStarts lifecycle hooks and retrying on error. In my case it's always the first hook that fails, and it seems to fail at the c.MappedPort call. So far, a single retry has been enough to mitigate the problem.

ar-sematell commented 1 month ago

Same issue, solved with:

    // HOTFIX: testcontainers-go v0.32.0
    // https://github.com/testcontainers/testcontainers-go/issues/2670

    // hotfix2670 is a container customizer that retries the first PostStarts
    // lifecycle hook, which intermittently fails with "port not found".
    type hotfix2670 struct{}

    func (h hotfix2670) Customize(req *testcontainers.GenericContainerRequest) error {
        // Nothing to wrap if no PostStarts hook has been registered.
        if len(req.LifecycleHooks) == 0 || len(req.LifecycleHooks[0].PostStarts) == 0 {
            return nil
        }
        originalHook := req.LifecycleHooks[0].PostStarts[0]
        req.LifecycleHooks[0].PostStarts[0] = func(ctx context.Context, container testcontainers.Container) error {
            var err error
            // Retry the original hook up to 10 times, one second apart.
            for retry := 0; retry < 10; retry++ {
                err = originalHook(ctx, container)
                if err == nil {
                    break
                }
                time.Sleep(time.Second)
            }
            return err
        }
        return nil
    }

    func setupKafka(t *testing.T, ctx context.Context) *kafka.KafkaContainer {
        t.Helper()

        // Silence the testcontainers logger for this test run.
        testcontainers.Logger = log.New(io.Discard, "", 0)
        kafkaContainer, err := kafka.Run(ctx, "confluentinc/confluent-local:7.5.0",
            kafka.WithClusterID("test-cluster"),
            hotfix2670{},
        )
        require.NoError(t, err)

        t.Cleanup(func() { require.NoError(t, kafkaContainer.Terminate(ctx)) })
        return kafkaContainer
    }

aakso commented 1 month ago

@ar-sematell yeah, my solution was very similar. Unfortunately, it only tackles the LifecycleHooks part.

The second problem, with GenericContainer, can probably only be mitigated by retrying the entire operation.

stevenh commented 1 month ago

Can you check if https://github.com/testcontainers/testcontainers-go/pull/2696 fixes this?

aakso commented 2 weeks ago

@stevenh I tested with the pseudo-version v0.32.1-0.20240814110719-4545c292e7b8, and the issue seems to be resolved.

I ran a few batches of 10 container starts. With v0.32.0, about 50% of the starts failed. With the new patch, I don't see any failures.

mdelapenya commented 2 weeks ago

@aakso we released v0.33.0. Could you please check with that version?

If fixed, I think we can close this one, thanks!

tigerquoll commented 1 week ago

Unfortunately, I get "create container: port not found: creating reaper failed" errors with both v0.32.1-0.20240814110719-4545c292e7b8 and v0.33.0.

stevenh commented 1 week ago

Looks like this will need the reaper rework. Do you have a simple test case?

tigerquoll commented 6 days ago

Unfortunately, the test case is relatively complicated. It will take a bit of effort to boil it down to a simpler example.