testcontainers / testcontainers-go

Testcontainers for Go is a Go package that makes it simple to create and clean up container-based dependencies for automated integration/smoke tests. The clean, easy-to-use API enables developers to programmatically define containers that should be run as part of a test and clean up those resources when the test is done.
https://golang.testcontainers.org
MIT License

[Bug]: Running multiple tests exhausts the network #2764

Open ErikEngerd opened 2 weeks ago

ErikEngerd commented 2 weeks ago

Testcontainers version

0.33.0

Using the latest Testcontainers version?

Yes

Host OS

Rocky Linux 9.4

Host arch

x86-64

Go version

1.23.0

Docker version

Client: Docker Engine - Community
 Version:           27.1.2
 API version:       1.46
 Go version:        go1.21.13
 Git commit:        d01f264
 Built:             Mon Aug 12 11:52:33 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.1.2
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.13
  Git commit:       f9522e5
  Built:            Mon Aug 12 11:50:54 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.20
  GitCommit:        8fc6bcff51318944179630522a095cc9dbf9f353
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

Client: Docker Engine - Community
 Version:    27.1.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.16.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 102
 Server Version: 27.1.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fc6bcff51318944179630522a095cc9dbf9f353
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.14.0-284.11.1.el9_2.x86_64
 Operating System: Rocky Linux 9.4 (Blue Onyx)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 15.22GiB
 Name: buzzard
 ID: 47923fe3-a119-4b3e-821b-04b9f72ee83a
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: wamblee
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

I am running tests in Go with the -count=100 option. While the tests run, I monitor the networks that get created with watch docker network ls. What I see is that no cleanup takes place between runs. The cleanup does happen, but only after the whole test run has ended.

The test run eventually aborts with an error message.

Relevant log output

=== RUN   Test_NetworkExhaustion
2024/09/02 19:53:08 ERROR Error response from daemon: all predefined address pools have been fully subnetted: failed to create network

Additional information

The test case is as follows:

func Test_NetworkExhaustion(t *testing.T) {
    _, err := network.New(context.Background())
    if err != nil {
        log.Printf("ERROR %+v", err)
        t.FailNow()
    }
}

I think testcontainers should support running tests with '-count=100' and should clean up between test repetitions. It should also clean up between different tests, but I did not test that.

Configuring ryuk with a lower reconnection timeout (1 second) does not fix the issue either. Cleanup only starts after the test has crashed.

ErikEngerd commented 2 weeks ago

Of course, doing defer net.Remove() also fixes the issue, but I still think that cleanup by ryuk should start while the tests are still running.

mdelapenya commented 2 weeks ago

@ErikEngerd thanks for raising this issue. Ryuk starts pruning the resources created by testcontainers when there are no more connections from the given test session. Thinking about your use case, how does Ryuk know when a network needs to be pruned based on your current run needs?

Because the testcontainers APIs offer a way to manually terminate/remove resources, I do not see it as a bug, although I'd like to know if you have ideas on how to address what you are seeing in your builds.

mdelapenya commented 2 weeks ago

One more question: how many Ryuk containers are started using -count=100? Just one, or a hundred?