testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License

[Bug]: Cannot add docker network alias as an advertised listener #9170

Closed YanivKunda closed 4 weeks ago

YanivKunda commented 1 month ago

Module

Redpanda

Testcontainers version

1.20.1

Using the latest Testcontainers version?

Yes

Host OS

Mac

Host Arch

ARM

Docker version

Client:
 Version:           27.1.1
 API version:       1.46
 Go version:        go1.21.12
 Git commit:        6312585
 Built:             Tue Jul 23 19:54:12 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.33.0 (160616)
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:57:14 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.7.19
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

What happened?

I tried to use withListener(() -> "mykafka:9092") to add a listener for the network alias added via withNetworkAliases("mykafka"), but container startup failed because withListener adds the given argument as both an advertised listener and an actual listener. The latter is bound to the 0.0.0.0 network address, which conflicts with the default kafka_api listener already bound to 0.0.0.0:9092.

Relevant log output

2024-08-29 12:22:35 INFO  2024-08-29 09:22:35,414 [shard 0:main] main - application.cc:795 - redpanda.advertised_kafka_api:{{external:{host: localhost, port: 59245}}, {internal:{host: 127.0.0.1, port: 9093}}, {mykafka:{host: mykafka, port: 9092}}} - Address of Kafka API published to the clients

...

2024-08-29 12:22:35 INFO  2024-08-29 09:22:35,414 [shard 0:main] main - application.cc:795 - redpanda.kafka_api:{{external:{host: 0.0.0.0, port: 9092}:{none}}, {internal:{host: 0.0.0.0, port: 9093}:{none}}, {mykafka:{host: 0.0.0.0, port: 9092}:{none}}}    - Address and port of an interface to listen for Kafka API requests

...

2024-08-29 12:22:35 INFO  2024-08-29 09:22:35,457 [shard 0:main] kafka - server.cc:41 - Creating net::server for kafka_rpc with config {{external://0.0.0.0:9092:PLAINTEXT}{internal://0.0.0.0:9093:PLAINTEXT}{mykafka://0.0.0.0:9092:PLAINTEXT}, max_service_memory_per_core: 283467840, metrics_enabled:true, listen_backlog:{nullopt}, tcp_recv_buf:{nullopt}, tcp_send_buf:{nullopt}, stream_recv_buf:{nullopt}}

...

2024-08-29 12:22:35 ERROR 2024-08-29 09:22:35,836 [shard 0:main] main - application.cc:474 - Failure during startup: std::runtime_error (kafka rpc protocol - Error attempting to listen on {mykafka://0.0.0.0:9092:PLAINTEXT}: std::__1::system_error (error system:98, posix_listen failed for address 0.0.0.0:9092: Address already in use))

Additional Information

It looks like the problem is twofold: the RedpandaContainer API only supports adding "listeners", and the redpanda.yaml.ftl FreeMarker template renders each of them as both a listener and an advertised listener. I think the solution would be:

  1. Add an advertisedListenersValueSupplier set to RedpandaContainer, in addition to the existing listenersValueSupplier set
  2. Add a withAdvertisedListener method to RedpandaContainer to add an advertised listener just to advertisedListenersValueSupplier
  3. Retrofit withListener to add a listener to both lists (to maintain backward compatibility)
  4. Change the template to create the list under advertised_kafka_api from the new set
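
Steps 1 to 3 could look roughly like the sketch below. It only models the bookkeeping with plain Java collections. ListenerConfigSketch and its accessor methods are hypothetical names made up for illustration and are not the real RedpandaContainer; the field and method names are taken from the proposal above. It does not touch the template (step 4) and says nothing about how the broker behaves afterwards.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical stand-in for the listener bookkeeping; NOT the real RedpandaContainer.
class ListenerConfigSketch {

    // Values the template would render under kafka_api (sockets the broker binds).
    private final Set<Supplier<String>> listenersValueSupplier = new LinkedHashSet<>();

    // Values the template would render under advertised_kafka_api (step 1).
    private final Set<Supplier<String>> advertisedListenersValueSupplier = new LinkedHashSet<>();

    // Step 3: keep the current behavior by registering the value in both sets.
    ListenerConfigSketch withListener(Supplier<String> listener) {
        listenersValueSupplier.add(listener);
        advertisedListenersValueSupplier.add(listener);
        return this;
    }

    // Step 2: advertise an address without binding an extra socket for it.
    ListenerConfigSketch withAdvertisedListener(Supplier<String> listener) {
        advertisedListenersValueSupplier.add(listener);
        return this;
    }

    // Hypothetical accessors that resolve the suppliers in insertion order.
    List<String> listeners() {
        return resolve(listenersValueSupplier);
    }

    List<String> advertisedListeners() {
        return resolve(advertisedListenersValueSupplier);
    }

    private static List<String> resolve(Set<Supplier<String>> suppliers) {
        return suppliers.stream().map(Supplier::get).collect(Collectors.toList());
    }
}
```

With this shape, calling withAdvertisedListener(() -> "mykafka:9092") would add an entry only to the advertised set, so no second socket on port 9092 would be requested.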

YanivKunda commented 1 month ago

Related to https://github.com/testcontainers/testcontainers-java/issues/6395

YanivKunda commented 1 month ago

I actually went ahead and tried to implement my idea, but when I added a test using withAdvertisedListener, it didn't work: using kcat to access the advertised listener just hung.

eddumelendez commented 1 month ago

Hi, are you trying to connect to redpanda from another container? Then, this should work.

eddumelendez commented 1 month ago

FTR there is a test for what you just described. See https://github.com/testcontainers/testcontainers-java/blob/main/modules/redpanda/src/test/java/org/testcontainers/redpanda/RedpandaContainerTest.java#L114-L141

If there is more information that you can provide, that would be helpful.

YanivKunda commented 1 month ago

I know, I based my test on this one -

            RedpandaContainer redpanda = new RedpandaContainer("docker.redpanda.com/redpandadata/redpanda:v23.1.7")
                .withNetworkAliases("redpanda")
                .withAdvertisedListener(() -> "redpanda:9092")
                .withNetwork(network);

but as I said, it failed. Trying to debug it (using the same command the test issues on kcat) I got the following:

sh-4.4$ kcat -b redpanda:9092 -t msgs -P -l /data/msgs.txt
%3|1724958737.983|FAIL|rdkafka#producer-1| [thrd:localhost:63177/0]: localhost:63177/0: Connect to ipv4#127.0.0.1:63177 failed: Connection refused (after 1ms in state CONNECT)
% ERROR: Local: Broker transport failure: localhost:63177/0: Connect to ipv4#127.0.0.1:63177 failed: Connection refused (after 1ms in state CONNECT)

This reproduces the issue I have in my app (container -> RP TC): in this case RP returns the advertised listener for the host-mapped port (localhost:63177), which is not reachable from another container. I think at least one of the causes is that I'm trying to create an advertised listener on the same standard 9092 port.

If I use another port (like in the existing test), I still get an error, but from a different listener:

sh-4.4$ kcat -b redpanda:19092 -t msgs -P -l /data/msgs.txt
%3|1724959011.160|FAIL|rdkafka#producer-1| [thrd:redpanda:19092/bootstrap]: redpanda:19092/bootstrap: Connect to ipv4#172.18.0.2:19092 failed: Connection refused (after 1ms in state CONNECT)
% ERROR: Local: Broker transport failure: redpanda:19092/bootstrap: Connect to ipv4#172.18.0.2:19092 failed: Connection refused (after 1ms in state CONNECT)
% ERROR: Local: All broker connections are down: 1/1 brokers are down: terminating
sh-4.4$ 

Here the correct IP for RP is resolved through the Docker DNS, but there is no listener on that port, so it doesn't work.

The root cause is probably that the broker returns the first matching advertised listener corresponding to the listener the client connected to - which in the case of 9092 is the first (default) one.

The existing test actually provides an easy workaround for my problem, so I'm not sure this ticket needs the solution I suggested.

eddumelendez commented 1 month ago

you mean withListener instead of withAdvertisedListener, right?

Well, I'd like to clarify things first. What you just described in the "What happened?" section should work when you are connecting from another container in the same network.

Trying to debug it (using the same command the test issues on kcat) I got the following:

from inside the container or from your host machine? If you are trying from your host machine then it will not work.

I've been adding some improvements to KafkaContainer in order to work with external clients. See KafkaContainer; those haven't been moved to RedpandaContainer yet.

YanivKunda commented 1 month ago

Yes, I meant using the existing withListener as a workaround - but only when using a port different from 9092. And what I described happened from another container (for the example I've used the confluentinc/cp-kcat image used in the tests) - but I think it didn't work because when using withListener with port 9092 (which already has a listener on 0.0.0.0) the broker can't create the new listener on a different IP because of 0.0.0.0's special role.

eddumelendez commented 1 month ago

Hi @YanivKunda, I think the issue is that you are trying to use port 9092, which is already in use. Have you tried with another port like in the examples?

YanivKunda commented 1 month ago

@eddumelendez exactly! That's what I wrote in my previous comment - and indeed I used another port for a workaround.

eddumelendez commented 1 month ago

Can two listeners be declared on the same port? Sorry, but I'm not a Kafka expert, so any documentation that you can point me to would be helpful to fix it.

YanivKunda commented 4 weeks ago

Generally in networking you can open two sockets on the same port, as long as they are bound to different IPs. Obviously, if one of them is bound to 0.0.0.0, the other won't be able to bind. I don't think Kafka adds any restriction of its own, but I'm just going to close this issue since the workaround is more than enough.
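
The bind conflict can be reproduced with plain sockets, without Kafka or Redpanda involved. The sketch below is a minimal illustration, assuming a Linux-like host and SO_REUSEADDR explicitly disabled (socket reuse options change the rules on some platforms). WildcardBindDemo and its method name are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class WildcardBindDemo {

    // Binds 0.0.0.0 on a free port, then tries to bind 127.0.0.1 on the same port.
    // Returns true if the second bind succeeded, false if it was rejected.
    static boolean canBindLoopbackWhileWildcardBound() {
        try (ServerSocket wildcard = new ServerSocket()) {
            wildcard.setReuseAddress(false); // assumption: no address reuse
            wildcard.bind(new InetSocketAddress("0.0.0.0", 0)); // port 0 = let the OS pick
            int port = wildcard.getLocalPort();

            try (ServerSocket specific = new ServerSocket()) {
                specific.setReuseAddress(false);
                specific.bind(new InetSocketAddress("127.0.0.1", port));
                return true;
            } catch (BindException e) {
                // Typically reported as "Address already in use".
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind succeeded: " + canBindLoopbackWhileWildcardBound());
    }
}
```

This mirrors the startup failure in the log: the default listener already holds 0.0.0.0:9092, so a second listener on port 9092 cannot bind.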