Open jeffdeville opened 2 years ago
I believe ::1 is the loopback address for IPv6, equivalent to 127.0.0.1 in IPv4. :: should be the equivalent of 0.0.0.0.
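For anyone who wants to double-check the address classes mentioned above, here is a minimal sketch using Python's standard ipaddress module. The describe helper is a hypothetical name made up for illustration, and the fdaa: address is just the 6PN address copied from the logs in this thread.

```python
import ipaddress


def describe(addr: str) -> str:
    """Classify an IPv4 or IPv6 literal (hypothetical helper for illustration)."""
    ip = ipaddress.ip_address(addr)
    # Check "unspecified" first: :: and 0.0.0.0 are the wildcard bind addresses.
    if ip.is_unspecified:
        return "unspecified"
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "other"


for addr in ["::1", "127.0.0.1", "::", "0.0.0.0", "fdaa:0:72cf:a7b:2c60:2:754e:2"]:
    print(f"{addr:32} {describe(addr)}")
```

A listener bound to a loopback address only accepts connections from the same machine, while a listener bound to the unspecified address accepts connections on every address of that family.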
I have been able to get a single node broker working on Fly.io by using the socket address [::] and the advertised address of application-name-broker-1.internal. I was also able to connect redpanda console to this node.
However, I have not been able to get any other brokers to join the cluster. It fails on the join request. After debugging for several days I haven't been able to resolve it :{ It seems like the RPC listener is not working for IPv6 given that redpanda console works fine.
2022-10-22T22:04:50.600 app[d9475034] sea [info] INFO 2022-10-22 22:04:50,599 [shard 0] cluster - members_manager.cc:388 - Sending join request to {host: application-name-broker-1.internal, port: 33145}
2022-10-22T22:04:55.595 app[d9475034] sea [info] WARN 2022-10-22 22:04:55,595 [shard 0] cluster - config_manager.cc:135 - Exception during bootstrap: seastar::timed_out_error (timedout)
@rupurt Have you solved this somehow?
@tobiaslins no unfortunately. I was in contact with Redpanda support but they couldn't get to the bottom of it. I gave up...
I'm still fairly confident that there is a strong lead in the RPC listener not being bound to IPv6. rpk may also have a bug where it doesn't resolve IPv6 addresses, which makes it harder to debug the root of the problem.
ssh into broker-1 trying to list topics from broker-2 doesn't work
➜ atlas-core-redpanda git:(main) ✗ fly ssh console -a tokenalysis-development-core-redpanda-broker-1
Update available 0.0.417 -> v0.0.418.
Run "fly version update" to upgrade.
Connecting to fdaa:0:72cf:a7b:2c60:2:754e:2... complete
# rpk topic list --brokers tokenalysis-development-core-redpanda-broker-2.internal:9092
unable to request metadata: unable to dial: dial tcp [fdaa:0:72cf:a7b:2c60:2:75a3:2]:9092: connect: connection refused
ssh into broker-2 listing topics from broker-1 does work
➜ atlas-core-redpanda git:(main) ✗ fly ssh console -a tokenalysis-development-core-redpanda-broker-2
Update available 0.0.417 -> v0.0.418.
Run "fly version update" to upgrade.
Connecting to fdaa:0:72cf:a7b:2c60:2:75a3:2... complete
# rpk topic list
unable to request metadata: unable to dial: dial tcp 0.0.0.0:9092: connect: connection refused
# rpk topic list --brokers tokenalysis-development-core-redpanda-broker-1.internal:9092
NAME PARTITIONS REPLICAS
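To take rpk out of the picture when checking which listeners actually answer, a plain TCP probe may help. This is a rough sketch, not a Redpanda tool: probe is a hypothetical helper, and it only tells you whether a TCP connect succeeds on each address the name resolves to, nothing about the Kafka or RPC protocol on top.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0):
    """Try a TCP connect to every address that host resolves to.

    Returns a list of (address, error) tuples; error is None on success.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [(host, f"resolution failed: {exc}")]

    results = []
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            # Creating the socket can itself fail if the family is unsupported.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((sockaddr[0], None))
        except OSError as exc:
            results.append((sockaddr[0], str(exc)))
    return results
```

Run from inside each broker, something like probe("tokenalysis-development-core-redpanda-broker-2.internal", 9092) for the Kafka listener and the same call with port 33145 for the RPC listener would show which of the two is refusing connections over the 6PN address.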
Any progress with this? Run into the same issue :(
It appears from the Fly.io docs that fly-local-6pn is an alias for the IPv6 address of the app.
For a service to be accessible via its 6PN address, it needs to bind to/listen on fly-local-6pn. For example, if you have a service running on port 8080, you need to bind it to fly-local-6pn:8080 for it to be accessible at “[6PN_Address:8080]”.
So I was able to run Redpanda on Fly.io using the fly.toml below:
# fly.toml app configuration file generated for secondhand on 2024-02-04T23:32:33+01:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = 'YOURAPP'
primary_region = 'lhr'
[build]
image = "docker.redpanda.com/redpandadata/redpanda:latest"
[processes]
panda = "redpanda start --smp '1' --memory 512M --reserve-memory 0M --kafka-addr FLY://fly-local-6pn:9092 --advertise-kafka-addr FLY://YOURAPP.internal:9092 --pandaproxy-addr FLY://fly-local-6pn:8082 --advertise-pandaproxy-addr FLY://YOURAPP.internal:8082"
[env]
REDPANDA_BROKERS = "YOURAPP.internal:9092"
[[vm]]
cpu_kind = 'shared'
cpus = 1
memory_mb = 1024
[mounts]
source="redpanda_data"
destination="/var/lib/redpanda/data"
Replace YOURAPP with the actual name of your Fly.io application.
Please keep in mind that this is a 1-node, dev-only configuration, and it's probably dangerous to run real production workloads with it.
I believe the failure is due to AAAA DNS lookups on the IPv6 advertised address. When I use an AAAA host name for my advertise addresses, no nodes are able to join the cluster, as you reported. When I use a static IPv6 address for the advertise address, it works fine and the nodes form a healthy cluster with a raft quorum.
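One way to compare the two cases is to look at what the advertised host name resolves to from inside each node and check it against the static address that works. The sketch below uses only the standard socket module; resolve_ipv6 is a hypothetical helper, and it shows what the system resolver returns, which is not necessarily what Redpanda's own resolver sees.

```python
import socket
from typing import List


def resolve_ipv6(host: str, port: int) -> List[str]:
    """Return the IPv6 addresses that host resolves to (AAAA records or a literal)."""
    try:
        infos = socket.getaddrinfo(
            host, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # No AAAA record, or the resolver could not be reached.
        return []
    # De-duplicate while preserving the resolver's order.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

If resolve_ipv6("YOURAPP.internal", 33145) returns an empty list or an address other than the node's 6PN address, that would be consistent with the join request timing out, though it would not prove the cause.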
Version & Environment
Redpanda version (use rpk version):
Server: docker.redpanda.com/vectorized/redpanda:v22.1.6
Client: v22.1.5 (rev 042089c50e0c5d148a2d49f5dcf1bcdfa419be3a)
Cloud: https://fly.io/
Client OS: macOS Monterey

I've created a fly.toml to run redpanda that looks like this:
I then created a persistent volume with:
fly volumes create redpanda_poc --size 1
And then deployed:
fly deploy --app redpanda-1
What went wrong?
rpk topic list --brokers "redpanda-1.internal:9092" -vvv
[DEBUG] opening connection to broker; addr: redpanda-1.internal:9092, broker: seed 0
[WARN] unable to open connection to broker; addr: redpanda-1.internal:9092, broker: seed 0, err: dial tcp [fdaa:0:4939:a7b:ab2:1:4e05:2]:9092: connect: connection refused
unable to request metadata: unable to dial: dial tcp [fdaa:0:4939:a7b:ab2:1:4e05:2]:9092: connect: connection refused
What should have happened instead?
Should have seen a list of topics
How to reproduce the issue?
If this is an issue with IPv6, then presumably it could be replicated with Docker, but Docker only supports IPv6 on Linux, and I'm on a Mac. So the only way I know to reproduce it is by creating an account and deploying to fly.io:
fly volumes create redpanda_poc --size 1
fly deploy --app redpanda-1
Additional information
Note that if you connect directly to the machine, it will work:
fly ssh console -a redpanda-1
rpk topic list --brokers "[::1]:9092"
This command does work.

I'm engaging the Fly team to try and help me diagnose the issue as well: https://community.fly.io/t/redpanda-kafka-clone-cant-connect/6221
Please attach any relevant logs, backtraces, or metric charts.
JIRA Link: CORE-991