apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

Cluster switching failed for producers #13148

Open zbye opened 2 years ago

zbye commented 2 years ago

Describe the bug

We have deployed two Pulsar clusters with geo-replication enabled. We use a domain name for load balancing and for switching clusters when a disaster occurs. For example, under normal circumstances, producers connect to cluster A. When cluster A goes down, producers should automatically connect to the backup cluster B and continue operating.

However, in an actual test, we found that after cluster A went down, the producers could not reconnect to the backup cluster B, even though the domain name had been changed and the change had taken effect.

To Reproduce

Steps to reproduce the behavior:

  1. Producers connect to Pulsar at pulsar://www.stu13.com:6650 (pulsar://10.187.128.67:6650)
  2. Stop Pulsar cluster A
  3. Update the DNS servers so that www.stu13.com resolves to 10.187.128.66
  4. The producers keep reconnecting to pulsar://10.187.128.67:6650 and report errors

Expected behavior

The producers should reconnect to pulsar://10.187.128.66:6650

Screenshots

2021-12-06 11:01:47.293 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xc4441492, L:/10.187.128.66:43650 - R:10.187.128.226/10.187.128.226:6650]] Connected to server 2021-12-06 11:01:47.295 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [null] Creating producer on cnx [id: 0xc4441492, L:/10.187.128.66:43650 - R:10.187.128.226/10.187.128.226:6650] 2021-12-06 11:01:47.320 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Created producer on cnx [id: 0xc4441492, L:/10.187.128.66:43650 - R:10.187.128.226/10.187.128.226:6650] 2021-12-06 11:01:47.472 [main] INFO o.s.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor' 2021-12-06 11:01:47.658 [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-4568"] 2021-12-06 11:01:47.711 [main] INFO o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 4568 (http) with context path '' 2021-12-06 11:01:47.728 [main] INFO com.gtja.pulsardemo.pulsartest.StartApplication - Started StartApplication in 3.401 seconds (JVM running for 3.863) 2021-12-06 11:01:47.752 [main] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized send, msg_key= key_0 send, msg_key= key_1 send, msg_key= key_2 send, msg_key= key_3 2021-12-06 11:02:02.612 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [10.187.128.226/10.187.128.226:6650] Broker notification of Closed producer: 0 2021-12-06 11:02:02.613 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Closed connection [id: 0xc4441492, L:/10.187.128.66:43650 - R:10.187.128.226/10.187.128.226:6650] -- Will try again in 0.1 s 2021-12-06 11:02:02.684 
[pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0xc4441492, L:/10.187.128.66:43650 ! R:10.187.128.226/10.187.128.226:6650] Disconnected 2021-12-06 11:02:02.716 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after timeout 2021-12-06 11:02:02.723 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x78f7ae43, L:/10.187.128.66:40850 - R:10.187.128.67/10.187.128.67:6650]] Connected to server 2021-12-06 11:02:02.726 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Creating producer on cnx [id: 0x78f7ae43, L:/10.187.128.66:40850 - R:10.187.128.67/10.187.128.67:6650] send, msg_key= key_4 2021-12-06 11:02:02.822 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Created producer on cnx [id: 0x78f7ae43, L:/10.187.128.66:40850 - R:10.187.128.67/10.187.128.67:6650] 2021-12-06 11:02:02.823 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Re-Sending 1 messages to server send, msg_key= key_5 send, msg_key= key_6 send, msg_key= key_7 send, msg_key= key_8 send, msg_key= key_9 send, msg_key= key_10 2021-12-06 11:02:23.365 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [10.187.128.67/10.187.128.67:6650] Broker notification of Closed producer: 0 2021-12-06 11:02:23.366 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Closed connection [id: 0x78f7ae43, L:/10.187.128.66:40850 - R:10.187.128.67/10.187.128.67:6650] -- Will try again in 0.1 s 2021-12-06 11:02:23.468 [pulsar-timer-5-1] INFO 
org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after timeout 2021-12-06 11:02:23.475 [pulsar-client-io-1-1] ERROR org.apache.pulsar.client.impl.ClientCnx - [id: 0xae04dadd, L:/10.187.128.66:40846 - R:www.stu13.com/10.187.128.67:6650] Close connection because received internal-server error No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.478 [pulsar-client-io-1-1] WARN o.a.pulsar.client.impl.BinaryProtoLookupService - [persistent://gmtp_nasdaq/US_test/metadata] failed to send lookup request : No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.480 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException$LookupException: No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.481 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException$LookupException: No broker was available to own persistent://gmtp_nasdaq/US_test/metadata -- Will try again in 0.19 s 2021-12-06 11:02:23.482 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0xae04dadd, L:/10.187.128.66:40846 ! 
R:www.stu13.com/10.187.128.67:6650] Disconnected 2021-12-06 11:02:23.673 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:23.677 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x129ed38e, L:/10.187.128.66:40852 - R:www.stu13.com/10.187.128.67:6650]] Connected to server 2021-12-06 11:02:23.682 [pulsar-client-io-1-1] ERROR org.apache.pulsar.client.impl.ClientCnx - [id: 0x129ed38e, L:/10.187.128.66:40852 - R:www.stu13.com/10.187.128.67:6650] Close connection because received internal-server error No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.683 [pulsar-client-io-1-1] WARN o.a.pulsar.client.impl.BinaryProtoLookupService - [persistent://gmtp_nasdaq/US_test/metadata] failed to send lookup request : No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.684 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException$LookupException: No broker was available to own persistent://gmtp_nasdaq/US_test/metadata 2021-12-06 11:02:23.685 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException$LookupException: No broker was available to own persistent://gmtp_nasdaq/US_test/metadata -- Will try again in 0.368 s 2021-12-06 11:02:23.686 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x129ed38e, L:/10.187.128.66:40852 ! 
R:www.stu13.com/10.187.128.67:6650] Disconnected 2021-12-06 11:02:23.890 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x78f7ae43, L:/10.187.128.66:40850 ! R:10.187.128.67/10.187.128.67:6650] Disconnected send, msg_key= key_11 2021-12-06 11:02:24.055 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:24.064 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:24.066 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:24.066 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 0.773 s 2021-12-06 11:02:24.841 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:24.857 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:24.859 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:24.860 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 1.578 s 2021-12-06 11:02:26.440 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:26.443 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:26.443 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:26.444 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 2.953 s 2021-12-06 11:02:29.398 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:29.402 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:29.403 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:29.405 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 5.794 s 2021-12-06 11:02:35.200 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:35.204 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:35.206 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:35.208 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 12.417 s 2021-12-06 11:02:47.290 [pulsar-timer-5-1] INFO o.a.pulsar.client.impl.ProducerStatsRecorderImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Pending messages: 1 --- Publish throughput: 0.20 msg/s --- 0.00 Mbit/s --- Latency: med: 10.000 ms - 95pct: 44.000 ms - 99pct: 44.000 ms - 99.9pct: 44.000 ms - max: 44.000 ms --- Ack received rate: 0.20 ack/s --- Failed messages: 0 2021-12-06 11:02:47.629 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:47.632 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:47.633 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:47.634 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 5.465 s 2021-12-06 11:02:53.101 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:02:53.105 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:53.106 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:02:53.106 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 50.946 s 2021-12-06 11:02:53.935 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Message send timed out. Failing 1 messages 2021-12-06 11:02:53.948 [main] INFO o.s.b.a.l.ConditionEvaluationReportLoggingListener -

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled. 2021-12-06 11:02:53.956 [main] ERROR org.springframework.boot.SpringApplication - Application run failed java.lang.IllegalStateException: Failed to execute ApplicationRunner at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:789) at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:776) at org.springframework.boot.SpringApplication.run(SpringApplication.java:322) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226) at com.gtja.pulsardemo.pulsartest.StartApplication.main(StartApplication.java:13) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) at org.springframework.boot.loader.Launcher.launch(Launcher.java:109) at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88) Caused by: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: The producer pulsar-cluster-test-84-1 can not send message to the topic persistent://gmtp_nasdaq/US_test/metadata within given timeout : createdAt 30005395867 ns ago, firstSentAt 4668456313504950 ns ago, lastSentAt 4668456313504950 ns ago, retryCount 0 at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:961) at org.apache.pulsar.client.impl.TypedMessageBuilderImpl.send(TypedMessageBuilderImpl.java:91) at org.apache.pulsar.client.impl.ProducerBase.send(ProducerBase.java:63) at 
com.gtja.pulsardemo.pulsartest.ProducerDemo.sendMsg(ProducerDemo.java:20) at com.gtja.pulsardemo.pulsartest.StartApplication.lambda$run$1(StartApplication.java:38) at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:786) ... 13 common frames omitted 2021-12-06 11:02:54.057 [main] INFO o.s.scheduling.concurrent.ThreadPoolTaskExecutor - Shutting down ExecutorService 'applicationTaskExecutor' 2021-12-06 11:03:44.054 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:03:44.057 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:03:44.058 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:03:44.059 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 58.795 s 2021-12-06 11:03:47.292 [pulsar-timer-5-1] INFO o.a.pulsar.client.impl.ProducerStatsRecorderImpl - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Pending messages: 0 --- Publish throughput: 0.00 msg/s --- 0.00 Mbit/s --- Latency: med: 0.000 ms - 95pct: 0.000 ms - 99pct: 0.000 ms - 99.9pct: 0.000 ms - max: -∞ ms --- Ack received rate: 0.00 ack/s --- Failed messages: 1 2021-12-06 11:04:42.857 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:04:42.861 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:04:42.861 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:04:42.861 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 55.527 s // 2021-12-06 11:05:38.390 [pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Reconnecting after connection was closed 2021-12-06 11:05:38.405 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to www.stu13.com:6650 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:05:38.409 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 2021-12-06 11:05:38.410 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://gmtp_nasdaq/US_test/metadata] [pulsar-cluster-test-84-1] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: www.stu13.com/10.187.128.67:6650 -- Will try again in 58.002 s
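The retry delays in the log above (0.1 s, 0.19 s, 0.368 s, 0.773 s, ... and later 50 to 58 s) look like exponential backoff with jitter and a cap of roughly one minute. Below is a minimal sketch of that general pattern, to help read the log. The class name, the parameters and the 10% jitter are assumptions for illustration; this is not the Pulsar client's actual implementation.

```java
import java.util.Random;

// Illustrative only: exponential backoff with a cap and downward jitter.
// BackoffSketch and its parameters are hypothetical, not Pulsar client code.
public class BackoffSketch {
    private final long maxMs;
    private final Random random;
    private long nextMs;

    BackoffSketch(long initialMs, long maxMs, Random random) {
        this.nextMs = initialMs;
        this.maxMs = maxMs;
        this.random = random;
    }

    /** Returns the next delay in milliseconds and doubles the base delay up to the cap. */
    long next() {
        long base = nextMs;
        nextMs = Math.min(nextMs * 2, maxMs);
        // Subtract up to 10% (assumed value) so that clients do not retry in lockstep.
        long jitter = (long) (base * 0.1 * random.nextDouble());
        return base - jitter;
    }

    public static void main(String[] args) {
        BackoffSketch backoff = new BackoffSketch(100, 60_000, new Random());
        for (int i = 0; i < 12; i++) {
            System.out.println("Will try again in " + backoff.next() / 1000.0 + " s");
        }
    }
}
```

With a pattern like this, once the delays reach the cap, a client may wait close to a minute between attempts even after the DNS change has taken effect.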

zbye commented 2 years ago

The pulsar-client is the Java version, 2.8.1; the Pulsar server is also 2.8.1.

mattisonchao commented 2 years ago

I will try to reproduce it first.

zbye commented 2 years ago

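@mattisonchao In the real scenario, producers connect to the exchange and send the latest real-time market data (time-series data) to the Pulsar cluster, and consumers consume the data from Pulsar and use it. We want high availability for the Pulsar cluster: under normal conditions, the applications (including producers and consumers) use only the local cluster; if the local data center loses power, the applications can automatically switch to the remote cluster and keep running.

Initially we want to do the switch with a domain name or a virtual IP, so that the applications need no changes when the Pulsar cluster is switched.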

mattisonchao commented 2 years ago

@zbye Can you be sure that all your DNS servers will be updated before the client reconnects to the broker?
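One way to check this from the producer host is to resolve the name inside a JVM and print the JVM's DNS cache setting. The sketch below uses only standard-library calls; the class name `DnsCheck` is made up for illustration. It shows what the JVM resolver returns, which is an assumption about the client's view: the Pulsar client may resolve names through its own resolver and cache, so a matching result here does not prove the client sees the new address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of the Pulsar client.
public class DnsCheck {
    /** Resolves a host name through the JVM resolver; returns an empty list if it is unknown. */
    static List<String> resolve(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                ips.add(addr.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the list empty so the caller can report the failure.
        }
        return ips;
    }

    /** Describes the JVM's positive DNS cache TTL security property. */
    static String jvmDnsCacheTtl() {
        String ttl = Security.getProperty("networkaddress.cache.ttl");
        return ttl == null ? "unset (JVM default applies)" : ttl + " s";
    }

    public static void main(String[] args) {
        // Pass the service host name as the first argument, e.g. www.stu13.com
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
        System.out.println("networkaddress.cache.ttl = " + jvmDnsCacheTtl());
    }
}
```

Running it before and after the DNS change, with the service host name as the argument, shows whether the producer host picks up the new address.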

zbye commented 2 years ago

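@mattisonchao Yes. The DNS servers are also deployed and managed by us. At the moment the intranet DNS servers are built and managed by ourselves, and we have not yet brought in a cloud data center. (The Tencent case published online earlier used service discovery for cluster liveness checks and automatic switching, but the community says DNS and VIP are also an option; there are just few write-ups of practical cases.)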

hangc0276 commented 2 years ago

We are designing a proposal for automatically switching the cluster service provider when one cluster fails.
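Until such a proposal lands, the selection step can be approximated on the application side. The following is a rough sketch under stated assumptions, not the design of that proposal: it probes the primary broker port over TCP and falls back to the backup URL when the probe fails. `FailoverProbe`, its method names and the one-second timeout are hypothetical, and how the chosen URL is handed to the client (for example through the Java client's `ServiceUrlProvider` hook) is left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Hypothetical application-side helper; not part of the Pulsar client.
public class FailoverProbe {
    /** Returns true when a TCP connection to host:port succeeds within timeoutMs. */
    static boolean reachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns the primary URL if its port accepts connections, otherwise the secondary URL. */
    static String chooseServiceUrl(String primary, String secondary, int timeoutMs) {
        URI uri = URI.create(primary);
        return reachable(uri.getHost(), uri.getPort(), timeoutMs) ? primary : secondary;
    }

    public static void main(String[] args) {
        // Addresses taken from the issue description above.
        String url = chooseServiceUrl(
                "pulsar://10.187.128.67:6650", "pulsar://10.187.128.66:6650", 1000);
        System.out.println("Use service URL: " + url);
    }
}
```

A TCP probe only shows that the port accepts connections; it does not show that the cluster can serve the topic, so a real health check would need more than this.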

zbye commented 2 years ago

Thanks. In my experiment, applications using the Go client (pulsar go-client SDK) work well when DNS or /etc/hosts is modified.

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.