Closed: hust419 closed this issue 1 year ago
From the description, this looks like a client-side logic issue: no backup (slave) broker is selected. That said, the thread model and implementation of the old Flink connector have a number of problems, so I have refactored the connector based on FLIP-27/191 and will submit it to the community in the next few days. Reviews and suggestions are welcome.
Before Creating the Bug Report
[X] I found a bug, not just asking a question, which should be created in GitHub Discussions.
[X] I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
[X] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.
Runtime platform environment
Linux CentOS7
RocketMQ version
4.7.1
JDK Version
No response
Describe the Bug
The code snippet is as follows; it comes from an early version of https://github.com/apache/rocketmq-flink/:

```java
this.executor.execute(
        () -> {
            RetryUtil.call(
                    () -> {
                        while (runningChecker.isRunning()) {
                            try {
                                long offset = getMessageQueueOffset(mq);
                                PullResult pullResult =
                                        consumer.pullBlockIfNotFound(
                                                mq, tag, offset, pullBatchSize);
                                // ...
```
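For context, here is a rough sketch of what getMessageQueueOffset boils down to in that early connector. This is an approximation, not the exact source; `offsetTable` here simply stands in for the connector's locally restored state:

```java
// Approximation of the early connector's offset lookup (not the exact source).
private long getMessageQueueOffset(MessageQueue mq) throws MQClientException {
    Long offset = offsetTable.get(mq); // offset restored from Flink state, if any
    if (offset == null || offset < 0) {
        // Goes over the wire: DefaultMQPullConsumer#fetchConsumeOffset resolves a broker
        // address through MQClientInstance and asks that broker for the committed offset.
        offset = consumer.fetchConsumeOffset(mq, false);
    }
    offsetTable.put(mq, offset);
    return offset;
}
```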
As the snippet shows, the loop calls getMessageQueueOffset, which eventually ends up in this method:
MQClientInstance.java
This method has a major flaw: it only fetches the information from the master node. If the master node goes down and no longer exists, it returns null and the whole program becomes unusable.
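To make the flaw concrete, here is a sketch of what a slave-aware broker lookup could look like on the client side. `findBrokerAddressInSubscribe` and `MixAll.MASTER_ID` are existing RocketMQ client APIs, but this helper class is hypothetical and the exact method the connector ends up in may differ; this is an illustration, not the actual fix:

```java
import org.apache.rocketmq.client.impl.FindBrokerResult;
import org.apache.rocketmq.client.impl.factory.MQClientInstance;
import org.apache.rocketmq.common.MixAll;
import org.apache.rocketmq.common.message.MessageQueue;

/** Hypothetical helper: prefer the master, but do not give up when only a slave is left. */
public class BrokerAddressFallbackSketch {

    private final MQClientInstance mqClientFactory;

    public BrokerAddressFallbackSketch(MQClientInstance mqClientFactory) {
        this.mqClientFactory = mqClientFactory;
    }

    public String resolveBrokerAddress(MessageQueue mq) {
        // First look for the master only, which is effectively what the current code does;
        // if the master has been taken offline this returns null.
        FindBrokerResult result = mqClientFactory.findBrokerAddressInSubscribe(
                mq.getBrokerName(), MixAll.MASTER_ID, true /* onlyThisBroker */);
        if (result == null) {
            // Relax the lookup so that a slave address can be returned instead of null,
            // allowing the consumer to keep pulling while the master is down.
            result = mqClientFactory.findBrokerAddressInSubscribe(
                    mq.getBrokerName(), MixAll.MASTER_ID, false /* onlyThisBroker */);
        }
        return result == null ? null : result.getBrokerAddr();
    }
}
```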
Steps to Reproduce
The deployment is one master and one slave. Take the broker-master node offline and observe consumption: the consumer can no longer consume normally.
What Did You Expect to See?
After the broker-master node is taken offline, consumption should continue from the broker-slave node and keep pulling messages instead of hanging; consumption should not be affected.
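On the connector side, the pull loop could also be made tolerant of the offset fetch failing instead of killing the whole source. Below is a minimal sketch of that idea, assuming the rocketmq-client pull consumer API; the class, field names, and retry interval are hypothetical, not the connector's actual code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.rocketmq.client.consumer.DefaultMQPullConsumer;
import org.apache.rocketmq.client.consumer.PullResult;
import org.apache.rocketmq.common.message.MessageQueue;

/** Hypothetical defensive pull loop: keep consuming even if the committed offset cannot be fetched. */
public class DefensivePullLoopSketch {

    private final DefaultMQPullConsumer consumer;
    private final Map<MessageQueue, Long> offsetTable = new ConcurrentHashMap<>();

    public DefensivePullLoopSketch(DefaultMQPullConsumer consumer) {
        this.consumer = consumer;
    }

    public void pullLoop(MessageQueue mq, String tag, int pullBatchSize) throws Exception {
        while (!Thread.currentThread().isInterrupted()) {
            long offset = -1L;
            try {
                // Same lookup the connector performs; with the master offline this may
                // throw or come back without a usable offset.
                offset = consumer.fetchConsumeOffset(mq, false);
            } catch (Exception ignored) {
                // handled below via the locally cached offset
            }
            if (offset < 0) {
                Long cached = offsetTable.get(mq);
                if (cached == null) {
                    Thread.sleep(1000L); // nothing to resume from yet; wait and retry
                    continue;
                }
                offset = cached;
            }
            PullResult pullResult = consumer.pullBlockIfNotFound(mq, tag, offset, pullBatchSize);
            offsetTable.put(mq, pullResult.getNextBeginOffset());
            // ... hand pullResult.getMsgFoundList() to the Flink collector as before ...
        }
    }
}
```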
What Did You See Instead?
After the broker node is taken offline, the program throws an exception and cannot fetch the latest offset, so it cannot continue consuming.
Additional Context
No response