apache / rocketmq-client-go

Apache RocketMQ go client
https://rocketmq.apache.org/
Apache License 2.0

What could cause the client to occasionally stop consuming messages from a particular queue? #1085

Open HZL3151904214 opened 1 year ago

HZL3151904214 commented 1 year ago

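We run a clustered setup: three environments are deployed, with three brokers in total and 8 queues per broker. During testing we found that a queue on one of the brokers keeps accumulating messages and is not consumed normally, while all the other queues are consumed normally. The client console shows no WARN or ERROR logs either. What could be the cause?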

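After restarting the client, consumption returns to normal, but we have not been able to work out what specifically caused that one queue to accumulate messages without being consumed. There is no heavy message production, only a small number of messages; the main point is that this happens with low probability.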

The consumer is configured as follows:

    c, err := rocketmq.NewPushConsumer(
        consumer.WithGroupName(config.YMQConfig.Consumer.ConsumerGroupName),
        consumer.WithConsumerOrder(true),
        consumer.WithNameServer(strings.Split(config.YMQConfig.ServerAddress, ",")),
        consumer.WithInstance(generateInstanceName()),
        consumer.WithCredentials(primitive.Credentials{
            AccessKey: config.YMQConfig.AccessKey,
            SecretKey: config.YMQConfig.SecretKey,
        }),
        consumer.WithConsumeFromWhere(consumer.ConsumeFromLastOffset),
        consumer.WithConsumerModel(consumer.Clustering),
        consumer.WithRebalanceLockInterval(1*time.Second),
    )

francisoliverlee commented 1 year ago

It's hard to check without any error logs or a running environment.

HZL3151904214 commented 1 year ago

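The MQ server restarted at around 2023-07-31T09:13:52.19. After the restart, messages began to pile up on brokerName=rocketmq-broker-b, queueId=1, and consumption did not recover until 2023-07-31T13:13:52.195Z, at which point the offset jumped straight from 529 to 553.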

The MQ logs from the moment it returned to normal are as follows:

2023-07-31T13:13:52.195Z WARN mq/producer.go:100 fetch offset of mq from broker success {"consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]", "offset": 529}

2023-07-31T13:13:52.195Z DEBUG mq/producer.go:66 do defaultConsumer, add a new mq {"consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]"}

2023-07-31T13:13:52.195Z DEBUG mq/producer.go:66 pull MessageQueue: 1 sleep 3000 ms for mq: MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]

2023-07-31T13:13:52.297Z DEBUG mq/producer.go:66 lock MessageQueue {"lockOK": true, "consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]"}

2023-07-31T13:13:55.196Z WARN mq/producer.go:100 fetch offset of mq from broker success {"MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]", "offset": 529, "consumerGroup": "op_controller_cluster_ph"}

2023-07-31T13:13:57.187Z INFO mq/producer.go:83 update offset to broker success {"consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]", "offset": 553}
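
To make the offset jump in these logs easier to check, here is a minimal Go sketch that pulls the timestamp and the `offset` field out of log lines shaped like the ones above and reports how far the offset moved. The names `offsetEvent`, `parseOffsetLine` and `offsetSpan` are hypothetical helpers written for this illustration; they are not part of rocketmq-client-go, and the parsing assumes the exact line layout shown in this thread (an RFC 3339 timestamp first, a JSON object at the end).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// offsetEvent is a hypothetical helper type: one "offset" observation
// taken from a client log line.
type offsetEvent struct {
	At     time.Time
	Queue  string
	Offset int64
}

// parseOffsetLine extracts the leading timestamp and the trailing JSON
// payload from a log line. It returns false for lines that carry no
// JSON payload or no "offset" field.
func parseOffsetLine(line string) (offsetEvent, bool) {
	start := strings.Index(line, "{")
	if start < 0 {
		return offsetEvent{}, false
	}
	head := strings.Fields(line[:start])
	if len(head) == 0 {
		return offsetEvent{}, false
	}
	at, err := time.Parse(time.RFC3339Nano, head[0])
	if err != nil {
		return offsetEvent{}, false
	}
	var payload struct {
		MessageQueue string `json:"MessageQueue"`
		Offset       *int64 `json:"offset"`
	}
	if err := json.Unmarshal([]byte(line[start:]), &payload); err != nil || payload.Offset == nil {
		return offsetEvent{}, false
	}
	return offsetEvent{At: at, Queue: payload.MessageQueue, Offset: *payload.Offset}, true
}

// offsetSpan returns the first and last offset observations found in lines.
func offsetSpan(lines []string) (first, last offsetEvent, ok bool) {
	for _, l := range lines {
		ev, parsed := parseOffsetLine(l)
		if !parsed {
			continue
		}
		if !ok {
			first, ok = ev, true
		}
		last = ev
	}
	return first, last, ok
}

func main() {
	// Three of the log lines quoted in this thread; the "lock" line has
	// no offset field and is skipped by the parser.
	lines := []string{
		`2023-07-31T13:13:52.195Z WARN mq/producer.go:100 fetch offset of mq from broker success {"consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]", "offset": 529}`,
		`2023-07-31T13:13:52.297Z DEBUG mq/producer.go:66 lock MessageQueue {"lockOK": true, "consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]"}`,
		`2023-07-31T13:13:57.187Z INFO mq/producer.go:83 update offset to broker success {"consumerGroup": "op_controller_cluster_ph", "MessageQueue": "MessageQueue [topic=ydocs_op_uplink_porder, brokerName=rocketmq-broker-b, queueId=1]", "offset": 553}`,
	}
	first, last, ok := offsetSpan(lines)
	if !ok {
		fmt.Println("no offset lines found")
		return
	}
	fmt.Printf("%s: offset %d -> %d (%d messages) in %s\n",
		last.Queue, first.Offset, last.Offset, last.Offset-first.Offset, last.At.Sub(first.At))
}
```

Run over a longer log, the same helper would show whether a queue's committed offset stopped moving during the stall, which is the symptom described above.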

redlsz commented 1 year ago

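PR #1084 fixed a route-update bug that can be triggered when a broker restarts. Try upgrading to the latest version and see whether the problem still occurs.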

zebrafirst commented 1 year ago

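If the route information held by the consumer object is inconsistent with the route information held by the namesvr object, messages will not be consumed.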
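
One way to picture this kind of inconsistency is as a set difference between two route views. The Go sketch below is only an illustration under stated assumptions: `queueKey`, `buildQueues` and `diffRoutes` are hypothetical names, not the client's internal types, and the broker names other than `rocketmq-broker-b` are made up to match the "three brokers, 8 queues each" setup described in this issue. It does not reproduce how rocketmq-client-go stores or updates routes.

```go
package main

import (
	"fmt"
	"sort"
)

// queueKey identifies one queue. Hypothetical type for this sketch;
// it is not the client's primitive.MessageQueue.
type queueKey struct {
	Topic      string
	BrokerName string
	QueueID    int
}

// buildQueues lists every queue for a topic spread over the given brokers.
func buildQueues(topic string, brokers []string, queuesPerBroker int) []queueKey {
	var out []queueKey
	for _, b := range brokers {
		for id := 0; id < queuesPerBroker; id++ {
			out = append(out, queueKey{Topic: topic, BrokerName: b, QueueID: id})
		}
	}
	return out
}

// diffRoutes returns the queues that appear in the name server's view
// but are absent from the consumer's view, sorted for stable output.
func diffRoutes(nameServerView, consumerView []queueKey) []queueKey {
	known := make(map[queueKey]struct{}, len(consumerView))
	for _, q := range consumerView {
		known[q] = struct{}{}
	}
	var missing []queueKey
	for _, q := range nameServerView {
		if _, ok := known[q]; !ok {
			missing = append(missing, q)
		}
	}
	sort.Slice(missing, func(i, j int) bool {
		if missing[i].BrokerName != missing[j].BrokerName {
			return missing[i].BrokerName < missing[j].BrokerName
		}
		return missing[i].QueueID < missing[j].QueueID
	})
	return missing
}

func main() {
	// Assumed broker names; only rocketmq-broker-b appears in the logs above.
	brokers := []string{"rocketmq-broker-a", "rocketmq-broker-b", "rocketmq-broker-c"}
	nameServerView := buildQueues("ydocs_op_uplink_porder", brokers, 8)

	// Simulate a stale consumer view that lost one queue after a broker restart.
	var consumerView []queueKey
	for _, q := range nameServerView {
		if q.BrokerName == "rocketmq-broker-b" && q.QueueID == 1 {
			continue
		}
		consumerView = append(consumerView, q)
	}

	for _, q := range diffRoutes(nameServerView, consumerView) {
		fmt.Printf("not in consumer view: topic=%s brokerName=%s queueId=%d\n",
			q.Topic, q.BrokerName, q.QueueID)
	}
}
```

A queue that shows up in such a difference would never be assigned during rebalance, which would match a single queue accumulating messages while the others are consumed normally.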