Open tongtaodragon opened 1 week ago
The issue is confirmed in our production environment. We captured a live dump and searched for the message that had been stuck for several days. Its message properties contain no CONSUME_START_TIME property. We still do not know why the consume thread exited abnormally.
Before Creating the Bug Report
[X] I found a bug, not just asking a question, which should be created in GitHub Discussions.
[X] I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
[X] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.
Runtime platform environment
Red Hat 8.0
RocketMQ version
Broker: 4.8.0 Client SDK: 4.9.3
JDK Version
Oracle JDK 1.8.0_121
Describe the Bug
In our production environment we have hit this issue several times, but it cannot be reproduced reliably. It happened during upgrades: we run several container instances and upgrade them via rolling upgrade. When the issue occurs, the consumer minOffset of one queue on one broker stays at a fixed value, yet the consumer can still consume later messages successfully, and the number of accumulated messages keeps growing. After we restart the problematic consumer instance, the issue disappears.
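The symptom of a pinned minOffset while later messages are still consumed is consistent with how the client commits offsets: in-flight messages are tracked in an offset-ordered map, and the committed offset can only advance to the smallest offset still pending. Below is a minimal model of that bookkeeping (hypothetical class and method names, not the actual RocketMQ ProcessQueue API):

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Minimal model of per-queue offset bookkeeping: the commit offset can only
// advance to the smallest offset that has not been acknowledged yet.
class QueueOffsetModel {
    private final ConcurrentSkipListMap<Long, String> inFlight = new ConcurrentSkipListMap<>();
    private long maxOffsetSeen = -1;

    void deliver(long offset, String body) {
        inFlight.put(offset, body);
        maxOffsetSeen = Math.max(maxOffsetSeen, offset);
    }

    // Called when a consume task finishes; returns the offset safe to commit.
    long ack(long offset) {
        inFlight.remove(offset);
        // If anything is still in flight, only commit up to the smallest pending
        // offset; otherwise commit past the last delivered message.
        return inFlight.isEmpty() ? maxOffsetSeen + 1 : inFlight.firstKey();
    }

    public static void main(String[] args) {
        QueueOffsetModel q = new QueueOffsetModel();
        for (long i = 0; i < 5; i++) q.deliver(i, "msg-" + i);
        // Message 0 is never acked (its consume task "exited abnormally"):
        System.out.println(q.ack(1)); // prints 0
        System.out.println(q.ack(2)); // prints 0
        System.out.println(q.ack(3)); // prints 0
        System.out.println(q.ack(4)); // prints 0: later acks cannot move the commit offset
    }
}
```

Under this model, one message that is never acknowledged pins the commit offset forever, no matter how many later messages succeed, which matches what we observed.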
Steps to Reproduce
We suspect that one consume thread exited abnormally without setting ConsumeStartTimeStamp. We then reproduced the issue using the hack below.
In ConsumeMessageConcurrentlyService.java, we changed the code: add a static member field, then make the very first ConsumeRequest return early in its run method, simulating a consume thread that exits without doing any work:

```java
// Added member field:
private static int count = 0;

// In method run of the inner class ConsumeRequest:
@Override
public void run() {
    if (this.processQueue.isDropped()) {
        log.info("xxxx");
        return;
    }

    // Added: the first consume request returns immediately, so its
    // messages are never consumed, never removed from the ProcessQueue,
    // and never get CONSUME_START_TIME set.
    if (count == 0) {
        count++;
        return;
    }

    // ... original consume logic unchanged ...
}
```
1. Recompile the rocketmq-client jar and install it.
2. Produce some messages to the queue.
3. Start one consumer instance; the issue then reproduces.
What Did You Expect to See?
We expected that after some time the queue would recover on its own: minOffset would resume growing normally and the accumulated message count would return to normal. We do not want to restart the consumer instance, since it would consume old messages again.
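The self-healing we expected normally comes from the client's periodic expired-message sweep, which relies on the CONSUME_START_TIME property to decide how long a message has been in flight. The sketch below (a simplified model with assumed names and an assumed 15-minute timeout, not the real ProcessQueue.cleanExpiredMsg code) shows why a message whose start time was never recorded can never be judged expired:

```java
import java.util.Iterator;
import java.util.Map;

// Sketch of a timeout sweep over in-flight messages: a message can only be
// judged expired if its consume-start timestamp was actually recorded.
class ExpiredMessageSweeper {
    static final long CONSUME_TIMEOUT_MS = 15 * 60 * 1000; // assumed timeout

    // startTimes: offset -> consume start time in ms, or null if never set.
    // Removes expired entries and returns how many were expired.
    static int sweep(Map<Long, Long> startTimes, long nowMs) {
        int expired = 0;
        for (Iterator<Map.Entry<Long, Long>> it = startTimes.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<Long, Long> e = it.next();
            Long start = e.getValue();
            if (start == null) {
                // No CONSUME_START_TIME recorded: the sweep cannot tell how
                // long this message has been stuck, so it is never expired.
                continue;
            }
            if (nowMs - start > CONSUME_TIMEOUT_MS) {
                it.remove(); // stands in for "send back to broker, drop locally"
                expired++;
            }
        }
        return expired;
    }
}
```

Under this model the stuck message survives every sweep, which matches what we saw: only restarting the consumer instance, which rebuilds the in-flight state, clears it.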
What Did You See Instead?
minOffset keeps a fixed value and never grows, while the accumulated message count keeps growing.
Additional Context
In our case