Open chyezh opened 5 months ago
/assign
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
keep it
/reopen
/assign @aoiasd
Here's the bug explanation. The Rocksmq consumer's delivery logic works as below:
```go
func (c *client) consume(consumer *consumer) {
	for {
		select {
		...
		case _, ok := <-consumer.MsgMutex():
			if !ok {
				// consumer MsgMutex closed, goroutine exit
				log.Debug("Consumer MsgMutex closed")
				return
			}
			c.deliver(consumer)
		}
	}
}
```
The consumption notification signal, MsgMutex, is a channel that the producer triggers after writing a message. That notification may be lost if the MsgMutex channel buffer is already full:

```go
msgMutex chan struct{}
```

So once the producer stops sending messages, the tailing messages may never be consumed, because MsgMutex is never notified again.
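To illustrate the lost-notification failure mode in isolation: the pattern is a non-blocking send into a 1-buffered `chan struct{}`, so any notification that arrives while a signal is already pending is silently dropped. This standalone sketch (the `notify` helper and `queue` counter are illustrative, not rocksmq code) shows two of three signals being lost, leaving tailing messages behind if the consumer only delivers one message per wake-up:

```go
package main

import "fmt"

func main() {
	// Mirrors rocksmq's notification channel: buffered, size 1.
	msgMutex := make(chan struct{}, 1)

	// The producer notifies with a non-blocking send: if the buffer
	// is already full, the signal is silently dropped.
	notify := func() bool {
		select {
		case msgMutex <- struct{}{}:
			return true
		default:
			return false // notification lost
		}
	}

	pending := 0 // messages waiting to be delivered

	// Producer appends three messages back to back, faster than the
	// consumer reads the channel.
	var sent []bool
	for i := 0; i < 3; i++ {
		pending++
		sent = append(sent, notify())
	}
	fmt.Println(sent) // [true false false] -- two signals lost

	// Consumer wakes once; if it delivers only one message per
	// signal, the tailing messages are stranded.
	<-msgMutex
	pending--
	fmt.Println("pending after wake-up:", pending) // pending after wake-up: 2
}
```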
Is there an existing issue for this?
Environment
Current Behavior
The rocksmq consumer's deliver loop may miss the producer's signal, so the consumer can get stuck once no more messages are produced.
https://github.com/milvus-io/milvus/blob/66710008d63c42633acbb785d4b9313513f07819/internal/mq/mqimpl/rocksmq/client/client_impl.go#L149
Expected Behavior
No response
Steps To Reproduce
Milvus Log
No response
Anything else?
It still works on current Milvus because there is always a periodic timetick message, though the consuming rate is limited by it. Current Milvus also uses a large-buffered message channel by default, so the problem is hard to reproduce in a Milvus environment.
But since we will introduce a WAL interface in the future, the bug should be fixed. #33285