Closed: yuhaii closed this issue 4 years ago
We recently released newer versions of the Event Hubs libraries that contain a fix for this issue. Could you please try updating the version and see if you still have this issue?
azure-messaging-eventhubs - 5.2.0
azure-messaging-eventhubs-checkpointstore-blob - 1.2.0
This issue is related to #13785
Thanks, @srnagar! We will try this SDK version.
Hello @srnagar, before we migrate to the new version, can you confirm whether the issue is fixed? Switching versions consumes a lot of resource bandwidth and carries significant additional cost for re-testing quality and load performance.
I would like a confirmation from the relevant product team so that we can move ahead with confidence and avoid any unnecessary migration.
@yuhaii we had similar issues on our setup and for now it seems resolved. So the update worked for us.
Got it. Thanks for your confirmation, @vinceve.
Thanks for the confirmation @vinceve! Closing this issue.
@srnagar last night it stopped working for us. I guess the bug still persists.
The blue line is incoming messages, and the orange line is outgoing messages after a reboot.
I will send you the logs.
@vinceve, any updates on that? The same issue just occurred in one of my consumers. We are using version 5.2.0.
Happy new year, @srnagar. This issue reproduced again on 5.2.0.
We observed that checkpointing of the partition was stuck for a couple of days until we reset it manually. Please find the attached screenshot of the metric.
With the old SDK, we could break the lease on that checkpoint file to mitigate the issue. But with the new SDK, the checkpoint file has already been released, so we have to restart the application. This is our production application. Is there a good workaround if you can't fix this issue immediately? We don't want to restart the production application each time this happens.
Could you please help double-check this issue? Thanks in advance.
@yuhaii as discussed offline, please use version 5.3.1 as it contains a fix for this issue.
Understood, let us try v5.3.1. Thanks for your confirmation, @srnagar!
Hello @srnagar, good day. Our customer reported that they ran load tests with the latest Event Hubs SDK version, and we are still facing a checkpoint-related issue.
compile group: 'com.azure', name: 'azure-messaging-eventhubs', version: '5.4.0'
compile group: 'com.azure', name: 'azure-messaging-eventhubs-checkpointstore-blob', version: '1.4.0'
The checkpoint and ownership blobs are not getting updated. Please find the details below for reference:
Could you please help us double-check this issue? Thank you.
@yuhaii could you please share logs from when this issue happened? This is a different case from partitions that stopped receiving events. Here the ownership is not being updated, and we need logs to investigate further.
We started seeing the same issue with the Java SDK (azure-eventhubs-eph v2.1.0). Has this been addressed in the EPH library too? @srnagar
Describe the bug
We use the SDK below to receive messages from Event Hub.
But one partition, #3, suddenly stopped receiving messages at 9/11 1:22 UTC. We can see that its checkpoint was no longer updated.
The outgoing message count dropped accordingly.
It recovered at 9/11 5:02 UTC. We can see that the partition #3 checkpoint resumed updating at this time.
We checked the sent messages and confirmed that messages continued to be sent to Event Hub partition #3 from 9/11 1:22 to 5:02 UTC. However, we checked the logs in the customer code and confirmed that the receive callback function processContext for partition #3 was not called during this time range.
public EventProcessorClient eventProcessorClientBuilder(
        @Autowired CheckpointStore checkpointStore,
        @Autowired EventHubRecordProcessor eventHubRecordProcessor) {
}
public void processContext(EventBatchContext eventContext) {
}
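The stall described above (the callback silently not being invoked for one partition while the others keep running) can be detected from application code. What follows is only a minimal sketch, not part of the Azure SDK and not a fix for the underlying issue: `PartitionStallDetector`, `recordCallback`, and `stalledPartitions` are hypothetical names introduced here for illustration. It records the time of the last callback per partition and reports partitions that have been silent longer than a threshold.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not an SDK class): tracks when each partition's
// callback last ran so that a silent partition can be flagged.
class PartitionStallDetector {
    private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();
    private final Duration threshold;

    PartitionStallDetector(Duration threshold) {
        this.threshold = threshold;
    }

    // Call at the top of the processing callback with the batch's partition ID.
    void recordCallback(String partitionId, Instant now) {
        lastSeen.put(partitionId, now);
    }

    // Returns, in sorted order, the partitions whose last callback is older
    // than the threshold. Partitions that never had a callback are not tracked.
    List<String> stalledPartitions(Instant now) {
        List<String> stalled = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : lastSeen.entrySet()) {
            if (Duration.between(entry.getValue(), now).compareTo(threshold) > 0) {
                stalled.add(entry.getKey());
            }
        }
        Collections.sort(stalled);
        return stalled;
    }
}
```

In the `processContext` callback, `recordCallback` could be called with the partition ID (for example from `eventContext.getPartitionContext().getPartitionId()`), and a scheduled task could log or alert on `stalledPartitions(Instant.now())`. A partition with no incoming traffic looks the same as a stalled one, so any alert should be cross-checked against the incoming-message metric.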
We tried updating the SDK to the following latest beta version, but the issue still exists.
https://mvnrepository.com/artifact/com.azure/azure-messaging-eventhubs/5.2.0-beta.2
https://mvnrepository.com/artifact/com.azure/azure-messaging-eventhubs-checkpointstore-blob/1.2.0-beta.2
Exception or Stack Trace
No exception. When messages are sent to the hub, the receiver callback processContext is not called for the affected partition. The affected partition is random; in the latest reproduction it was partition #3.
To Reproduce
Steps to reproduce the behavior: run the attached code for 2-3 days and the issue will reproduce.
Code Snippet
I attached the code snippet for reference.
Expected behavior
The callback should be called normally for all partitions.
Screenshots
See the screenshots in the description.
Setup (please complete the following information):