Open lmolkova opened 1 year ago
It can also affect any scenario in which users stop processors before the application ends, leaving load balancing active for an unknown amount of time.
The assumption is that TracingIntegrationTests.sendBuffered is slightly flaky for this reason: another consumer still owns the partition it tries to receive from, and the receive fails with:
com.azure.core.amqp.exception.AmqpException: At least one receiver for the endpoint is created with epoch of '0', and so non-epoch receiver is not allowed. Either reconnect with a higher epoch, or make sure all epoch receivers are closed or disconnected. TrackingId:40473a93-e37f-4d9e-accf-bff37150963e_B28, SystemTracker:eventhubt7cce3cdb680e49e1:eventhub:javaeventhub~6553, Timestamp:2023-03-07T16:59:00 Reference:88fcfcbe-3d57-483f-b6fa-e84ddeeeb8a0, TrackingId:dac4616d-9d49-4fd1-a017-536dc39971a8_B28, SystemTracker:eventhubt7cce3cdb680e49e1:eventhub:javaeventhub~6553|$default, Timestamp:2023-03-07T16:59:00 TrackingId:30bd1b3959dc4411a5ccf669c656075e_G28, SystemTracker:gateway5, Timestamp:2023-03-07T16:59:00, errorContext[NAMESPACE: eventhubt7cce3cdb680e49e1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: javaeventhub/ConsumerGroups/$Default/Partitions/0, REFERENCE_ID: 0_576953_1678208340184, LINK_CREDIT: 0]
Integration tests that involve processors can be flaky as a result of contention between consumers owning partitions.
Investigation showed that PartitionBasedLoadBalancer remains active for a long time after the processor that owns it is stopped. We need a way to stop load balancing after the processor is stopped. An attempt was made here https://github.com/Azure/azure-sdk-for-java/pull/33600/commits/3822f028e3f38dff72f83fd7940b54cee98cc40f, but according to https://github.com/Azure/azure-sdk-for-java/pull/33600#discussion_r1124829433 it could cause a memory leak.
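
To illustrate the kind of behavior being asked for, here is a minimal sketch using only java.util.concurrent. It is not the SDK's implementation: the class StoppableLoadBalancerSketch and its methods are hypothetical, and the real load-balancing loop is replaced by a counter. The assumption is that stop() should cancel the periodic task, shut down the scheduler, and wait for it to terminate, so that no further load-balancing iteration can claim a partition after stop() returns.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a processor that runs a periodic load-balancing task. */
class StoppableLoadBalancerSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger iterations = new AtomicInteger();
    private ScheduledExecutorService scheduler;
    private ScheduledFuture<?> task;

    /** Starts the periodic task; a second call while running is a no-op. */
    synchronized void start(long intervalMillis) {
        if (!running.compareAndSet(false, true)) {
            return;
        }
        scheduler = Executors.newSingleThreadScheduledExecutor();
        task = scheduler.scheduleWithFixedDelay(
            this::loadBalanceOnce, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Placeholder for one load-balancing pass (assumption: real work would claim partitions here). */
    private void loadBalanceOnce() {
        // Guard so a pass that was already queued does nothing once stop() has been called.
        if (!running.get()) {
            return;
        }
        iterations.incrementAndGet();
    }

    /**
     * Cancels the periodic task, shuts the scheduler down, and waits for it to finish.
     * Returns true if the scheduler terminated within the timeout.
     */
    synchronized boolean stop(long timeoutMillis) throws InterruptedException {
        if (!running.compareAndSet(true, false)) {
            return true; // already stopped or never started
        }
        task.cancel(true);
        scheduler.shutdownNow(); // releases the worker thread so it does not outlive the processor
        return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    int iterations() {
        return iterations.get();
    }

    boolean isRunning() {
        return running.get();
    }
}
```

In this sketch the caller blocks in stop() until the scheduler has terminated, which trades a short wait for the guarantee that nothing is still running afterwards. Whether the same approach is safe for the asynchronous work inside PartitionBasedLoadBalancer is exactly the open question raised in the linked review discussion.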