Azure / azure-service-bus-java

☁️ Java client library for Azure Service Bus
https://azure.microsoft.com/services/service-bus
MIT License
60 stars 59 forks source link

QueueClient stops receiving messages and shuts down #351

Closed ntherrien closed 5 years ago

ntherrien commented 5 years ago

Focus on exceptions No exceptions are raised.

Program is operating normally for no more than 50 messages and then shows:

18:24:48.355 [Finalizer] INFO com.microsoft.azure.servicebus.MessageAndSessionPump - Closing message and session pump on entity 'sip_out' 18:24:48.356 [Finalizer] INFO com.microsoft.azure.servicebus.MessageReceiver - Closing MessageReceiver to entity 'sip_out' 18:24:48.357 [Finalizer] DEBUG com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Cancelled SAS Token renew timer 18:24:48.358 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Closing receive link to 'sip_out' 18:24:48.359 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:Receiver_b39bf3_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.361 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 18:24:48.361 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.ReceiveLinkHandler - onDelivery: linkName:Receiver_b39bf3_54bb7ff856614cb78f9240ef797cd8fc_G28, updatedLinkCredit:1, remoteCredit:0, remoteCondition:Error{condition=null, description='null', info=null}, delivery.isPartial:false 18:24:48.361 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Received a delivery ')j��@A��#bw��' from 'sip_out' 18:24:48.361 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Received a message from 'sip_out'. Adding to prefecthed messages. 18:24:48.391 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - link remote close. linkName:Receiver_b39bf3_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.391 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - linkName:Receiver_b39bf3_54bb7ff856614cb78f9240ef797cd8fc_G28, ErrorCondition:null, null 18:24:48.396 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Receive link to 'sip_out', sessionId 'null' closed 18:24:48.396 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Completeing all pending receive and updateState operation on the receiver to 'sip_out' 18:24:48.396 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionRemoteClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 18:24:48.396 [pool-1-thread-3] INFO com.microsoft.azure.servicebus.MessageReceiver - Closed MessageReceiver to entity 'sip_out' 18:24:48.397 [pool-1-thread-4] DEBUG com.microsoft.azure.servicebus.MessageAndSessionPump - Receive from entity 'sip_out' returned no messages. 18:24:48.398 [pool-1-thread-4] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing messaging factory 18:24:48.398 [pool-1-thread-4] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing CBS link 18:24:48.398 [pool-1-thread-4] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing requestresponselink to $cbs by closing both internal sender and receiver links. 18:24:48.399 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing internal send link of requestresponselink to $cbs 18:24:48.400 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Sender_fb8f0b_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.401 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 18:24:48.429 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - link remote close. linkName:RequestResponseLink-Sender_fb8f0b_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.429 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - linkName:RequestResponseLink-Sender_fb8f0b_54bb7ff856614cb78f9240ef797cd8fc_G28, ErrorCondition:null, null 18:24:48.429 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closed internal send link of requestresponselink to $cbs 18:24:48.429 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionRemoteClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 18:24:48.431 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing internal receive link of requestresponselink to $cbs 18:24:48.431 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Receiver_6d7c23_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.432 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 18:24:48.461 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - link remote close. linkName:RequestResponseLink-Receiver_6d7c23_54bb7ff856614cb78f9240ef797cd8fc_G28 18:24:48.461 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - linkName:RequestResponseLink-Receiver_6d7c23_54bb7ff856614cb78f9240ef797cd8fc_G28, ErrorCondition:null, null 18:24:48.461 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closed internal receive link of requestresponselink to $cbs 18:24:48.461 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionRemoteClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 18:24:48.461 [pool-1-thread-4] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLinkcache - Closing all cached request-response links 18:24:48.463 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing connection to host 18:24:48.463 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.ConnectionHandler - onConnectionLocalClose: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671 18:24:48.492 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.ConnectionHandler - onConnectionRemoteClose: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671,errorCondition:null,null 18:24:48.493 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing all links on the connection. Number of links '0' 18:24:48.493 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.primitives.MessagingFactory - Closed all links on the connection. Number of links '0' 18:24:48.493 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Connection to host closed. 18:24:48.494 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionFinal - entityName: $cbs 18:24:48.494 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionFinal - entityName: $cbs 18:24:48.494 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionFinal - entityName: sip_out 18:24:48.494 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] DEBUG com.microsoft.azure.servicebus.amqp.ConnectionHandler - onConnectionFinal: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671 18:24:48.514 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Gracefully releasing reactor thread as messaging factory is closed 18:24:48.514 [ReactorThread2ae15400-453d-41d8-a798-2a29ade73f01] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Stopping reactor 18:24:48.715 [ForkJoinPool-1-worker-25] INFO com.***mycompany***.IncomingMessagesApplicationCallback -

Describe the bug The problem is that we stop receiving messages for no apparent reason. There is no exception.

To Reproduce We are using the code like this:

`final QueueClient sipOutQueue = new QueueClient(new ConnectionStringBuilder(sipOutQueueConnectionString), ReceiveMode.PEEKLOCK);

on which we register a IMessageReceiver: @Override public CompletableFuture onMessageAsync(IMessage message) {

        if (message.getLabel() != null &&
                message.getContentType() != null &&
                message.getLabel().contentEquals("Message") &&
                message.getContentType().contentEquals("application/text")) {

            byte[] data = message.getBody();

// do...

    return CompletableFuture.completedFuture(null);
}

`

We also get Exceptions at different times which indicate the connection was closed on timeout (60s). Those exceptions dont seem to correlate well with the logs provided above.

Expected behavior I don't understand why the queueclient would stop receiving messages and close the link?

Environment (please complete the following information):

ntherrien commented 5 years ago

"onConnectionRemoteClose" in previous logs seems to be a clue.

I got an exception today while doing more tests. Again it seems to be about remote link closed.

Exception Phase is "COMPLETE" which from the source code appears to be when an exception occurs during the completion of a message. In this case, it would be because of a remote connection closed.

10:42:50.836 [pool-1-thread-4] DEBUG com.microsoft.azure.servicebus.MessageAndSessionPump - Completing message with sequence number '15111' 10:42:50.836 [pool-1-thread-4] DEBUG com.microsoft.azure.servicebus.MessageReceiver - Completing message with lock token '26fa2e11-8e36-475e-89e0-943260731f0b' 10:42:50.836 [pool-1-thread-4] DEBUG com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Updating message state of delivery '.�&6�^G���2s ' to 'Accepted{}' 10:42:50.837 [Finalizer] INFO com.microsoft.azure.servicebus.MessageAndSessionPump - Closing message and session pump on entity 'sip_out' 10:42:50.837 [Finalizer] INFO com.microsoft.azure.servicebus.MessageReceiver - Closing MessageReceiver to entity 'sip_out' 10:42:50.840 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Closing receive link to 'sip_out' 10:42:50.842 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7 10:42:50.843 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 10:42:50.845 [Finalizer] DEBUG com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Cancelled SAS Token renew timer 10:42:50.866 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - link remote close. linkName:Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7 10:42:50.866 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - linkName:Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7, ErrorCondition:null, null 10:42:50.870 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Receive link to 'sip_out', sessionId 'null' closed 10:42:50.870 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] INFO com.microsoft.azure.servicebus.primitives.CoreMessageReceiver - Completeing all pending receive and updateState operation on the receiver to 'sip_out' 10:42:50.870 [pool-1-thread-1] INFO com.microsoft.azure.servicebus.MessageReceiver - Closed MessageReceiver to entity 'sip_out' 10:42:50.870 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionRemoteClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 10:42:50.871 [pool-1-thread-1] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing messaging factory 10:42:50.871 [pool-1-thread-1] INFO com.microsoft.azure.servicebus.primitives.MessagingFactory - Closing CBS link 10:42:50.871 [pool-1-thread-1] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing requestresponselink to $cbs by closing both internal sender and receiver links. 10:42:50.875 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing internal send link of requestresponselink to $cbs 10:42:50.875 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Sender_ef122b_a608208b22f34345a87e9ec937470c47_G7 10:42:50.875 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 10:42:50.876 [pool-1-thread-4] ERROR com.microsoft.azure.servicebus.MessageAndSessionPump - Completing message with sequence number '15111' failed com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=null, description='null', info=null}, errorContext[NS: experimental-sipk8s-bus-01.servicebus.windows.net, PATH: sip_out, REFERENCE_ID: Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7, PREFETCH_COUNT: 0, LINK_CREDIT: 0, PREFETCH_Q_LEN: 0] at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:121) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onClose(CoreMessageReceiver.java:954) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42) at proton.j@0.31.0/org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176) at proton.j@0.31.0/org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) at proton.j@0.31.0/org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324) at proton.j@0.31.0/org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:483) at java.base/java.lang.Thread.run(Thread.java:834) 10:42:50.898 [pool-3-thread-1] ERROR com.motorolasolutions.cc911.experimental.sipk8s.sipdestack.siplisteningpoint.OutgoingMessagesApplicationCallback - COMPLETE-Error{condition=null, description='null', info=null}, errorContext[NS: experimental-sipk8s-bus-01.servicebus.windows.net, PATH: sip_out, REFERENCE_ID: Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7, PREFETCH_COUNT: 0, LINK_CREDIT: 0, PREFETCH_Q_LEN: 0] com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=null, description='null', info=null}, errorContext[NS: experimental-sipk8s-bus-01.servicebus.windows.net, PATH: sip_out, REFERENCE_ID: Receiver_69fcee_a608208b22f34345a87e9ec937470c47_G7, PREFETCH_COUNT: 0, LINK_CREDIT: 0, PREFETCH_Q_LEN: 0] at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:121) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onClose(CoreMessageReceiver.java:954) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42) at proton.j@0.31.0/org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176) at proton.j@0.31.0/org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) at proton.j@0.31.0/org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324) at proton.j@0.31.0/org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291) at azure.servicebus@1.2.12/com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:483) at java.base/java.lang.Thread.run(Thread.java:834) 10:42:50.898 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - link remote close. linkName:RequestResponseLink-Sender_ef122b_a608208b22f34345a87e9ec937470c47_G7 10:42:50.898 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - linkName:RequestResponseLink-Sender_ef122b_a608208b22f34345a87e9ec937470c47_G7, ErrorCondition:null, null 10:42:50.898 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] INFO com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closed internal send link of requestresponselink to $cbs 10:42:50.898 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionRemoteClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 10:42:50.901 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.primitives.RequestResponseLink - Closing internal receive link of requestresponselink to $cbs 10:42:50.901 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Receiver_989530_a608208b22f34345a87e9ec937470c47_G7 10:42:50.901 [ReactorThread79b798c9-8a01-48ea-aec9-520db7fe7f14] DEBUG com.microsoft.azure.servicebus.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} Disconnected from the target VM, address: '127.0.0.1:63215', transport: 'socket'`

yvgopal commented 5 years ago

Closing message and session pump on entity 'sip_out' - That trace is written only when your code is closing the QueueClient. Your code is calling QueueClient at some point. Please go and check it. Or write a sample console app and see what happens. I am closingthe ticket. Please reopen if you have more info.

This is standard functionality of the SDK and it is working for all customers. Even if Azure Service Bus closes a link or connection, the SDK reopens it automatically.

ntherrien commented 5 years ago

Hi,

After reading your answer and testing further, I can confirm with 100% certainty that the program is not requesting the queueclient to close in any way.

I have done like you suggested and written a sample console app using only the 'sip_out' queue. When working with only that queue, I could not reproduce the problem. However, when I add a second 'sip_in' queue in the same program, problem starts appearing again.

Our program has two QueueClient, one for incoming messages (called 'sip_out') and one four outgoing messages (called 'sip_in'). (note: don't worry about the names of the queues, the program is adapting another protocol to messages, meaning that incoming packets are parsed and written to the sip_in message queue for other programs to read them)

At first I thought it could be because QueueClient wasn't thread safe so I made everything run sequentially on a single thread. Problem still occurs.

I have also removed all logic from the program and the problem still occurs. So basically the message receiver does not do anything with any incoming messages aside from printing traces to console.

I triple checked that our program isn't shutting down or calling the close() method on the clients. There's absolutely no interaction with the queueclient classes aside from calling "send()" on the outgoing messages queue and a message receiver on the incoming queue (called 'sip_out').

I managed to prepare a simplified logs of what's going on. Remember, these are only traces in the program, no work is actually being performed outside of queues: message received from sipout parsing sending to channel sent to channel 227677 [Finalizer] INFO c.m.a.s.MessageAndSessionPump - Closing message and session pump on entity 'sip_out' 227678 [Finalizer] INFO c.m.a.servicebus.MessageReceiver - Closing MessageReceiver to entity 'sip_out' 227684 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.p.CoreMessageReceiver - Closing receive link to 'sip_out' 227715 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.p.CoreMessageReceiver - Receive link to 'sip_out', sessionId 'null' closed 227715 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.p.CoreMessageReceiver - Completeing all pending receive and updateState operation on the receiver to 'sip_out' 227717 [pool-1-thread-4] INFO c.m.a.servicebus.MessageReceiver - Closed MessageReceiver to entity 'sip_out' 227719 [pool-1-thread-4] INFO c.m.a.s.primitives.MessagingFactory - Closing messaging factory 227719 [pool-1-thread-4] INFO c.m.a.s.primitives.MessagingFactory - Closing CBS link 227719 [pool-1-thread-4] INFO c.m.a.s.p.RequestResponseLink - Closing requestresponselink to $cbs by closing both internal sender and receiver links. 227744 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.p.RequestResponseLink - Closed internal send link of requestresponselink to $cbs 227769 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.p.RequestResponseLink - Closed internal receive link of requestresponselink to $cbs 227769 [pool-1-thread-4] INFO c.m.a.s.p.RequestResponseLinkcache - Closing all cached request-response links 227771 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.primitives.MessagingFactory - Closing connection to host 227794 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.primitives.MessagingFactory - Connection to host closed. 227815 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.primitives.MessagingFactory - Gracefully releasing reactor thread as messaging factory is closed 227815 [ReactorThread631394f8-9550-4ea4-bd5a-0dd97878af37] INFO c.m.a.s.primitives.MessagingFactory - Stopping reactor

As you can see, a message is received from 'sip_out', then the program prints out some traces to standard out without actually doing anything with the message and then sip_out dies. just like that.

As to when this happen, I haven't found a reliable pattern yet. But it always happen eventually. Usually after processing 50 messages or so (about 60 seconds).

I found another issue similar to this: https://github.com/Azure/azure-service-bus-java/issues/219

and at the end of that thread, the microsoft recommended solution is to write code to reconnect after a connection is lost.... is this really microsoft's position on the subject?

ntherrien commented 5 years ago

By the way, why did you close this issue. problem is not resolved?

I don't see a way of reopening it.

ntherrien commented 5 years ago

Here's another log exerpt. It shows how the 'sip_out' message receiver always shuts down after a message was sent to 'sip_in'. Could be coincidence.

sip request wrapping sending to queue 'sip_in' 181439 [pool-1-thread-1] DEBUG c.m.a.s.p.CoreMessageSender - Sending message to 'sip_in' 181440 [ReactorThread7e9bbbd2-c8c4-400d-ad84-8aa0a4159253] DEBUG c.m.a.s.p.CoreMessageSender - Processing pending sends to 'sip_in'. Available link credit '930' 181440 [ReactorThread7e9bbbd2-c8c4-400d-ad84-8aa0a4159253] DEBUG c.m.a.s.p.CoreMessageSender - Sending message delivery '08d8d1497d354f3bba94ee0295f2e038' to 'sip_in' 181440 [ReactorThread7e9bbbd2-c8c4-400d-ad84-8aa0a4159253] DEBUG c.m.a.s.p.CoreMessageSender - There are no pending sends to 'sip_in'. 181489 [ReactorThread7e9bbbd2-c8c4-400d-ad84-8aa0a4159253] DEBUG c.m.a.s.amqp.SendLinkHandler - onDelivery: linkName:a13d5d_5dbffa39d3f74580847c01f58d33b964_G25, unsettled:1, credit:929, deliveryState:Accepted{}, delivery.isBuffered:false, delivery.tag:[48, 56, 100, 56, 100, 49, 52, 57, 55, 100, 51, 53, 52, 102, 51, 98, 98, 97, 57, 52, 101, 101, 48, 50, 57, 53, 102, 50, 101, 48, 51, 56] 181489 [ReactorThread7e9bbbd2-c8c4-400d-ad84-8aa0a4159253] DEBUG c.m.a.s.p.CoreMessageSender - Received ack for delivery. path:sip_in, linkName:a13d5d_5dbffa39d3f74580847c01f58d33b964_G25, deliveryTag:08d8d1497d354f3bba94ee0295f2e038, outcome:Accepted{} sent to 'sip_in' 188194 [Finalizer] INFO c.m.a.s.MessageAndSessionPump - Closing message and session pump on entity 'sip_out' 188194 [Finalizer] INFO c.m.a.servicebus.MessageReceiver - Closing MessageReceiver to entity 'sip_out' 188196 [Finalizer] DEBUG c.m.a.s.p.CoreMessageReceiver - Cancelled SAS Token renew timer 188196 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.p.CoreMessageReceiver - Closing receive link to 'sip_out' 188196 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - local link close. linkName:Receiver_1bcf0a_9a5136f4185c4368b0b779bb6654aac8_G5 188198 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionLocalClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 188222 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - link remote close. linkName:Receiver_1bcf0a_9a5136f4185c4368b0b779bb6654aac8_G5 188222 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - linkName:Receiver_1bcf0a_9a5136f4185c4368b0b779bb6654aac8_G5, ErrorCondition:null, null 188227 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.p.CoreMessageReceiver - Receive link to 'sip_out', sessionId 'null' closed 188227 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.p.CoreMessageReceiver - Completeing all pending receive and updateState operation on the receiver to 'sip_out' 188228 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionRemoteClose - entityName: sip_out, condition: Error{condition=null, description='null', info=null} 188228 [pool-1-thread-4] INFO c.m.a.servicebus.MessageReceiver - Closed MessageReceiver to entity 'sip_out' 188228 [pool-1-thread-5] DEBUG c.m.a.s.MessageAndSessionPump - Receive from entity 'sip_out' returned no messages. 188231 [pool-1-thread-5] INFO c.m.a.s.primitives.MessagingFactory - Closing messaging factory 188231 [pool-1-thread-5] INFO c.m.a.s.primitives.MessagingFactory - Closing CBS link 188231 [pool-1-thread-5] INFO c.m.a.s.p.RequestResponseLink - Closing requestresponselink to $cbs by closing both internal sender and receiver links. 188232 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.p.RequestResponseLink - Closing internal send link of requestresponselink to $cbs 188233 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Sender_9e6105_9a5136f4185c4368b0b779bb6654aac8_G5 188234 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 188257 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - link remote close. linkName:RequestResponseLink-Sender_9e6105_9a5136f4185c4368b0b779bb6654aac8_G5 188257 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - linkName:RequestResponseLink-Sender_9e6105_9a5136f4185c4368b0b779bb6654aac8_G5, ErrorCondition:null, null 188257 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.p.RequestResponseLink - Closed internal send link of requestresponselink to $cbs 188257 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionRemoteClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 188259 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.p.RequestResponseLink - Closing internal receive link of requestresponselink to $cbs 188259 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - local link close. linkName:RequestResponseLink-Receiver_2eef42_9a5136f4185c4368b0b779bb6654aac8_G5 188259 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionLocalClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 188283 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - link remote close. linkName:RequestResponseLink-Receiver_2eef42_9a5136f4185c4368b0b779bb6654aac8_G5 188283 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.BaseLinkHandler - linkName:RequestResponseLink-Receiver_2eef42_9a5136f4185c4368b0b779bb6654aac8_G5, ErrorCondition:null, null 188283 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.p.RequestResponseLink - Closed internal receive link of requestresponselink to $cbs 188283 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionRemoteClose - entityName: $cbs, condition: Error{condition=null, description='null', info=null} 188283 [pool-1-thread-5] INFO c.m.a.s.p.RequestResponseLinkcache - Closing all cached request-response links 188285 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.primitives.MessagingFactory - Closing connection to host 188285 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.ConnectionHandler - onConnectionLocalClose: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671 188310 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.ConnectionHandler - onConnectionRemoteClose: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671,errorCondition:null,null 188311 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.primitives.MessagingFactory - Closing all links on the connection. Number of links '0' 188311 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.primitives.MessagingFactory - Closed all links on the connection. Number of links '0' 188311 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.primitives.MessagingFactory - Connection to host closed. 188312 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionFinal - entityName: $cbs 188312 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionFinal - entityName: $cbs 188312 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.SessionHandler - onSessionFinal - entityName: sip_out 188312 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] DEBUG c.m.a.s.amqp.ConnectionHandler - onConnectionFinal: hostname:experimental-sipk8s-bus-01.servicebus.windows.net:5671 188332 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.primitives.MessagingFactory - Gracefully releasing reactor thread as messaging factory is closed 188332 [ReactorThreadd0f71150-e9a2-4933-8550-9a6b41f2224c] INFO c.m.a.s.primitives.MessagingFactory - Stopping reactor

ntherrien commented 5 years ago

Question: how would one listen to the "close" event so I can write code to reconnect???

yvgopal commented 5 years ago

You don't have to listen for close event and reconnect. Close is also called by finalize. Could you check if your queueclient instance is getting garbage collected for any reason?

ntherrien commented 5 years ago

Hi. I dont think the object is getting garbage collected. See for yourself in this console sample app: package test;

import com.microsoft.azure.servicebus.Message; import com.microsoft.azure.servicebus.MessageHandlerOptions; import com.microsoft.azure.servicebus.QueueClient; import com.microsoft.azure.servicebus.ReceiveMode; import com.microsoft.azure.servicebus.primitives.ConnectionStringBuilder; import com.microsoft.azure.servicebus.primitives.ServiceBusException;

import java.time.Duration; import java.util.UUID; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors;

public class Main { private static final String sipOutQueueConnectionString = "Endpoint=***;EntityPath=sip_out";

public static void main(String[] args) {
    final QueueClient sipOutQueue;
    try {
        final String sipInQueueConnectionString = "Endpoint=***;EntityPath=sip_in";
        final QueueClient sipInQueue = new QueueClient(new ConnectionStringBuilder(sipInQueueConnectionString), ReceiveMode.PEEKLOCK);
        sipOutQueue = new QueueClient(new ConnectionStringBuilder(sipOutQueueConnectionString), ReceiveMode.PEEKLOCK);

        // We are using single thread executor as we are only processing one message at a time
        ExecutorService executorService = Executors.newSingleThreadExecutor();

        // register the RegisterMessageHandler callback with executor service
        final OutgoingMessagesApplicationCallback outgoingMessagesApplicationCallback = new OutgoingMessagesApplicationCallback();
        sipOutQueue.registerMessageHandler(outgoingMessagesApplicationCallback,
                // 1 concurrent call, messages are auto-completed, auto-renew duration
                new MessageHandlerOptions(1, true, Duration.ofMinutes(1)),
                executorService);

        while(true) {
            String callid = UUID.randomUUID().toString(); //745-14184
            Message message = new Message("INVITE sip:service@10.10.1.1:5060 SIP/2.0\n" +
                    "Via: SIP/2.0/UDP 10.10.1.10:5060;branch=z9hG4bK-14184-745-0\n" +
                    "From: \"sipp\" <sip:sipp@10.10.1.10:5060>;tag=14184SIPpTag00745\n" +
                    "To: \"service\" <sip:service@10.10.1.1:5060>\n" +
                    "Call-ID: " + callid + "@10.10.1.10\n" +
                    "CSeq: 1 INVITE\n" +
                    "Contact: <sip:sipp@10.10.1.10:5060>\n" +
                    "Max-Forwards: 70\n" +
                    "Subject: Performance Test\n" +
                    "Content-Type: application/sdp\n" +
                    "Content-Length: 131\n" +
                    "\n" +
                    "v=0\n" +
                    "o=user1 53655765 2353687637 IN IP4 10.10.1.10\n" +
                    "s=-\n" +
                    "c=IN IP4 10.10.1.10\n" +
                    "t=0 0\n" +
                    "m=audio 6000 RTP/AVP 0\n" +
                    "a=rtpmap:0 PCMU/8000");
            message.setContentType("application/text");
            message.setLabel("Incoming SIP Message");
            message.setMessageId(UUID.randomUUID().toString());
            message.setTimeToLive(Duration.ofMinutes(2));

            System.out.println("sending to queue 'sip_in'");
            sipInQueue.send(message);
            System.out.println("sent to 'sip_in'");

            Thread.sleep(1000);
        }
    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (ServiceBusException e) {
        e.printStackTrace();
    }
}

} `

ntherrien commented 5 years ago

Here is the sip out message handler: `package test;

import com.microsoft.azure.servicebus.*; import gov.nist.javax.sip.message.SIPMessage; import gov.nist.javax.sip.parser.StringMsgParser; import org.slf4j.Logger; import org.slf4j.LoggerFactory;

import java.text.ParseException; import java.util.concurrent.CompletableFuture;

public class OutgoingMessagesApplicationCallback implements IMessageHandler {

private static Logger LOGGER = LoggerFactory.getLogger(OutgoingMessagesApplicationCallback.class);

public OutgoingMessagesApplicationCallback() {
}

@Override
public CompletableFuture<Void> onMessageAsync(IMessage message) {
    // receives message is passed to callback
    try {
        if (message.getLabel() != null &&
                message.getContentType() != null &&
                message.getLabel().contentEquals("Outgoing SIP Message") &&
                message.getContentType().contentEquals("application/text")) {

            byte[] data = message.getBody();

            StringMsgParser parser = new StringMsgParser();
            SIPMessage sipMessage = null;
            sipMessage = parser.parseSIPMessage(data, true, false, null);

            System.out.println("Data print");

        }
    } catch (ParseException e) {
        LOGGER.error("Failed to parse SIP message.", e);
        e.printStackTrace();System.exit(-1);
    } catch (Exception e) {
        LOGGER.error("Unexpected exception.", e);
        e.printStackTrace();System.exit(-1);
    }
    return CompletableFuture.completedFuture(null);
}

@Override
public void notifyException(Throwable throwable, ExceptionPhase exceptionPhase) {
    LOGGER.error(exceptionPhase + "-" + throwable.getMessage(), throwable);
    throwable.printStackTrace();System.exit(-1);
}

} `

ntherrien commented 5 years ago

With the simplified sample code provided above, we can reproduce the problem.

Remove the sip_in queue and sip_out will have no issues.

ntherrien commented 5 years ago

We also tried a different queue implementation and have no issues with the exact same backend code.

ntherrien commented 5 years ago

Any update @yvgopal ?

yvgopal commented 5 years ago

Pasting some fragments from the logs you posted..

[Finalizer] INFO com.microsoft.azure.servicebus.MessageAndSessionPump - Closing message and session pump on entity 'sip_out' 18:24:48.356 [Finalizer] INFO com.microsoft.azure.servicebus.MessageAndSessionPump - Closing message and >session pump on entity 'sip_out'

Every time the QueueClient is closed by the Finalizer thread. That is the garbage collector, finalizing the QueueClient instance. Period.

Why is it getting finalized? I can't say. May be the System.exit() call you have in the notifyException() method of IMessageHandler. I have just found it. Why would you put System.exit() in a code path? Remove System.exit call. Check your code, check what is printed when it is running, and check the environment it is running on. I am not closing it this time. But I expect you to fix your application bug and close this issue.

This SDK is being used by many customers. Something as simple as this issue wouldn't have gone unnoticed.

ntherrien commented 5 years ago

Hi @yvgopal ,

From one professional to another, let's be careful when saying "other customers are using it, so the problem must be you". There's always a possibility that your customer's code may be problematic, but let's just keep an open mind here.

I am trying to provide you with as much facts as you need.

Fact 1: The program is not releasing its reference on the queue object You already mentioned the finalizer logs and I took your observation seriously. I got the team to review the code thoroughly to be sure. That's when the team added the System.exit(-1), a debugging method called "fail fast" which will cause the process to stop on any issue.

Keep in mind that with our problem, the receiving queue stops receiving, but the program itself doesn't stop. In fact, the outgoing queue still works when that happens. If one of our System.exit were hit, the entire app would go down. Not the same symptom. Plus, we added those for debugging purposes, the problem was already there before we did that in the example console app.

You also asked that we do a small console app to help spot out the issue. We did that and provided the code to you in this thread. Did you try it?

Fact 2: The problem only occurs when using two queues concurrently AND under load I think this is important to note. From my experience, a common design with queues is that each app usually either subscribe or publish but rarely both. Our team noticed that the problem does NOT occur when using only the receiving queue (sip_out). The sending queue must be created and both queues must be under load for the problem to occur.

This is important. It proves that the code does not destroy the receiving queue. If we had a bug in our console example app, the same finalizer log would show up using only the receiving queue.

Only when we add the sending queue AND put some load will the problem eventually occur.

Does the azure-service-bus-java library have a live load test involving concurrent publish subscribe operations within the same app as part of its devops build pipeline? Is this something that's verified from build to build?

Fact 3: Swapping out the library for another fixes this problem When I say we took your Finalizer theory seriously, I really mean it. To be extra sure the problem couldn't be just that, and also because our team has deadlines to meet, we have decided to replace the Queue library with a REST api.

A REST api is hardly the same as using Qpid, but when you think about it, it is still a message passing technique, except there is no broker.

Important fact: swapping out this message passing layer also solves the problem. What this proves is that the business logic from our real application is most likely okay. I think this further entices the rationale that we should probably stop looking for a cause in the customer code and start considering another explanation for the Finalizer logs.

I think this summarizes the facts that we have for you at this time. Hopefully, these will help you find an alternate explanation as to how this can happen.

I did some research on other similar issues on this library and found this:

243 Internal receive link of requestresponselink to '$cbs' closed with error. (this guy is also using the queues in a concurrent fashion)

Could this be related?

I also want to say that since our team has moved away from this library in our production environment, I am okay with closing this issue if, from your perspective, you think you have exhausted all your options to diagnose this issue.

axisc commented 5 years ago

@ntherrien Thanks for your continued support and trying out the library.

A couple of things to note here -

Both the sender code for sip_in, as well as the receiver code for sip_out will have it's own connection so they should be decoupled from each other. This explains why -

It proves that the code does not destroy the receiving queue. If we had a bug in our console example app, the same finalizer log would show up using only the receiving queue.

I believe that all the options have been exhausted, so I'll close this for now. If you still feel you'd like us to explore this, please feel free to open a support ticket.

bobcx2 commented 5 years ago

I had the very same issue due to my queueclient instance being garbage collected. I saw the same logs and it happened at random intervals. I know this issue has been closed but I thought confirmation of this would be helpful.

bobcx2 commented 5 years ago

I meant to say, @ntherrien , if you move your QueueClient variables out of the main method and into instance variables in the code you included at: https://github.com/Azure/azure-service-bus-java/issues/351#issuecomment-480320849 . Do you still see the issue?

ntherrien commented 5 years ago

@bobcx2, just in case our team may have missed something, I did what you suggested. No change, bug still occurs under high load.

The behavior is consistent with the fact that all references to the variables remain valid until the main method exits (scoped variables are valid until the scope ends). If you debug the main function, you'll see there is a loop with a sleep that prevents the main method from exiting). Any exception in that loop will cause the loop to exit and print the exception to stdout.

But like I said, I had nothing to lose so I moved the queueclient variables out of the main method with no change to problem.

I think how the finalizer log is interpreted may be misleading the investigation here. Can't a problem occur in QueueClient that would trigger the finalizer? As far as i understand, the MessageAndSessionPump finalizer is not the same as a QueueClient, is it? Could the MessageAndSessionPump be destroyed for a reason other than QueueClient being garbage collected?

bobcx2 commented 5 years ago

That's interesting @ntherrien .

I agree that it seems like the MessageAndSessionPump could be destroyed for a reason other than the QueueClient being cleaned up.

To tell if this is the issue, can you add a log line at the top of your while loop and see if the loop is still running after the finalizer log line shows up? That should tell you if the loop is still running and therefore if your QueueClients are still in scope. (You may want to add a try/catch for all exceptions in your while loop to ensure it still continues to run even after an exception happens). Then if you can see your while loop is still running after finalize method that there may be a bug in the SDK.

Hopefully that makes sense. I'm eager to find out what you find as I hoped I had fixed my issue but if there is indeed an issue with the SDK that could strike at any minute then I'd like to know about it!

Thanks

Andycharalambous commented 3 years ago

I'm experiencing a very similar problem to this - and i can reproduce without any load at all - just start listening to a queue and let the application run for 10-20 minutes before i see the same logs saying things are being closed. At this point even if a message does appear in the queue, it will not be picked up until the application is restarted. Having read the thread on here, I'm attempting to figure out if my QueueClient is being garbage collected, but the MessageAndSessionPump being finalized due to garbage collection does seem more plausible. Does anyone have any more information to share since last year? I'm going crazy trying to figure this out.