Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.

Microsoft Azure Service Bus - Session closed when producing messages #37641

Open Andrexams opened 7 months ago

Andrexams commented 7 months ago

Query/Question

Hello,

Our application needs to send many (10k+) messages to a queue. We implement an outbox pattern: records are updated to a pending state, and a job reads these records and sends them to a queue. For performance, we split the records into batches of 25 messages and send each batch to the queue using a transacted jmsTemplate producer callback; for every batch, the session is either committed or rolled back. We are facing a problem: some messages were lost, and we received warning messages saying “The session is closed”. This warning appears on the producer thread, but it also appears on random @JmsListener consumers. Because this is logged as a warning and not thrown as an exception, we mark the records as OK in the database, but some messages are actually lost.
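
For readers following along, the batching step described above could look roughly like the sketch below. It is not the reporter's code: the class name `OutboxBatcher` and the method `partition` are hypothetical, and only the batch size of 25 comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original application.
final class OutboxBatcher {
    private OutboxBatcher() {}

    /** Splits records into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch returned by `partition(pendingRecords, 25)` would then be sent in its own transacted session, so a failure affects at most one batch.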

We are unable to reproduce this locally; the issue occurs intermittently in our environment. We have 24 @JmsListener consumers and 2 jobs for sending messages.

We think it is related to https://github.com/Azure/azure-sdk-for-java/issues/37042.

Could you help us please?

Producer stack:

Caught exception trying rollback() when putting session back into the pool, will invalidate. jakarta.jms.IllegalStateException: The Session is closed
jakarta.jms.IllegalStateException: The Session is closed
    at org.apache.qpid.jms.JmsSession.checkClosed(JmsSession.java:1113)
    at org.apache.qpid.jms.JmsSession.rollback(JmsSession.java:264)
    at org.messaginghub.pooled.jms.JmsPoolSession.close(JmsPoolSession.java:112)
    at org.springframework.jms.support.JmsUtils.closeSession(JmsUtils.java:109)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:513)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:534)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:526)
    at br.com.a.a.t.integration.usecase.integration.adapter.external.job.SendToOffTradeJobBase.enqueue(SendToOffTradeJobBase.java:84)
    at br.com.a.a.t.integration.usecase.integration.adapter.external.job.SendToOffTradeJobBase.updateStatusAndEnqueue(SendToOffTradeJobBase.java:77)
    at br.com.a.a.t.integration.usecase.integration.adapter.external.job.SendToOffTradeJobBase.prepareAndSend(SendToOffTradeJobBase.java:71)
    at br.com.a.a.t.integration.usecase.integration.adapter.external.job.SendToOffTradeJobBase.execute(SendToOffTradeJobBase.java:57)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)

@JmsListener stack:

Setup of JMS message listener invoker failed for destination 'queue-name' - trying to recover. Cause: The Session is closed 
Caught exception trying rollback() when putting session back into the pool, will invalidate. jakarta.jms.IllegalStateException: The Session is closed
jakarta.jms.IllegalStateException: The Session is closed
    at org.apache.qpid.jms.JmsSession.checkClosed(JmsSession.java:1113)
    at org.apache.qpid.jms.JmsSession.rollback(JmsSession.java:264)
    at org.messaginghub.pooled.jms.JmsPoolSession.close(JmsPoolSession.java:112)
    at org.springframework.jms.support.JmsUtils.closeSession(JmsUtils.java:109)
    at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.clearResources(DefaultMessageListenerContainer.java:1291)
    at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1137)
    at java.base/java.lang.Thread.run(Thread.java:831)

Pool configuration:

spring.jms.servicebus.idle-timeout=30000
spring.jms.servicebus.pool.enabled=true
spring.jms.servicebus.pool.max-connections=10
spring.jms.servicebus.pool.time-between-expiration-check=10000
spring.jms.servicebus.pool.idle-timeout=60000

JmsTemplate producer:

public void enqueue(List<StimulusConsolidation> stimulusConsolidationList) {
    // Use a transacted session so each batch is committed or rolled back as a unit.
    jmsTemplate.setSessionTransacted(true);
    jmsTemplate.execute(getMessageProducerCallback(stimulusConsolidationList));
}

public ProducerCallback<Message> getMessageProducerCallback(List<StimulusConsolidation> stimulusConsolidationList) {
    return (session, producer) -> {
        Queue queue = session.createQueue(getQueueName());
        try {
            for (StimulusConsolidation e : stimulusConsolidationList) {
                producer.send(queue, appMessageConverter.toMessage(e, session));
            }
            // Commit the whole batch once every message has been sent.
            session.commit();
        } catch (Exception e) {
            // Undo the partial batch and propagate the failure to the caller.
            session.rollback();
            throw e;
        }
        return null;
    };
}
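
Since the concern is that records get marked as OK even though messages were lost, one option is to make the outcome of each batch explicit and update the database only when commit() has returned normally. The sketch below illustrates that idea with a hypothetical `TxSession` interface standing in for the transacted JMS session; `TransactedBatchSender` and `sendBatch` are also made-up names. It is not a diagnosis of the "The Session is closed" warning, which, per the stack traces above, is logged when the pooled session is closed after the callback has run.

```java
import java.util.List;

// Hypothetical sketch; TxSession is a stand-in, not a JMS or Spring type.
final class TransactedBatchSender {
    private TransactedBatchSender() {}

    /** Minimal stand-in for the operations used on a transacted session. */
    interface TxSession<M> {
        void send(M message) throws Exception;
        void commit() throws Exception;
        void rollback() throws Exception;
    }

    /**
     * Sends every message and commits. Returns true only if commit()
     * returned normally; any failure triggers a best-effort rollback
     * and yields false, so the caller can leave the records pending.
     */
    static <M> boolean sendBatch(TxSession<M> session, List<M> batch) {
        try {
            for (M message : batch) {
                session.send(message);
            }
            session.commit();
            return true;
        } catch (Exception sendOrCommitFailure) {
            try {
                session.rollback();
            } catch (Exception rollbackFailure) {
                // The session may already be closed; keep the original cause.
                sendOrCommitFailure.addSuppressed(rollbackFailure);
            }
            return false;
        }
    }
}
```

With this shape, the job would mark a batch as sent only when `sendBatch` returns true and would retry the batch otherwise.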

neffsvg commented 7 months ago

We had the same or a similar issue. We fixed it by setting "partitioning" to false and enabling "dead lettering on message expiration" for all queues.

Andrexams commented 7 months ago

Hello @neffsvg,

Thanks for the answer, but we are already using the same configuration here.

Netyyyy commented 7 months ago

Hi @Andrexams , thanks for reaching out. We have received your submission and will take it into consideration. We appreciate your input and will review this matter as soon as possible. Please feel free to provide any additional information or context that you think may be helpful. We'll keep you updated on the progress of our review. Thank you for your contribution to improving our project.

Netyyyy commented 1 month ago

Hi @Andrexams, the Service Bus service side recently received a fix for the premium tier. If you still see this error on the premium tier, please set the environment variable PN_TRACE_FRM=1 and share the resulting log output here. Thanks.