Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License
2.35k stars 1.98k forks source link

[BUG] JMSTemplate is invalid state after 10 minutes idle #31966

Open stliu opened 1 year ago

stliu commented 1 year ago

Describe the bug When using JMSTemplate to send message to Service Bus, it throws exception if the connection is idle more than 10 minutes

another issue. is the performance, it takes 5 seconds to send the first message, see the log below for more detials

Exception or Stack Trace

2022-11-06 11:12:05.762  INFO 25944 --- [   scheduling-1] c.e.demo.ServiceBusJMSMessageProducer    : Sending message: 2022-11-06T11:12:05.762072 message 1
2022-11-06 11:12:05.772 ERROR 25944 --- [   scheduling-1] o.s.s.s.TaskUtils$LoggingErrorHandler    : Unexpected error occurred in scheduled task

org.springframework.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.; nested exception is javax.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.
    at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:274) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:185) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:507) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:584) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:661) ~[spring-jms-5.3.23.jar:5.3.23]
    at com.example.demo.ServiceBusJMSMessageProducer.publishMessage(ServiceBusJMSMessageProducer.java:41) ~[classes/:na]
    at com.example.demo.ServiceBusJMSMessageProducer.publishMessage(ServiceBusJMSMessageProducer.java:32) ~[classes/:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[spring-context-5.3.23.jar:5.3.23]
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-5.3.23.jar:5.3.23]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]
Caused by: javax.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.
    at org.apache.qpid.jms.JmsMessageProducer.checkClosed(JmsMessageProducer.java:324) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.JmsMessageProducer.getDeliveryMode(JmsMessageProducer.java:135) ~[qpid-jms-client-0.53.0.jar:na]
    at org.springframework.jms.connection.CachedMessageProducer.<init>(CachedMessageProducer.java:60) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.getCachedProducer(CachingConnectionFactory.java:434) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.invoke(CachingConnectionFactory.java:363) ~[spring-jms-5.3.23.jar:5.3.23]
    at jdk.proxy2/jdk.proxy2.$Proxy66.createProducer(Unknown Source) ~[na:na]
    at org.springframework.jms.core.JmsTemplate.doCreateProducer(JmsTemplate.java:1115) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.createProducer(JmsTemplate.java:1096) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:602) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.lambda$send$3(JmsTemplate.java:586) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:504) ~[spring-jms-5.3.23.jar:5.3.23]
    ... 17 common frames omitted
Caused by: org.apache.qpid.jms.provider.ProviderException: The link 'G0:36906660:qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4' is force detached. Code: publisher(link376). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. [condition = amqp:link:detach-forced]
    at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToNonFatalException(AmqpSupport.java:181) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.processRemoteClose(AmqpAbstractResource.java:269) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:985) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.onData(AmqpProvider.java:871) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:563) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:556) ~[qpid-jms-client-0.53.0.jar:na]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1373) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:519) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:458) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    ... 1 common frames omitted

To Reproduce

Add the code snippet that causes the issue.

it can be reproducecd by this code

logger.info("sending the 1st message");
template.convertAndSend(destination, "message 1");
Duration duration = Duration.ofMinutes(10);
logger.info("waiting for {} seconds", duration.getSeconds());
TimeUnit.SECONDS.sleep(duration.getSeconds());
logger.info("sending the 2nd message");
template.convertAndSend(destination, "message 2");

I'm using

<dependency>
  <groupId>com.azure.spring</groupId>
  <artifactId>spring-cloud-azure-starter-servicebus-jms</artifactId>
  <version>4.4.1</version>
</dependency>

Expected behavior

There are two problems with this issue:

the underlying connection is not kept alive

with tracing log enabled, we can see, there is a IdleTimeoutCheck is running every 60 seconds to try to keep the connection alive by sending an Empty Frame, but seems this doesn't keep the connection alive, so how to keep the connection alive? is it related to https://github.com/Azure/azure-sdk-for-python/pull/10209?

2022-11-06 11:03:04.757 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:03:04.758 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:03:48.137 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:04:04.760 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:04:04.761 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000

org.springframework.jms.connection.SingleConnectionFactory#reconnectOnException is not honoured.

Spring Cloud Azure uses CachingConnectionFactory by default, according to its javadoc

This ConnectionFactory also switches the "reconnectOnException" property to "true" by default, allowing for automatic recovery of the underlying Connection.

From the current behavior of this exception, it doesn't auto recovery from the underlying connection.

I have the full log attached below

2022-11-06 10:57:01.630  INFO 25944 --- [   scheduling-1] c.e.demo.ServiceBusJMSMessageProducer    : Sending message: 2022-11-06T10:57:01.630090 message 0
2022-11-06 10:57:03.609 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: SASL
2022-11-06 10:57:03.875 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: SASL
2022-11-06 10:57:03.875 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: CONNECTION_INIT
2022-11-06 10:57:03.877 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: SaslMechanisms{saslServerMechanisms=[MSSBCBS, PLAIN, ANONYMOUS, EXTERNAL]}
2022-11-06 10:57:03.896 DEBUG 25944 --- [windows.net:-1]] o.a.qpid.jms.sasl.SaslMechanismFinder    : Best match for SASL auth was: SASL-PLAIN
2022-11-06 10:57:03.897 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: SaslInit{mechanism=PLAIN, initialResponse=\x00RootManageSharedAccessKey\x00+GtCmBe/x4v6I5/mxEHKv6JxVrRu0PrKo3UjN1csjYg=, hostname='shaozliu.servicebus.windows.net'}
2022-11-06 10:57:04.165 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: SaslOutcome{_code=OK, _additionalData=Welcome!}
2022-11-06 10:57:04.166 DEBUG 25944 --- [windows.net:-1]] o.a.qpid.proton.engine.impl.SaslImpl     : Handled outcome: SaslImpl [_outcome=PN_SASL_OK, state=PN_SASL_PASS, done=true, role=CLIENT]
2022-11-06 10:57:04.167 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: AMQP
2022-11-06 10:57:04.190 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Open{ containerId='azure:2657a071-8d5d-427a-86f9-d6e76ffa4a5e:1', hostname='shaozliu.servicebus.windows.net', maxFrameSize=1048576, channelMax=32767, idleTimeOut=50000, outgoingLocales=null, incomingLocales=null, offeredCapabilities=null, desiredCapabilities=[sole-connection-for-container, DELAYED_DELIVERY, ANONYMOUS-RELAY, SHARED-SUBS], properties={com.microsoft:is-client-provider=true, product=QpidJMS, version=0.53.0, platform=JVM: 17.0.4.1, 17.0.4.1+1-LTS, Microsoft, OS: Mac OS X, 12.6.1, x86_64}}
2022-11-06 10:57:04.434 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: AMQP
2022-11-06 10:57:04.434 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: CONNECTION_LOCAL_OPEN
2022-11-06 10:57:04.459 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Open{ containerId='07bc72c10fde44afb67c5f833eeb5e41_G0', hostname='null', maxFrameSize=65536, channelMax=4999, idleTimeOut=120000, outgoingLocales=null, incomingLocales=null, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.459 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: CONNECTION_REMOTE_OPEN
2022-11-06 10:57:04.465 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_INIT
2022-11-06 10:57:04.465 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_LOCAL_OPEN
2022-11-06 10:57:04.468 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Begin{remoteChannel=null, nextOutgoingId=1, incomingWindow=2047, outgoingWindow=2147483647, handleMax=65535, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.735 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Begin{remoteChannel=0, nextOutgoingId=1, incomingWindow=5000, outgoingWindow=2047, handleMax=255, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.736 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_REMOTE_OPEN
2022-11-06 10:57:04.736 DEBUG 25944 --- [windows.net:-1]] o.a.q.j.p.a.b.AmqpConnectionBuilder      : AmqpConnection { azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1 } is now open: 
2022-11-06 10:57:04.736 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck being initiated, initial delay: 60000
2022-11-06 10:57:04.737  INFO 25944 --- [windows.net:-1]] org.apache.qpid.jms.JmsConnection        : Connection azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1 connected to server: amqps://shaozliu.servicebus.windows.net/
2022-11-06 10:57:04.737  INFO 25944 --- [windows.net:-1]] c.e.d.j.p.s.AzureServiceBusProvider      : Connection established amqps://shaozliu.servicebus.windows.net/?amqp.idleTimeout=100000&amqp.traceFrames=true
2022-11-06 10:57:04.751 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] SENT: Begin{remoteChannel=null, nextOutgoingId=1, incomingWindow=2047, outgoingWindow=2147483647, handleMax=65535, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:05.064 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: Begin{remoteChannel=1, nextOutgoingId=1, incomingWindow=5000, outgoingWindow=2047, handleMax=255, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:05.064 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_INIT
2022-11-06 10:57:05.064 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_LOCAL_OPEN
2022-11-06 10:57:05.064 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_REMOTE_OPEN
2022-11-06 10:57:05.077 DEBUG 25944 --- [windows.net:-1]] o.a.q.j.p.a.b.AmqpProducerBuilder        : Creating AmqpFixedProducer for: t4
2022-11-06 10:57:05.085 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] SENT: Attach{name='qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4', handle=0, role=SENDER, sndSettleMode=UNSETTLED, rcvSettleMode=FIRST, source=Source{address='azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=[amqp:accepted:list, amqp:rejected:list, amqp:released:list, amqp:modified:list], capabilities=null}, target=Target{address='t4', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=[queue]}, unsettled=null, incompleteUnsettled=false, initialDeliveryCount=0, maxMessageSize=null, offeredCapabilities=null, desiredCapabilities=[DELAYED_DELIVERY], properties=null}
2022-11-06 10:57:05.377 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: Attach{name='qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4', handle=0, role=RECEIVER, sndSettleMode=UNSETTLED, rcvSettleMode=FIRST, source=Source{address='azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=[amqp:accepted:list, amqp:rejected:list, amqp:released:list, amqp:modified:list], capabilities=null}, target=Target{address='t4', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=[queue]}, unsettled=null, incompleteUnsettled=false, initialDeliveryCount=null, maxMessageSize=1048576, offeredCapabilities=[DELAYED_DELIVERY], desiredCapabilities=null, properties=null}
2022-11-06 10:57:05.377 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_INIT
2022-11-06 10:57:05.377 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_LOCAL_OPEN
2022-11-06 10:57:05.377 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_REMOTE_OPEN
2022-11-06 10:57:05.380 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=0, deliveryCount=0, linkCredit=1000, available=0, drain=false, echo=false, properties=null}
2022-11-06 10:57:05.380 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_FLOW
2022-11-06 10:57:05.451 TRACE 25944 --- [windows.net:-1]] o.a.q.j.provider.amqp.AmqpFixedProducer  : Producer sending message: JmsOutboundMessageDispatch {dispatchId = azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1-1, MessageID = ID:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1-1 }
2022-11-06 10:57:05.456 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] SENT: Transfer{handle=0, deliveryId=0, deliveryTag=\x00, messageFormat=0, settled=false, more=false, rcvSettleMode=null, state=null, resume=false, aborted=false, batchable=false} (228) "\x00Sp\xc0\x02\x01A\x00Sr\xc1)\x04\xa3\x0ex-opt-jms-destQ\x00\xa3\x12x-opt-jms-msg-typeQ\x05\x00Ss\xd0\x00\x00\x00O\x00\x00\x00\x0a\xa15ID:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1-1@\xa1\x02t4@@@@@@\x83\x00\x00\x01\x84J\xde\xc9\xa2\x00St\xc1\x1f\x02\xa1\x05_type\xa1\x15com.example.demo.User\x00Sw\xa1/{"name":"2022-11-06T10:57:01.630090 message 0"}"
2022-11-06 10:57:05.788 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: Disposition{role=RECEIVER, first=0, last=null, settled=true, state=Accepted{}, batchable=false}
2022-11-06 10:57:05.789 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: DELIVERY
2022-11-06 10:57:05.789 TRACE 25944 --- [windows.net:-1]] o.a.q.j.provider.amqp.AmqpFixedProducer  : Outcome of delivery was accepted: DeliveryImpl [_tag=[0], _link=org.apache.qpid.proton.engine.impl.SenderImpl@5d94aa39, _deliveryState=null, _settled=false, _remoteSettled=true, _remoteDeliveryState=Accepted{}, _flags=0, _defaultDeliveryState=null, _transportDelivery=org.apache.qpid.proton.engine.impl.TransportDelivery@7bac1751, _data Size=0, _complete=true, _updated=true, _done=true]

first message delivered! why it take 4 seconds to send a single message??? can we do something to speed up this? warmup during bootstrap process?

2022-11-06 10:57:49.712 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 10:58:04.739 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 10:58:33.477 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 10:59:04.743 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 10:59:04.743 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 10:59:17.292 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:00:01.093 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:00:04.454 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=null, deliveryCount=null, linkCredit=null, available=null, drain=false, echo=true, properties=null}
2022-11-06 11:00:04.746 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:00:04.747 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:00:48.300 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:01:04.749 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:01:04.750 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:01:32.065 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:02:04.753 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:02:04.754 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:02:15.818 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:02:59.682 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:03:04.383 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=null, deliveryCount=null, linkCredit=null, available=null, drain=false, echo=true, properties=null}
2022-11-06 11:03:04.757 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:03:04.758 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:03:48.137 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:04:04.760 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:04:04.761 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:04:31.980 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:05:04.764 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:05:04.765 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:05:15.796 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:05:59.580 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame

--- seems Server send this message every 3 minutes

2022-11-06 11:06:04.333 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=null, deliveryCount=null, linkCredit=null, available=null, drain=false, echo=true, properties=null}
2022-11-06 11:06:04.738 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame

---> looks like this IdleTimeoutCheck doesn't work, maybe Azuer Service bus doesn't expect an empty frame?

2022-11-06 11:06:04.740 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:06:48.157 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:07:04.728 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:07:04.729 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:07:05.760 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: Detach{handle=0, closed=true, error=Error{condition=amqp:link:detach-forced, description='The link 'G0:36906660:qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4' is force detached. Code: publisher(link376). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00.', info=null}}
2022-11-06 11:07:05.760 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_REMOTE_CLOSE
2022-11-06 11:07:05.769 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] SENT: Detach{handle=0, closed=true, error=null}
2022-11-06 11:07:05.771  INFO 25944 --- [windows.net:-1]] c.e.d.j.p.s.AzureServiceBusProvider      : Resource closed JmsProducerInfo { azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1, destination = t4 }
2022-11-06 11:07:05.772 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_LOCAL_CLOSE
2022-11-06 11:07:05.772  INFO 25944 --- [-cf1292d6e1f5:1] org.apache.qpid.jms.JmsSession           : A JMS MessageProducer has been closed: JmsProducerInfo { azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1, destination = t4 }
2022-11-06 11:07:05.772 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: LINK_FINAL

--> Now in LINK_FINAL state, takes 10 minutes, is it a hard limit from Service Bus? is there a doc saying that? https://github.com/Azure/azure-sdk-for-python/issues/10127 https://github.com/Azure/azure-sdk-for-python/pull/10209

2022-11-06 11:07:49.511 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:08:04.730 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:08:33.292 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:09:04.233 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=null, deliveryCount=null, linkCredit=null, available=null, drain=false, echo=true, properties=null}
2022-11-06 11:09:04.732 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:09:04.733 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:09:47.983 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:10:04.735 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:10:04.736 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:10:31.847 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:11:04.737 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:11:04.738 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000
2022-11-06 11:11:15.779 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:11:59.410 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Empty Frame
2022-11-06 11:12:04.149 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Flow{nextIncomingId=1, incomingWindow=5000, nextOutgoingId=1, outgoingWindow=2047, handle=null, deliveryCount=null, linkCredit=null, available=null, drain=false, echo=true, properties=null}
2022-11-06 11:12:04.739 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Empty Frame
2022-11-06 11:12:04.740 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : IdleTimeoutCheck rescheduling with delay: 60000

--> About to send a new message, the first one was sent in 2022-11-06 10:57:01.630, and we have a 15 minutes scheduler

    @Scheduled(initialDelay = 0, fixedDelay = 15, timeUnit = TimeUnit.MINUTES)

//    @Scheduled(initialDelay = 5, fixedDelay = 1, timeUnit = TimeUnit.SECONDS)
    public void publishMessage() {
        String message = String.format("%s %s %s", LocalDateTime.now(), "message", counter.getAndIncrement());
        publishMessage(message);
    }
2022-11-06 11:12:05.762  INFO 25944 --- [   scheduling-1] c.e.demo.ServiceBusJMSMessageProducer    : Sending message: 2022-11-06T11:12:05.762072 message 1
2022-11-06 11:12:05.772 ERROR 25944 --- [   scheduling-1] o.s.s.s.TaskUtils$LoggingErrorHandler    : Unexpected error occurred in scheduled task

org.springframework.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.; nested exception is javax.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.
    at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:274) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:185) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:507) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:584) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:661) ~[spring-jms-5.3.23.jar:5.3.23]
    at com.example.demo.ServiceBusJMSMessageProducer.publishMessage(ServiceBusJMSMessageProducer.java:41) ~[classes/:na]
    at com.example.demo.ServiceBusJMSMessageProducer.publishMessage(ServiceBusJMSMessageProducer.java:32) ~[classes/:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[spring-context-5.3.23.jar:5.3.23]
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-5.3.23.jar:5.3.23]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]
Caused by: javax.jms.IllegalStateException: The MessageProducer was closed due to an unrecoverable error.
    at org.apache.qpid.jms.JmsMessageProducer.checkClosed(JmsMessageProducer.java:324) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.JmsMessageProducer.getDeliveryMode(JmsMessageProducer.java:135) ~[qpid-jms-client-0.53.0.jar:na]
    at org.springframework.jms.connection.CachedMessageProducer.<init>(CachedMessageProducer.java:60) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.getCachedProducer(CachingConnectionFactory.java:434) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.invoke(CachingConnectionFactory.java:363) ~[spring-jms-5.3.23.jar:5.3.23]
    at jdk.proxy2/jdk.proxy2.$Proxy66.createProducer(Unknown Source) ~[na:na]
    at org.springframework.jms.core.JmsTemplate.doCreateProducer(JmsTemplate.java:1115) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.createProducer(JmsTemplate.java:1096) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:602) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.lambda$send$3(JmsTemplate.java:586) ~[spring-jms-5.3.23.jar:5.3.23]
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:504) ~[spring-jms-5.3.23.jar:5.3.23]
    ... 17 common frames omitted
Caused by: org.apache.qpid.jms.provider.ProviderException: The link 'G0:36906660:qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4' is force detached. Code: publisher(link376). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. [condition = amqp:link:detach-forced]
    at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToNonFatalException(AmqpSupport.java:181) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.processRemoteClose(AmqpAbstractResource.java:269) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:985) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.onData(AmqpProvider.java:871) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:563) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:556) ~[qpid-jms-client-0.53.0.jar:na]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1373) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:519) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:458) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    ... 1 common frames omitted

2022-11-06 11:12:06.021 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: End{error=null}
2022-11-06 11:12:06.022 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_REMOTE_CLOSE
2022-11-06 11:12:06.022 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: End{error=null}
2022-11-06 11:12:06.023  INFO 25944 --- [windows.net:-1]] c.e.d.j.p.s.AzureServiceBusProvider      : Resource closed JmsSessionInfo { azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:-1 }
2022-11-06 11:12:06.023 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_LOCAL_CLOSE
2022-11-06 11:12:06.023 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_FINAL
2022-11-06 11:12:06.077 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] RECV: End{error=null}
2022-11-06 11:12:06.077 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] RECV: Close{error=Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:07bc72c10fde44afb67c5f833eeb5e41_G0, SystemTracker:gateway5, Timestamp:2022-11-06T03:12:06', info=null}}
2022-11-06 11:12:06.078 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_REMOTE_CLOSE
2022-11-06 11:12:06.078 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:1] SENT: End{error=null}
2022-11-06 11:12:06.079  INFO 25944 --- [windows.net:-1]] c.e.d.j.p.s.AzureServiceBusProvider      : Resource closed JmsSessionInfo { azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1 }
2022-11-06 11:12:06.079 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: CONNECTION_REMOTE_CLOSE
2022-11-06 11:12:06.081 TRACE 25944 --- [windows.net:-1]] o.apache.qpid.jms.provider.amqp.FRAMES   : [492364852:0] SENT: Close{error=null}
2022-11-06 11:12:06.085  INFO 25944 --- [windows.net:-1]] c.e.d.j.p.s.AzureServiceBusProvider      : Connection failure The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:07bc72c10fde44afb67c5f833eeb5e41_G0, SystemTracker:gateway5, Timestamp:2022-11-06T03:12:06 [condition = amqp:connection:forced]
2022-11-06 11:12:06.085 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: TRANSPORT_TAIL_CLOSED
2022-11-06 11:12:06.085 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_LOCAL_CLOSE
2022-11-06 11:12:06.085 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: SESSION_FINAL
2022-11-06 11:12:06.085 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: CONNECTION_LOCAL_CLOSE
2022-11-06 11:12:06.085 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: TRANSPORT_HEAD_CLOSED
2022-11-06 11:12:06.086 TRACE 25944 --- [windows.net:-1]] o.a.qpid.jms.provider.amqp.AmqpProvider  : New Proton Event: TRANSPORT_CLOSED
2022-11-06 11:12:06.085  INFO 25944 --- [-cf1292d6e1f5:1] o.s.j.c.CachingConnectionFactory         : Encountered a JMSException - resetting the underlying JMS Connection

org.apache.qpid.jms.JmsConnectionRemotelyClosedException: The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:07bc72c10fde44afb67c5f833eeb5e41_G0, SystemTracker:gateway5, Timestamp:2022-11-06T03:12:06 [condition = amqp:connection:forced]
    at org.apache.qpid.jms.provider.exceptions.ProviderConnectionRemotelyClosedException.toJMSException(ProviderConnectionRemotelyClosedException.java:37) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.exceptions.JmsExceptionSupport.create(JmsExceptionSupport.java:80) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.exceptions.JmsExceptionSupport.create(JmsExceptionSupport.java:112) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.JmsConnection.onAsyncException(JmsConnection.java:1546) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.JmsConnection.onProviderException(JmsConnection.java:1530) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.JmsConnection.onConnectionFailure(JmsConnection.java:1374) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.ProviderWrapper.onConnectionFailure(ProviderWrapper.java:207) ~[qpid-jms-client-0.53.0.jar:na]
    at com.example.demo.jms.provider.servicebus.AzureServiceBusProvider.onConnectionFailure(AzureServiceBusProvider.java:68) ~[classes/:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.fireProviderException(AmqpProvider.java:1150) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.closeResource(AmqpAbstractResource.java:185) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpConnection.processRemoteClose(AmqpConnection.java:151) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:961) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpProvider.onData(AmqpProvider.java:871) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:563) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:556) ~[qpid-jms-client-0.53.0.jar:na]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1373) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285) ~[netty-handler-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:519) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:458) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280) ~[netty-codec-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[netty-transport-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.84.Final.jar:4.1.84.Final]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]
Caused by: org.apache.qpid.jms.provider.exceptions.ProviderConnectionRemotelyClosedException: The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:07bc72c10fde44afb67c5f833eeb5e41_G0, SystemTracker:gateway5, Timestamp:2022-11-06T03:12:06 [condition = amqp:connection:forced]
    at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToConnectionClosedException(AmqpSupport.java:128) ~[qpid-jms-client-0.53.0.jar:na]
    at org.apache.qpid.jms.provider.amqp.AmqpConnection.processRemoteClose(AmqpConnection.java:148) ~[qpid-jms-client-0.53.0.jar:na]
    ... 29 common frames omitted

2022-11-06 11:12:06.088 DEBUG 25944 --- [-cf1292d6e1f5:1] o.apache.qpid.jms.util.ThreadPoolUtils   : Shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@1f13b57f[Shutting down, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 4] is shutdown: true and terminated: false took: 0.000 seconds.
stliu commented 1 year ago

Accoring to this doc https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot#connection-is-closed

an AMQP Link will be closed after 10 minutes idle

You see the following error when the AMQP connection and link are active but no calls (for example, send or receive) are made using the link for 10 minutes. So, the link is closed. The connection is still open.

amqp:link:detach-forced:The link 'G2:7223832:user.tenant0.cud_00000000000-0000-0000-0000-00000000000000' is force detached by the broker due to errors occurred in publisher(link164614). Detach origin: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. TrackingId:00000000000000000000000000000000000000_G2_B3, SystemTracker:mynamespace:Topic:MyTopic, Timestamp:2/16/2018 11:10:40 PM

an AMQP Connection will be closed after all links are closed

You see the following error on the AMQP connection when all links in the connection have been closed because there was no activity (idle) and a new link has not been created in 5 minutes.

Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:00000000000000000000000000000000000_G21, SystemTracker:gateway5, Timestamp:2019-03-06T17:32:00', info=null}

And the last one

You see this error when a new AMQP connection is created but a link is not created within 1 minute of the creation of the AMQP Connection.

Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:0000000000000000000000000000000000000_G21, SystemTracker:gateway5, Timestamp:2019-03-06T18:41:51', info=null}

To summary

  1. An AMQP Connection is a socket connection ( TCP or WebSocket ) between an application and Azure Service Bus
  2. An AMQP connection can have 0-n links
  3. A link w/o any activity of more than 10 minutes will be closed by Azure Service Bus
  4. A Connection w/o an active Link for more than 5 minutes ( or 1 minute for a newly created connection ) will be closed by Azure Service Bus

So, for a long-running application which is very busy at daytime ( 1000 message per second ) and very low traffic at night ( 1 message per hour ), what is the strategy here?

first and foremost, whenever that message comes, template.send(message) must success and the message is delivered to the Azure Service Bus.

so, it leaves us with two options:

  1. we keep the link and/or connection open and active, probably by sending some heartbeat message to Service Bus in a 10 minutes at most period
  2. we accept the server's choice and let it close the idle link and connection, but open a new one to deliver the message when it comes, but in this case, when the server close the link / connection, it should NOT be an error log to the end user, since by selecting this strategy, a closed connection is not an error anymore.
stliu commented 1 year ago

AMQP protocol tracing analysis

  1. establish connection
2022-11-06 10:57:03.609 SENT: SASL
2022-11-06 10:57:03.875 RECV: SASL
2022-11-06 10:57:03.877 RECV: SaslMechanisms{saslServerMechanisms=[MSSBCBS, PLAIN, ANONYMOUS, EXTERNAL]}
2022-11-06 10:57:03.897 SENT: SaslInit{mechanism=PLAIN, initialResponse=\x00RootManageSharedAccessKey\x00+GtCmBe/x4v6I5/mxEHKv6JxVrRu0PrKo3UjN1csjYg=, hostname='shaozliu.servicebus.windows.net'}
2022-11-06 10:57:04.165 RECV: SaslOutcome{_code=OK, _additionalData=Welcome!}
2022-11-06 10:57:04.167 SENT: AMQP
2022-11-06 10:57:04.190 SENT: Open{ containerId='azure:2657a071-8d5d-427a-86f9-d6e76ffa4a5e:1', hostname='shaozliu.servicebus.windows.net', maxFrameSize=1048576, channelMax=32767, idleTimeOut=50000, outgoingLocales=null, incomingLocales=null, offeredCapabilities=null, desiredCapabilities=[sole-connection-for-container, DELAYED_DELIVERY, ANONYMOUS-RELAY, SHARED-SUBS], properties={com.microsoft:is-client-provider=true, product=QpidJMS, version=0.53.0, platform=JVM: 17.0.4.1, 17.0.4.1+1-LTS, Microsoft, OS: Mac OS X, 12.6.1, x86_64}}
2022-11-06 10:57:04.434 RECV: AMQP
2022-11-06 10:57:04.459 RECV: Open{ containerId='07bc72c10fde44afb67c5f833eeb5e41_G0', hostname='null', maxFrameSize=65536, channelMax=4999, idleTimeOut=120000, outgoingLocales=null, incomingLocales=null, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.468 SENT: Begin{remoteChannel=null, nextOutgoingId=1, incomingWindow=2047, outgoingWindow=2147483647, handleMax=65535, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.735 RECV: Begin{remoteChannel=0, nextOutgoingId=1, incomingWindow=5000, outgoingWindow=2047, handleMax=255, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:04.751 [492364852:1] SENT: Begin{remoteChannel=null, nextOutgoingId=1, incomingWindow=2047, outgoingWindow=2147483647, handleMax=65535, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:05.064 [492364852:1] RECV: Begin{remoteChannel=1, nextOutgoingId=1, incomingWindow=5000, outgoingWindow=2047, handleMax=255, offeredCapabilities=null, desiredCapabilities=null, properties=null}
2022-11-06 10:57:05.085 [492364852:1] SENT: Attach{name='qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4', handle=0, role=SENDER, sndSettleMode=UNSETTLED, rcvSettleMode=FIRST, source=Source{address='azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=[amqp:accepted:list, amqp:rejected:list, amqp:released:list, amqp:modified:list], capabilities=null}, target=Target{address='t4', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=[queue]}, unsettled=null, incompleteUnsettled=false, initialDeliveryCount=0, maxMessageSize=null, offeredCapabilities=null, desiredCapabilities=[DELAYED_DELIVERY], properties=null}
2022-11-06 10:57:05.377 [492364852:1] RECV: Attach{name='qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4', handle=0, role=RECEIVER, sndSettleMode=UNSETTLED, rcvSettleMode=FIRST, source=Source{address='azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=[amqp:accepted:list, amqp:rejected:list, amqp:released:list, amqp:modified:list], capabilities=null}, target=Target{address='t4', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=[queue]}, unsettled=null, incompleteUnsettled=false, initialDeliveryCount=null, maxMessageSize=1048576, offeredCapabilities=[DELAYED_DELIVERY], desiredCapabilities=null, properties=null}
  1. start to send message (transfer)
2022-11-06 10:57:05.456 [492364852:1] SENT: Transfer{handle=0, deliveryId=0, deliveryTag=\x00, messageFormat=0, settled=false, more=false, rcvSettleMode=null, state=null, resume=false, aborted=false, batchable=false} (228) "\x00Sp\xc0\x02\x01A\x00Sr\xc1)\x04\xa3\x0ex-opt-jms-destQ\x00\xa3\x12x-opt-jms-msg-typeQ\x05\x00Ss\xd0\x00\x00\x00O\x00\x00\x00\x0a\xa15ID:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1-1@\xa1\x02t4@@@@@@\x83\x00\x00\x01\x84J\xde\xc9\xa2\x00St\xc1\x1f\x02\xa1\x05_type\xa1\x15com.example.demo.User\x00Sw\xa1/{"name":"2022-11-06T10:57:01.630090 message 0"}"
2022-11-06 10:57:05.788 [492364852:1] RECV: Disposition{role=RECEIVER, first=0, last=null, settled=true, state=Accepted{}, batchable=false}
  1. about 10 minutes idle since the first message sending, now we get the detach command from server to close the link, according to this https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot#link-is-closed

    2022-11-06 11:07:05.760 [492364852:1] RECV: Detach{handle=0, closed=true, error=Error{condition=amqp:link:detach-forced, description='The link 'G0:36906660:qpid-jms:sender:azure:5caf3ef4-9602-413c-964d-cf1292d6e1f5:1:1:1:t4' is force detached. Code: publisher(link376). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00.', info=null}}
    2022-11-06 11:07:05.769 [492364852:1] SENT: Detach{handle=0, closed=true, error=null}
  2. since we only had 1 link and it got closed 5 minutes ago, so we get another command to close the connection according to this https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot#connection-is-closed

2022-11-06 11:12:06.077 [492364852:1] RECV: End{error=null}
2022-11-06 11:12:06.077 RECV: Close{error=Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 300000 milliseconds and is closed by container 'LinkTracker'. TrackingId:07bc72c10fde44afb67c5f833eeb5e41_G0, SystemTracker:gateway5, Timestamp:2022-11-06T03:12:06', info=null}}
2022-11-06 11:12:06.078 [492364852:1] SENT: End{error=null}
2022-11-06 11:12:06.081 SENT: Close{error=null}

Full AMQP tracing can be found here 8e31f0b3a82e20f391312a622083d9d5

yiliuTo commented 1 year ago

After some internal discussion, @shankarsama would consider to adding an AMQP property for making the link timeout configurable, which we can leverage to fix this issue. And @shankarsama , would you help to update here when that configuration is released in the service side?

zoladkow commented 1 year ago

But @yiliuTo, your proposal will NOT FIX the issue.

In https://github.com/microsoft/azure-spring-boot/issues/817#issuecomment-1306509890 you've clearly stated that here you're gonna provide a fix, not a workaround.

While the option to configure EFFECTIVE timeout (which ServiceBus will respect for a change), is a nice feature, prolonging or disabling timeout WILL NOT FIX the underlying issue that JmsTemplate provided by your starter cannot recover from obviously recoverable exceptions.

Suggestion: leave the timeouts be, just make the JmsTemplate able to recover internally when a known situation like this happens. I will repeat myself, but your other library azure-servicebus has no problem recovering from timeouts - client instance created once, just works when needed. This is really all that everyone expects and wants... OK, maybe 5s on initial send is also a bit too much

yiliuTo commented 1 year ago

Suggestion: leave the timeouts be, just make the JmsTemplate able to recover internally when a known situation like this happens.

@zoladkow , thanks for your suggestion, we will consider about it.

nescadajci commented 1 year ago

any update on this issue?

Netyyyy commented 1 year ago

any update on this issue?

No updates at the moment. Rest assured, we are actively try to resolve it and will keep you informed if there are any developments.

neffsvg commented 12 months ago

@Netyyyy @saragluna can you update this issue? How is it going? The only way to get around it at the moment is the "connection pool" workaround, right? Why is this such a big issue?

Netyyyy commented 12 months ago

@Netyyyy @saragluna can you update this issue? How is it going? The only way to get around it at the moment is the "connection pool" workaround, right? Why is this such a big issue?

Hi @neffsvg , we are still working on this issue. @vinaysurya please help take a look.

neffsvg commented 11 months ago

@Netyyyy @vinaysurya we are having a go live deadline in a view weeks. And QA needs to check everything first. So high pressure on us.

Is there a pr or anything we can track the progress? Given that the last updates were May 26 and 2 days ago. Or is there something we can potentially help with? I still do not understand why this would take so long, maybe I am missing something.

@zoladkow are you still using this? Maybe we can join forces with them to solve this.

zoladkow commented 11 months ago

@zoladkow are you still using this? Maybe we can join forces with them to solve this.

@neffsvg Not anymore, we spotted the issue very early (and long ago) and decided instead go with plain azure-servicebus library. Granted, it did not provide all the @JmsListener or JmsTemplate Spring stuff, but that's absolutely not a problem. The most important thing was, that Queue/TopicClient would not go into invalid state after idle disconnect, just handle this internally like common sense would dictate.

Oh, and this issue is by far not new. You can check the previous one here: https://github.com/microsoft/azure-spring-boot/issues/817

Given that only real solution would require to redo this starter on top of azure-servicebus (or whatever they changed the name too - yeah, naming convention changes, top priority, eh...), which would require to actually provide adapters to work with Spring JMS, i can see why they are so reluctant - the gain seems awfully too small to justify such an effort. Volunteer effort especially. Also, most likely for serverless scenarios this is never an issue, since the message is sent and function ends (unless they would "optimize" by keeping the function instance long enough for the timeout to become a problem there too...)

Saljack commented 11 months ago

@neffsvg I would recommend you hands off from Azure Service Bus and mainly with the JMS combination. It is really nightmare.

We have other issues with the azure-servicebus library which are not resolved for long time. For example #33688

vinaysurya commented 9 months ago

This issue should be addressed for now. We have made a change on ServiceBus service side to not enforce the expiring of idle links aggressively for JMS customers that come through azure-servicebus-jms libraries (which is the dependent library used by spring-cloud-azure-starter-servicebus-jms, version 4.x ). This is only available for premium messaging namespaces. There are still active quota enforcements on the number of active links that can be had for a namespace at any given point in time.

The real long term fix will likely need a fix involving Qpid JMS library, where a producer object on client is not immediately considered close on receiving a link close. Instead when using the jmsproducer object, the qpid jms client library has to check if the underlying link is in closed state and if so re-create the underneath amqp link again.

vinaysurya commented 7 months ago

Please note that the fix on the ServiceBus service side is in the process of deployment. We expect the fix to be deployed in about 3 weeks from today across all ServiceBus clusters.

dmuley17 commented 7 months ago

I had the same issue and I fixed it by enabling pool in spring boot

spring:
  jms:
    servicebus:
      connection-string: Endpoint=sb://*******w=
      pricing-tier: standard
      pool:
        enabled: true

More information

https://learn.microsoft.com/en-us/azure/developer/java/spring-framework/spring-jms-troubleshooting-guide#the-messageproducer-was-closed-due-to-an-unrecoverable-error