Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License
2.36k stars 2k forks source link

[BUG] We sometimes (incorrectly) get "Error sending. Size of the payload exceeded maximum message size" error when scheduling a message #41605

Closed haraldk closed 2 months ago

haraldk commented 3 months ago

Describe the bug We sometimes (I believe incorrectly) get "Error sending. Size of the payload exceeded maximum message size: 256 kb" error when scheduling a message. The message is a copy of a received message, with a single added client property to control the number of times it is re-scheduled. The error message does not make sense to me, as the message was previously sent fine, and the full message body is at most a few hundred bytes (< 1 kb). It is nowhere near the 256 kb limit.

The scheduling (usually) also works after a few retries.

The cause of the exception, however, says "java.lang.IndexOutOfBoundsException: Requested min remaining bytes(155) exceeds remaining(137) in underlying ByteBuffer: java.nio.HeapByteBuffer[pos=636 lim=773 cap=773]". That makes more sense, but I don't know how/why the limit is set to 773 bytes, as that is outside the control of the client code.

Exception or Stack Trace

com.azure.messaging.servicebus.ServiceBusException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
    at com.azure.messaging.servicebus.MessagePump.notifyMessage(MessagePump.java:159)
    at com.azure.messaging.servicebus.MessagePump.lambda$handleMessage$3(MessagePump.java:142)
    at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.wrap(ServiceBusReceiverInstrumentation.java:169)
    at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.instrumentProcess(ServiceBusReceiverInstrumentation.java:104)
    at com.azure.messaging.servicebus.MessagePump.handleMessage(MessagePump.java:135)
    at com.azure.messaging.servicebus.MessagePump$RunOnWorker.lambda$apply$0(MessagePump.java:212)
    at reactor.core.publisher.MonoRunnable.call(MonoRunnable.java:73)
    at reactor.core.publisher.MonoRunnable.call(MonoRunnable.java:32)
    at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:228)
    at reactor.core.scheduler.ImmediateScheduler.schedule(ImmediateScheduler.java:52)
    at reactor.core.publisher.MonoSubscribeOnCallable.subscribe(MonoSubscribeOnCallable.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
    at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:430)
    at com.azure.messaging.servicebus.TracingFluxOperator.lambda$hookOnNext$0(TracingFluxOperator.java:67)
    at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.instrumentProcess(ServiceBusReceiverInstrumentation.java:97)
    at com.azure.messaging.servicebus.TracingFluxOperator$1.lambda$subscribe$0(TracingFluxOperator.java:39)
    at com.azure.messaging.servicebus.TracingFluxOperator.hookOnNext(TracingFluxOperator.java:66)
    at reactor.core.publisher.BaseSubscriber.onNext(BaseSubscriber.java:160)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
    at com.azure.core.amqp.implementation.MessageFlux$RecoverableReactorReceiver.drainLoop(MessageFlux.java:476)
    at com.azure.core.amqp.implementation.MessageFlux$RecoverableReactorReceiver.drain(MessageFlux.java:405)
    at com.azure.core.amqp.implementation.MessageFlux$ReactorReceiverMediator.onNext(MessageFlux.java:878)
    at com.azure.core.amqp.implementation.MessageFlux$ReactorReceiverMediator.onNext(MessageFlux.java:722)
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:446)
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:533)
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: com.azure.messaging.servicebus.ServiceBusException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
    at com.azure.messaging.servicebus.ServiceBusSenderAsyncClient.mapError(ServiceBusSenderAsyncClient.java:950)
    at reactor.core.publisher.Mono.lambda$onErrorMap$28(Mono.java:3854)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
    at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
    at reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber.onError(FluxConcatArray.java:186)
    at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:205)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:149)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
    at com.azure.core.amqp.implementation.AmqpChannelProcessor.subscribe(AmqpChannelProcessor.java:283)
    at reactor.core.publisher.Flux.subscribe(Flux.java:8840)
    at reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber.onComplete(FluxConcatArray.java:238)
    at reactor.core.publisher.FluxConcatArray.subscribe(FluxConcatArray.java:79)
    at reactor.core.publisher.MonoFromFluxOperator.subscribe(MonoFromFluxOperator.java:83)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:158)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
    at reactor.core.publisher.MonoCacheInvalidateIf.subscribeOrReturn(MonoCacheInvalidateIf.java:172)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:63)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.request(MonoFlatMap.java:194)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onSubscribe(MonoFlatMap.java:117)
    at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:53)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onNext(FluxPeekFuseable.java:210)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
    at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:237)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onComplete(MonoIgnoreThen.java:204)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onComplete(MonoPeekTerminal.java:299)
    at reactor.core.publisher.SerializedSubscriber.onComplete(SerializedSubscriber.java:146)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onComplete(FluxTimeout.java:235)
    at reactor.core.publisher.MonoNext$NextSubscriber.onComplete(MonoNext.java:102)
    at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:83)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onNext(FluxFilterFuseable.java:118)
    at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replayNormal(FluxReplay.java:877)
    at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replay(FluxReplay.java:965)
    at reactor.core.publisher.FluxReplay$ReplayInner.request(FluxReplay.java:1711)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.request(FluxFilterFuseable.java:191)
    at reactor.core.publisher.MonoNext$NextSubscriber.request(MonoNext.java:108)
    at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.request(Operators.java:2331)
    at reactor.core.publisher.SerializedSubscriber.request(SerializedSubscriber.java:151)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.request(MonoPeekTerminal.java:139)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onSubscribe(MonoIgnoreThen.java:135)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onSubscribe(MonoPeekTerminal.java:152)
    at reactor.core.publisher.SerializedSubscriber.onSubscribe(SerializedSubscriber.java:65)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onSubscribe(FluxTimeout.java:155)
    at reactor.core.publisher.MonoNext$NextSubscriber.onSubscribe(MonoNext.java:70)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onSubscribe(FluxFilterFuseable.java:87)
    at reactor.core.publisher.FluxReplay.subscribeOrReturn(FluxReplay.java:1181)
    at reactor.core.publisher.FluxReplay.subscribe(FluxReplay.java:1153)
    at reactor.core.publisher.FluxAutoConnectFuseable.subscribe(FluxAutoConnectFuseable.java:61)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:265)
    at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180)
    at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:237)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onComplete(MonoIgnoreThen.java:204)
    at reactor.core.publisher.SerializedSubscriber.onComplete(SerializedSubscriber.java:146)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onComplete(FluxTimeout.java:235)
    at reactor.core.publisher.MonoNext$NextSubscriber.onComplete(MonoNext.java:102)
    at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:83)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onNext(FluxFilterFuseable.java:118)
    at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replayNormal(FluxReplay.java:877)
    at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replay(FluxReplay.java:965)
    at reactor.core.publisher.FluxReplay$ReplayInner.request(FluxReplay.java:1711)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.request(FluxFilterFuseable.java:191)
    at reactor.core.publisher.MonoNext$NextSubscriber.request(MonoNext.java:108)
    at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.request(Operators.java:2331)
    at reactor.core.publisher.SerializedSubscriber.request(SerializedSubscriber.java:151)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onSubscribe(MonoIgnoreThen.java:135)
    at reactor.core.publisher.SerializedSubscriber.onSubscribe(SerializedSubscriber.java:65)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onSubscribe(FluxTimeout.java:155)
    at reactor.core.publisher.MonoNext$NextSubscriber.onSubscribe(MonoNext.java:70)
    at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onSubscribe(FluxFilterFuseable.java:87)
    at reactor.core.publisher.FluxReplay.subscribeOrReturn(FluxReplay.java:1181)
    at reactor.core.publisher.FluxReplay.subscribe(FluxReplay.java:1153)
    at reactor.core.publisher.FluxAutoConnectFuseable.subscribe(FluxAutoConnectFuseable.java:61)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:265)
    at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
    at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:205)
    at reactor.core.publisher.MonoFlatMap.subscribeOrReturn(MonoFlatMap.java:53)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:63)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
    at reactor.core.publisher.MonoCacheInvalidateIf.subscribeOrReturn(MonoCacheInvalidateIf.java:172)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4552)
    at reactor.core.publisher.Mono.block(Mono.java:1806)
    at com.azure.messaging.servicebus.ServiceBusSenderClient.scheduleMessage(ServiceBusSenderClient.java:362)
    at no.nrk.lydbanken.integrasjoner.common.infrastructure.servicebus.ServiceBusQueueConsumer.processMessage(ServiceBusQueueConsumer.java:115)
    at com.azure.messaging.servicebus.MessagePump.notifyMessage(MessagePump.java:157)
    ... 31 common frames omitted
    Suppressed: java.lang.Exception: #block terminated with an error
        at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:146)
        at reactor.core.publisher.Mono.block(Mono.java:1807)
        ... 34 common frames omitted
Caused by: com.azure.core.amqp.exception.AmqpException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
    at com.azure.messaging.servicebus.implementation.ManagementChannel.lambda$schedule$13(ManagementChannel.java:355)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:132)
    ... 145 common frames omitted
Caused by: java.nio.BufferOverflowException: null
    at org.apache.qpid.proton.codec.WritableBuffer$ByteBufferWrapper.ensureRemaining(WritableBuffer.java:152)
    at org.apache.qpid.proton.codec.BinaryType.fastWrite(BinaryType.java:75)
    at org.apache.qpid.proton.codec.EncoderImpl.writeBinary(EncoderImpl.java:535)
    at org.apache.qpid.proton.codec.messaging.FastPathDataType.write(FastPathDataType.java:126)
    at org.apache.qpid.proton.codec.messaging.FastPathDataType.write(FastPathDataType.java:36)
    at org.apache.qpid.proton.codec.EncoderImpl.writeObject(EncoderImpl.java:734)
    at org.apache.qpid.proton.message.impl.MessageImpl.encode(MessageImpl.java:740)
    at org.apache.qpid.proton.message.impl.MessageImpl.encode(MessageImpl.java:696)
    at com.azure.messaging.servicebus.implementation.ManagementChannel.lambda$schedule$13(ManagementChannel.java:349)
    ... 146 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: Requested min remaining bytes(155) exceeds remaining(137) in underlying ByteBuffer: java.nio.HeapByteBuffer[pos=636 lim=773 cap=773]
    at org.apache.qpid.proton.codec.WritableBuffer$ByteBufferWrapper.ensureRemaining(WritableBuffer.java:148)
    ... 154 common frames omitted

To Reproduce Steps to reproduce the behavior:

This happens seemingly at random, and only a very small percentage of the times we invoke com.azure.messaging.servicebus.ServiceBusSenderClient.scheduleMessage. I don't have a reproducer that can pinpoint the problem at this point, but the full method where the problem happens is attached below. I believe running the code below either by only using the exception path, or by replacing messageConsumer.consume(message) with something like throw new RuntimeException("Oh no") and running it enough times will show the problem.

As mentioned, the message is (usually) scheduled after a few retries.

Code Snippet Add the code snippet that causes the issue.

Our processMessage method (passed on to ServiceBusProcessorClientBuilder.processMessage()) looks like this:

    public void processMessage(ServiceBusReceivedMessageContext serviceBusReceivedMessageContext) {
        ServiceBusReceivedMessage message = serviceBusReceivedMessageContext.getMessage();

        try {
            messageConsumer.consume(message); // Client message processing. Fails with an exception in this case
            serviceBusReceivedMessageContext.complete();
        } catch (Exception exception) {
            int deliveryCount = Optional.ofNullable((Integer) message.getApplicationProperties().get(X_NRK_DELIVERY_COUNT))
                                        .orElse(1);

            if (deliveryCount < 20) {
                int delay = Math.min(1 << (deliveryCount - 1), 120); // Inlined calculation of exponential delay 

                ServiceBusMessage retryMessage = new ServiceBusMessage(message);
                retryMessage.getApplicationProperties().put(X_NRK_DELIVERY_COUNT, deliveryCount + 1);
                serviceBusSenderClient.scheduleMessage(retryMessage, OffsetDateTime.now().plusSeconds(delay)); // Where it fails

                serviceBusReceivedMessageContext.complete();
            } else {
                DeadLetterOptions deadLetterOptions = new DeadLetterOptions()
                                                              .setPropertiesToModify(Map.of(X_NRK_DELIVERY_COUNT, 0)) // reset delivery count to prepare for possibility to put message back to queue later
                                                              .setDeadLetterReason(exception.getClass().getName())
                                                              .setDeadLetterErrorDescription(exception.getMessage());

                serviceBusReceivedMessageContext.deadLetter(deadLetterOptions);
            }
        }
    }

    static final String X_NRK_DELIVERY_COUNT = "x-nrk-delivery-count";

Expected behavior

The message should be scheduled without exception.

Screenshots If applicable, add screenshots to help explain your problem.

Setup (please complete the following information):

Additional context Add any other context about the problem here.

Information Checklist Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

github-actions[bot] commented 3 months ago

@anuchandy @conniey @lmolkova

github-actions[bot] commented 3 months ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

anuchandy commented 3 months ago

Hi @haraldk, could you help with a reference message that ran into this error?

Basically, you would update code to print one of the impacted messages, which I can use to reproduce this. The code change look likes

try {
    serviceBusSenderClient.scheduleMessage(retryMessage, OffsetDateTime.now().plusSeconds(delay));
} catch (ServiceBusException sbe) {
    if (sbe.getCause() != null && sbe.getCause() instanceof AmqpException) {
        final AmqpException amqpException = (AmqpException) sbe.getCause();
        if (amqpException.getErrorCondition() == AmqpErrorCondition.LINK_PAYLOAD_SIZE_EXCEEDED) {
            final Throwable innerCause = amqpException.getCause();
            if (innerCause != null && innerCause instanceof BufferOverflowException) {
                System.out.println(innerCause.getMessage());
                MessagePrinter.print(retryMessage);
            }
        }
    }
    throw sbe;
}

The "MessagePrinter" utility class used above can be found here https://gist.github.com/anuchandy/5b08d59e2ac0528dc58b99541c8765f1

haraldk commented 3 months ago

Thanks @anuchandy! I've modified your code slightly to use our logging, but it should otherwise output the same details. Will update the issue with the logged information as soon as possible.

haraldk commented 2 months ago

Here's the output of the extra output from a few occurrences extracted from our logs:

Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=c344a9cf16bc4926b5bc3426c88b7810, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/cab3ff5d-d576-4269-b3ff-5dd576c269de', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:45:10 GMT 2024, creationTime=Thu Aug 29 10:45:10 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:45:11 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21384, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=15eea98e-c2dc-4b08-8839-05bf80be5555}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/cab3ff5d-d576-4269-b3ff-5dd576c269de","version":3,"changed":"2024-08-29T10:45:09.758605392Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }

Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=0d6c5bb09140495882da6abfa2e9cb5e, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:22 GMT 2024, creationTime=Thu Aug 29 10:15:22 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:29 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21349, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=88f9eabc-7452-4168-a0e1-74b5ba3b5b8d}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77","version":1,"changed":"2024-08-29T10:15:22.092180924Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }

Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=0d6c5bb09140495882da6abfa2e9cb5e, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:22 GMT 2024, creationTime=Thu Aug 29 10:15:22 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:29 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21349, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=88f9eabc-7452-4168-a0e1-74b5ba3b5b8d}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77","version":1,"changed":"2024-08-29T10:15:22.092180924Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }

Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=d6059e2e70304d3a922f71b2d944f462, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/658825c8-b0d5-484e-8825-c8b0d5f84ea6', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:15 GMT 2024, creationTime=Thu Aug 29 10:15:15 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:17 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21324, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=b4fd5abc-184a-44a9-9bb4-eac3797588b3}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/658825c8-b0d5-484e-8825-c8b0d5f84ea6","version":2,"changed":"2024-08-29T10:15:15.570698376Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }

The exception message wasn't very helpful in this case, but I hope the extra details can help in finding the issue! 😀

anuchandy commented 2 months ago

Hi @haraldk, thanks! it looks like the value of the field "Message.Properties.userId" is redacted, possibly a filter in the logging configuration? Could you double check and if so, can we output few new impacted messages again including this field value?

anuchandy commented 2 months ago

Hi @@haraldk, I can repro using the shared messages, no need to pull another set of messages (with userId)

anuchandy commented 2 months ago

hello @haraldk, we have a fix for this https://github.com/Azure/azure-sdk-for-java/pull/41706, thanks for helping with the details and test messages!

haraldk commented 2 months ago

Excellent news! I’ll have look, and will test as soon as possible. 😀

anuchandy commented 2 months ago

The version (7.17.14) with the fix should be available in September release. Closing, now that the changes are merged to the main.