Custom receive timeout for nack with SQS using JMS

enVolt commented 5 years ago

The behavior of negative acknowledgment is to change the visibility timeout of a received message to 0. Where the value of NACK_TIMEOUT is not configurable while creating the SQS Factory for JMS.

https://github.com/awslabs/amazon-sqs-java-messaging-lib/blob/master/src/main/java/com/amazon/sqs/javamessaging/acknowledge/NegativeAcknowledger.java#L99

When a message is received, and the processing fails (Listener method throws an error), the message is immediately received again. In most of the cases, the message can be processed with a certain delay.

Is it possible to configure it to not change the visibility timeout, so it respects the Queue's default Receive Timeout configuration?

PS - I originally asked on Stackoverflow

enVolt commented 5 years ago

Anyone?

robertpila commented 5 years ago

I am having the same issue. This solution or the ability to use the SQS Default Visibility Timeout would be nice.

cmyker commented 5 years ago

Same here, immediate re-delivery on NACK results in message quickly ending up in dead letter

AlexRosier commented 5 years ago

We are having the same problem. Using the default that is set on sqs seems to be better behavior. This seems to be done here : https://github.com/awslabs/amazon-sqs-java-messaging-lib/pull/72 but the pull request is closed

vghero commented 4 years ago

Same here. New news on this?

cmyker commented 4 years ago

Workaround on stackoverflow https://stackoverflow.com/a/55187272/1079659

adressin commented 4 years ago

Thanks for the workaround @cmyker! Downside is it affects all change visibility requests... but it solves my immediate issue. +1 for incorporating an option into the library

sedooe commented 4 years ago

See my answer here, I believe this is the most ideal way to keep visibilityTimeout set on the queue itself if you're using Spring JMS.

This default behavior of the library is pretty strange though. If you were going to set it to 0 for every failed message by default, then why do we have visibilityTimeout attribute on the queue itself in the first place?

AlexandreGuidin commented 4 years ago

For those who use spring-jms: I have just tested the @sedooe solution in my local environment with an elasticmq, and it worked.

public class CustomJmsListenerContainerFactory extends DefaultJmsListenerContainerFactory {

    @Override
    protected DefaultMessageListenerContainer createContainerInstance() {
        return new CustomMessageListenerContainer();
    }
}

public class CustomMessageListenerContainer extends DefaultMessageListenerContainer {

    public CustomMessageListenerContainer() {
        super();
    }

    @Override
    protected void rollbackOnExceptionIfNecessary(Session session, Throwable ex) throws JMSException {
 // do nothing
    }

    @Override
    protected void rollbackIfNecessary(Session session) throws JMSException {
        super.rollbackIfNecessary(session);
    }
}

And then:

        CustomJmsListenerContainerFactory factory = new CustomJmsListenerContainerFactory();

I didn't like it too much, but it solves my problem.

AlexandreGuidin commented 4 years ago

I think spring have much to learn with Akka/Alpakka, when it comes to messaging

https://doc.akka.io/docs/alpakka/current/sqs.html

sedooe commented 4 years ago

I think spring have much to learn with Akka/Alpakka, when it comes to messaging

https://doc.akka.io/docs/alpakka/current/sqs.html

I'm glad it worked for you but this strange behavior is coming from this amazon-sqs-java-messaging-lib and has nothing to do with Spring. 🙂

AlexandreGuidin commented 3 years ago

I think spring have much to learn with Akka/Alpakka, when it comes to messaging https://doc.akka.io/docs/alpakka/current/sqs.html

I'm glad it worked for you but this strange behavior is coming from this amazon-sqs-java-messaging-lib and has nothing to do with Spring. 🙂

Sorry, my mistake I was so frustrated that i wasn't thinking straight haha

vghero commented 3 years ago

More than 2 years later still having to use the workaround :)?

danmoorenhs commented 2 years ago

This default behavior of the library is pretty strange though. If you were going to set it to 0 for every failed message by default, then why do we have visibilityTimeout attribute on the queue itself in the first place?

this is exactly what is perplexing us.

we actually have been relying on the visibilityTimeout to give us a simple backoff strategy, only today have we realised that if there is an error downstream or in handling a particular message, the result of this forcing to no wait is that it will attempt to DOS the downstream service.

at first and second looks this is very wrong :(

jahlbornG commented 1 year ago

I too am saddened that there is no fix for this yet. We process messages which involve calls to external apis which have rate limiting. when messages are retried due to rate limit errors, the immediate retries only exacerbate the problem and increase the chances that the message will quickly exhaust all the retries and fail completely.

i investigated some of the options above. @sedooe's solution of overriding the rollbackOnExceptionIfNecessary() method is a fairly simple solution, however, when i looked into the the recover() method which is skipped, i noticed that the unacked messages won't be cleaned out. i'm a little worried that that could cause issues over time.

ultimately, we are forced to go the route of forking the library and directly modifying the negative ack logic for now. would love to see this issue fixed.

kylejPomelo commented 3 months ago

here's a somewhat hackey solution that works by creating an ExecutionInterceptor, and it doesn't rely on the queue's message visibility expiring, and you can configure what's essentially a fixed backoff with a value of your choice

    SqsClientBuilder clientBuilder = SqsClient.builder()
        .overrideConfiguration(ClientOverrideConfiguration.builder()
            .addExecutionInterceptor(new ExecutionInterceptor() {
              @Override
              public SdkRequest modifyRequest(Context.ModifyRequest context,
                  ExecutionAttributes executionAttributes) {
                if (context.request() instanceof ChangeMessageVisibilityBatchRequest request) {
                  List<ChangeMessageVisibilityBatchRequestEntry> newEntries = request.entries()
                      .stream()
                      .map(entry -> entry.toBuilder()
                          .visibilityTimeout(sqsProps.getFixedBackoffSeconds())
                          .build())
                      .toList();
                  return request.toBuilder().entries(newEntries).build();
                }
                return context.request();
              }
            })
            .build())
        .region(Region.US_WEST_2);

by default the aws jms sdk will set the visibility timeout to 0 when there's an exception processing a message, this allows us to override 0 to some other value to delay the retry

awslabs / amazon-sqs-java-messaging-lib

Custom receive timeout for nack with SQS using JMS #75