Particular / NServiceBus.RabbitMQ

RabbitMQ transport for NServiceBus
https://docs.particular.net/nservicebus/rabbitmq/
Other
89 stars 58 forks source link

Retries of headerless message causes indefinite requeuing on classic queues #1222

Closed mstavrev closed 1 year ago

mstavrev commented 1 year ago

Description

When integrating NServiceBus with external systems it is possible to receive a headerless message (still having correct type on the Properties to identify the handler).

When it happens so that the handler of such a message throws an exception (for whatever processing logic error) a new exception occurs on MessagePump's GetDeliveryAttempts due to the implementation not accounting for the possibility of null Headers. The exception causes the message to be NAK-ed to RabbitMQ broker and re queued. It then gets picked up again and the process repeats until either the message's TTL expires or indefinitely if the message/queue lacks expiration settings.

I've encountered the problem on 8.0.1. It is due to this statement: https://github.com/Particular/NServiceBus.RabbitMQ/blob/daa1824378a7bfdbdbd62b6b16e811ca0593996b/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs#L467

The problem may be new on 8.x or it might had existed on older versions, but masked by using (no longer possible on 8.x) TransportTransactionMode.None that would drop the message instead of re-queue.

Expected behavior

Normal error handling to occur, as for other messages that contain non-null headers.

Actual behavior

The message "poisons" the queue, making the handling of others being enqueued potentially not serviced (for example, when there is large number of such headerless messages that cause exception due to intermittent issue on the handler).

Versions

The bug was introduced in:

Steps to reproduce

  1. Create a message handler with intentional exception raising logic
  2. Bring up a bus on a classic queue to receive messages of the said type.
  3. Enqueue a message that would be picked by the handler (the message has correct type), but no headers.

Relevant log output

2023-04-03 14:29:32.8643 [84] ERROR NServiceBus.Transport.RabbitMQ.MessagePump Message processing failed. System.NullReferenceException: Object reference not set to an instance of an object.
   at NServiceBus.Transport.RabbitMQ.MessagePump.GetDeliveryAttempts(BasicDeliverEventArgs message, String messageIdKey) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 460
   at NServiceBus.Transport.RabbitMQ.MessagePump.Process(AsyncEventingBasicConsumer consumer, BasicDeliverEventArgs message, CancellationToken messageProcessingCancellationToken) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 446
   at NServiceBus.Transport.RabbitMQ.MessagePump.ProcessAndSwallowExceptions(AsyncEventingBasicConsumer consumer, BasicDeliverEventArgs message, CancellationToken messageProcessingCancellationToken) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 348
System.NullReferenceException
Int32 GetDeliveryAttempts(RabbitMQ.Client.Events.BasicDeliverEventArgs, System.String)
   at NServiceBus.Transport.RabbitMQ.MessagePump.GetDeliveryAttempts(BasicDeliverEventArgs message, String messageIdKey) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 460
   at NServiceBus.Transport.RabbitMQ.MessagePump.Process(AsyncEventingBasicConsumer consumer, BasicDeliverEventArgs message, CancellationToken messageProcessingCancellationToken) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 446
   at NServiceBus.Transport.RabbitMQ.MessagePump.ProcessAndSwallowExceptions(AsyncEventingBasicConsumer consumer, BasicDeliverEventArgs message, CancellationToken messageProcessingCancellationToken) in /_/src/NServiceBus.Transport.RabbitMQ/Receiving/MessagePump.cs:line 348

Additional Information

Workarounds

Possible solutions

Add an extra check on line message.BasicProperties.IsHeadersPresent() and/or validate message.BasicProperties.Headers is not null before looking for the presence of the "x-delivery-count" header.

Backported to

dvdstelt commented 1 year ago

Hi @mstavrev,

This looks like a critical bug. We're looking into it, trying to reproduce this within a test and fix it. We'll keep you posted on how we progress.

andreasohlund commented 1 year ago

@mstavrev we've confirmed this to be a bug and started working on a fix https://github.com/Particular/NServiceBus.RabbitMQ/pull/1223

andreasohlund commented 1 year ago

@mstavrev I've made some updates to this issue and we'll release a fix shortly. The key change is to clarify that this only happens for classic queues, see https://github.com/Particular/NServiceBus.RabbitMQ/pull/1224, and that switching to a quorum queue could be a workaround.

Can you confirm that you are on classic queues?

mstavrev commented 1 year ago

Hi, thanks for the update. Yes, I can confirm we are using classic queues.

andreasohlund commented 1 year ago

@mstavrev https://github.com/Particular/NServiceBus.RabbitMQ/releases/tag/8.0.2 has been released with the fix