Particular / NServiceBus.AmazonSQS

An AWS SQS transport for NServiceBus.

FIFO queues support #240

Open VladimirMakaev opened 6 years ago

VladimirMakaev commented 6 years ago

We currently use FIFO queues in production because our system depends on message ordering. We have to process messages in the same order in which they were generated.

Since receiving a message uses the same SDK API, I was able to get the receiving endpoint to consume messages from a FIFO queue. However, certain problems do arise:

  1. I have to use Native Send (https://docs.particular.net/transports/sqs/operations-scripting#native-send-the-native-send-helper-methods-in-c) to send a message, since MessageGroupId and MessageDeduplicationId are not provided by the transport at the moment (see the sketch after this list).
  2. The concurrency limit currently determines the MaxNumberOfMessages for the ReceiveMessageRequest and the number of pumps. Those pumps process messages regardless of their MessageGroupId, which breaks the ordering I expect from FIFO. So I have to stick with a concurrency limit of 1 to keep the ordering invariant in place, which creates a bottleneck. We have a broad range of MessageGroupIds and could leverage concurrency more efficiently.
  3. Due to throttling limitations, batching is a must for publishing. NativeSend is an option in the meantime (though it would be nice if you wrapped it and shipped it with the transport package, so that it uses the same serializer the endpoint has configured and accepts a message plus SQS-specific parameters).
  4. If immediate retries fail for a message, we have to give up processing the whole group.
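
To make point 1 concrete, here is a minimal sketch of a native send to a FIFO queue using the raw AWS SDK for .NET. This is not a transport API; the message body still has to follow the transport's envelope format described in the native-send documentation linked above, and names such as `serializedEnvelope` and `orderId` are illustrative only.

```csharp
// Minimal sketch (not a transport API): send to an SQS FIFO queue with the raw AWS SDK
// for .NET, supplying the FIFO-specific parameters the transport does not set today.
// The body must still follow the transport's envelope format (headers + serialized body)
// described in the native-send documentation linked above.
using System;
using System.Threading.Tasks;
using Amazon.SQS;
using Amazon.SQS.Model;

public static class FifoNativeSend
{
    public static Task SendAsync(IAmazonSQS sqsClient, string fifoQueueUrl, string serializedEnvelope, string orderId)
    {
        var request = new SendMessageRequest
        {
            QueueUrl = fifoQueueUrl,              // must point at a *.fifo queue
            MessageBody = serializedEnvelope,     // transport envelope (headers + body)
            MessageGroupId = orderId,             // ordering is preserved per group
            MessageDeduplicationId = Guid.NewGuid().ToString() // or a content-based id
        };

        return sqsClient.SendMessageAsync(request);
    }
}
```

On the receive side, the per-group ordering only holds while processing is serialized, which in NServiceBus terms is why point 2 currently comes down to `endpointConfiguration.LimitMessageProcessingConcurrencyTo(1);`.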

What are your current plans to support FIFO for this transport package?

danielmarbach commented 6 years ago

Hi @VladimirMakaev

Thanks for your input. We'll look at this more closely, but I doubt we will support it. The abstractions that NServiceBus provides make a few assumptions geared towards general-purpose messaging on top of queues. In many of our talks, webinars, and workshops, as well as in support cases, we recommend that our customers not rely on the ordering of messages in the queue. It simply cannot be achieved reliably with queues. See for example

In many cases, it is possible to talk to the business and actually loosen the ordering constraints, rely on retries, or use sagas to coordinate more complex processes that require a certain order (we use optimistic or pessimistic locking depending on the persister), as well as enforce SLAs on top of that. A sketch of such a saga is below.
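
Here is a minimal sketch of that saga-based approach. The message types (OrderPlaced, OrderBilled, ShipOrder) and the OrderId correlation property are illustrative, not from this thread: the saga records which steps have arrived and only proceeds once all of them are in, regardless of the order in which the queue delivered them.

```csharp
// Minimal sketch: a saga that correlates on OrderId and acts only once all expected
// messages have arrived, independent of the order in which the queue delivered them.
using System.Threading.Tasks;
using NServiceBus;

public class OrderPlaced : IEvent { public string OrderId { get; set; } }
public class OrderBilled : IEvent { public string OrderId { get; set; } }
public class ShipOrder : ICommand { public string OrderId { get; set; } }

public class ShippingPolicyData : ContainSagaData
{
    public string OrderId { get; set; }
    public bool Placed { get; set; }
    public bool Billed { get; set; }
}

public class ShippingPolicy :
    Saga<ShippingPolicyData>,
    IAmStartedByMessages<OrderPlaced>,
    IAmStartedByMessages<OrderBilled>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<ShippingPolicyData> mapper)
    {
        mapper.ConfigureMapping<OrderPlaced>(message => message.OrderId).ToSaga(saga => saga.OrderId);
        mapper.ConfigureMapping<OrderBilled>(message => message.OrderId).ToSaga(saga => saga.OrderId);
    }

    public Task Handle(OrderPlaced message, IMessageHandlerContext context)
    {
        Data.Placed = true;
        return ShipWhenReady(context);
    }

    public Task Handle(OrderBilled message, IMessageHandlerContext context)
    {
        Data.Billed = true;
        return ShipWhenReady(context);
    }

    async Task ShipWhenReady(IMessageHandlerContext context)
    {
        // Until both messages have been seen, the saga just persists what it knows,
        // so queue-level ordering is no longer required.
        if (Data.Placed && Data.Billed)
        {
            await context.Send(new ShipOrder { OrderId = Data.OrderId });
            MarkAsComplete();
        }
    }
}
```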

Hope that helps, Daniel

VladimirMakaev commented 6 years ago

In this particular case, it's about how the legacy system works. I understand that depending on message order is not great; however, that is the reality we are in at the moment. SQS FIFO queues give us the simplest acceptable solution for our use case. But we also want to better integrate the legacy parts of the system with the new parts, gradually "evolve" different components, and replace the parts where the architecture is not ideal.

This is where we see the opportunity to use NServiceBus effectively: it gives us a great framework for building features rather than building plumbing. SQS as a transport provides certain features we want to leverage that make our integration simpler and more robust. It's not about breaking NServiceBus's assumptions, and we don't expect the code to "just work" if we switch the transport. You already have transport-specific behaviour (e.g. transactions), so this could be transport-specific behaviour as well, closing the gap between what you can do directly with the AWS SDK and what the NServiceBus.AmazonSQS transport exposes. And I guess you don't have to modify the core library to support FIFO queues better.

danielmarbach commented 6 years ago

@Particular/aws-maintainers any objections to closing this? We are not going to move on this in the near future.

mauroservienti commented 6 years ago

I'm fine with closing this. However, what's the problem with supporting FIFO queues? There could be a clear advantage from the consumer perspective since, if I understood correctly, there is native deduplication support, which means there is no pressure to implement idempotency at the message handler level.

Where does the complexity, or the long-term problem, of supporting them lie?

danielmarbach commented 3 years ago

We have discussed this further during the current enhancement release and have decided not to tackle it and to focus on other things first. Several reasons make this feature quite involved to support, and we are not sure it is worth the complexity it would introduce into the transport, given the small number of requests we have received for it and the fact that there are higher-level alternatives.

For example, the transport is currently geared towards the assumption that it feeds from a regular SQS queue with very high scalability. Supporting FIFO queues as the main input queue would require a dedicated internal pump, and potentially settings to tweak that pump's characteristics, and the pump would need to deal with the quite low throughput limits of an SQS FIFO queue. On top of that, we currently derive the FIFO queue used for delayed delivery by convention from the endpoint name, and we do the same for the destinations we send to. Introducing FIFO queue support for the endpoint queue would mean the native delayed delivery FIFO approach would have to change to allow overriding the delayed delivery FIFO queue name. Routing would then also be more complex, because the user would need to be able to tell the transport whether a destination uses a regular, endpoint-derived delayed delivery queue or an overridden one.

In general, we also strive to have a consistent feature set across the transports and try to make sure each transport stays within reasonable complexity boundaries. It is always a complex tradeoff between what the transport should support by enabling native features and how closely it should stay to the common transport feature set. We think that in certain cases it is OK to fall back to native SDK usage, or to use another feature like the outbox or sagas to achieve a certain behavior (a minimal outbox sketch follows).
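
As a rough illustration of the outbox alternative mentioned above, here is a minimal configuration sketch. The endpoint name and the choice of SQL persistence are illustrative; connection string, dialect, and other persistence settings are omitted for brevity.

```csharp
// Minimal sketch: endpoint-level deduplication via the outbox instead of relying on
// FIFO native deduplication. Persistence choice and endpoint name are illustrative;
// connection string, dialect, and related persistence settings are omitted.
using NServiceBus;

var endpointConfiguration = new EndpointConfiguration("Sample.Endpoint");
endpointConfiguration.UseTransport<SqsTransport>();
endpointConfiguration.UsePersistence<SqlPersistence>();
endpointConfiguration.EnableOutbox();

var endpointInstance = await Endpoint.Start(endpointConfiguration);
```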

That doesn't mean this decision is set in stone; it just means we are not going to do it in the next enhancement release.

DavidBoike commented 1 year ago

For completeness, in May 2021 Amazon announced general availability of high throughput mode for FIFO queues, "allowing you to process up to 3000 messages per second per API action. This is a tenfold increase compared to current SQS FIFO queue throughput quota."