antirez / disque

Disque is a distributed message broker
BSD 3-Clause "New" or "Revised" License

Strict ordering for jobs #179

Open ambasta opened 8 years ago

ambasta commented 8 years ago

Hi,

Consider messages M1, M2, ..., Mn, with each message having a feature x such that M1.x = x1, M2.x = x2, ..., Mn.x = xn. Now consider another message Mz where Mz.x = x1.

Given, say, p consumers waiting for jobs, is it possible to enforce that Mz is processed strictly after M1 has been processed? Meanwhile, M1, M2, ..., Mn can be processed in any order.

ambasta commented 8 years ago

Message Groups in ActiveMQ exemplify this use case perfectly.

mathieulongtin commented 8 years ago

Have M1 create Mz before it's done.


ambasta commented 8 years ago

That won't work.

Let's say we have 3 consumers C1, C2, C3 and the messages are pushed in the order M1, M2, Mz, M3, ...

In this case C1 will pick up M1, C2 will pick up M2, and C3 will pick up Mz, thus violating the requirement that Mz be processed only after M1.

mathieulongtin commented 8 years ago

It seems Mz depends on M1. When C1 picks up M1, after it's done, it creates Mz. Disque doesn't support job dependencies at this point.


ambasta commented 8 years ago

> It seems Mz depends on M1. When C1 picks up M1, after it's done, it creates Mz. Disque doesn't support job dependencies at this point.

No. M1 and Mz are created by a producer in a strict order, not by the consumer. What I am requesting (as a feature, if unsupported) is the ability to ensure ordered delivery of messages based on some message parameter, functionality identical to Message Groups in ActiveMQ.
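To make the request concrete, here is a rough sketch of routing by a message parameter: every message with the same x is mapped to the same queue name, so M1 and Mz land on one queue in producer order. The names `queue_for` and `route` are hypothetical, nothing here is a Disque or ActiveMQ API, and this only narrows the problem: it still assumes each queue is drained by a single consumer that receives jobs in order, which is the guarantee being asked about.

```python
import zlib
from collections import defaultdict


def queue_for(group_key, partitions=4, prefix="jobs"):
    """Map a group key (the feature x) to one of a fixed set of queue names."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    index = zlib.crc32(group_key.encode("utf-8")) % partitions
    return f"{prefix}:{index}"


def route(messages, partitions=4):
    """Group (name, x) pairs by target queue, preserving producer order."""
    queues = defaultdict(list)
    for name, group_key in messages:
        queues[queue_for(group_key, partitions)].append(name)
    return dict(queues)


# The push order from the thread: M1, M2, Mz, M3, with Mz.x = M1.x = x1.
produced = [("M1", "x1"), ("M2", "x2"), ("Mz", "x1"), ("M3", "x3")]
routed = route(produced)
```

M1 and Mz always share a queue here, while M2 and M3 may land anywhere and can be processed in any order relative to them.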

antirez commented 8 years ago

Hello, in general you can force ordering in an unordered queue if you have some way to keep state during job processing.

So imagine M1, M2, M3, and you want the messages to be processed in this strict order: 1, 2, 3. You assign each message a random unique ID:

M1.id = 3f786850e387550fdab836ed7e6dc881de23001b
M2.id = 89e6c98d92887913cadf06b2adb97f26cde4849b
M3.id = 2b66fd261ee5c6cfc8de7fa466bab600bcfe4f69

Each message also has a previous_id field. (The id and previous_id are part of the message body, whatever it is: JSON, XML, whatever you want.)

The previous_id of M1 is set to 0000000000000000000000000000000000000000 to signal that M1 can be processed whatever the previous state of the object it operates upon. M2's previous_id is instead set to M1's id, and so forth.

So what happens is that the messages, when operating on some resource, record their id as the current state in the data store used. Even if the messages arrive out of order, the state field of the store (assuming it's a consistent store) will force the messages to be executed in the right order. If M2 arrives before M1, it will be discarded. Later M1 will arrive, or be scheduled again and retransmitted, and eventually it will be processed. Now M2 can be executed. And so forth...

I hope this solves your problem. There are no plans to add something built-in for in-order processing into Disque at the moment. There are already too many queues attempting to do that, and the design trade-off that rules this feature out is the same one that opens up so many possibilities for Disque, in my opinion.
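The scheme above could be sketched roughly as follows. This is only an illustration under assumptions: `Resource` stands in for the consistent data store, `drain` simulates rejected jobs being redelivered, and all names are hypothetical rather than part of any Disque client. In a real store the check and the update in `try_apply` would need to be a single atomic operation.

```python
import secrets

ZERO_ID = "0" * 40  # sentinel previous_id: "no predecessor required"


def make_chain(payloads):
    """Build messages where each one carries the id of its predecessor."""
    messages, previous_id = [], ZERO_ID
    for payload in payloads:
        message = {
            "id": secrets.token_hex(20),  # random 40-hex-character id
            "previous_id": previous_id,
            "payload": payload,
        }
        messages.append(message)
        previous_id = message["id"]
    return messages


class Resource:
    """Stand-in for the data store holding the resource's state field."""

    def __init__(self):
        self.last_id = ZERO_ID
        self.applied = []

    def try_apply(self, message):
        """Apply the message only if its predecessor was the last one applied."""
        prev = message["previous_id"]
        if prev != ZERO_ID and prev != self.last_id:
            return False  # arrived too early: do not acknowledge the job
        self.applied.append(message["payload"])
        self.last_id = message["id"]
        return True


def drain(resource, deliveries, max_attempts=100):
    """Simulate redelivery: rejected messages go back to the end of the line."""
    pending, attempts = list(deliveries), 0
    while pending and attempts < max_attempts:
        attempts += 1
        message = pending.pop(0)
        if not resource.try_apply(message):
            pending.append(message)
    return not pending  # True if every message was eventually applied
```

For example, delivering a three-message chain as M2, M3, M1 still applies the payloads in chain order, because M2 and M3 are rejected until M1 has updated the state. One caveat of treating the all-zero previous_id as unconditional: a redelivered duplicate of M1 would be applied again, so a real consumer would also want to deduplicate on id.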

ambasta commented 8 years ago

Hey antirez,

Thanks for the response. While I understand the approach you've outlined above, it won't work for a distributed producer or for multiple producers. I do understand the challenges and limitations of built-in ordering, however.

Thanks for the reply