Open erikash opened 8 years ago
Do you mean the on_success/on_error arguments to Event.send?
These callbacks are there for async programs (think asyncio, tornado, twisted, gevent, eventlet), and are not actually called when you use celery as the dispatcher (as you would have to serialize the functions). They won't be useful until you connect thorn to an event loop.
Success callbacks are called as soon as there is a response from the web request, so ordering depends on how fast the URLs respond, same with errbacks.
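To illustrate the point about completion order, here is a minimal sketch using plain asyncio. It does not use thorn's API: `fake_dispatch`, the URLs, and the delays are hypothetical stand-ins for real web requests, and appending to `completed` plays the role of a success callback.

```python
import asyncio

async def fake_dispatch(url, delay, completed):
    # Stand-in for an HTTP POST to a subscriber; `delay` simulates
    # how long that subscriber takes to respond (an assumption).
    await asyncio.sleep(delay)
    # Plays the role of a success callback: runs once the "response" arrives.
    completed.append(url)

async def main():
    completed = []
    # Requests are started in this order: slow subscriber first, fast one second.
    await asyncio.gather(
        fake_dispatch("http://slow.example.com/hook", 0.2, completed),
        fake_dispatch("http://fast.example.com/hook", 0.01, completed),
    )
    return completed

order = asyncio.run(main())
print(order)  # the fast subscriber is recorded first, despite being started second
```

The callbacks fire in response order, not in dispatch order.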
Hi, I wasn't referring to the async callbacks. I'll rephrase the question: suppose the user updates his username twice, a -> b and b -> c, which triggers two HTTP webhook callbacks for subscribed consumers (other microservices, for example). What is the ordering guarantee between these two updates? Is the first update, a -> b, guaranteed to be delivered to subscribers before b -> c, even if a dispatch attempt fails?
If the behaviour is different between dispatchers (celery, tornado, asyncio), then please elaborate.
Does the question make more sense now?
Oops, I'm very sorry for the late reply, but my Github notification queue is miles deep :(
There are no ordering guarantees in the case you mention, where a user updates his username twice.
Even at the database level you cannot guarantee the order; you simply treat whichever value wins in the database as the valid one. Sadly, we face the same problem when dispatching the webhooks.
It's impossible to solve this problem in a distributed system: you could have a distributed lock but I don't think they suit this purpose, or we could use Lamport timestamps/vector clocks, but then AFAIU the webhook consumers will be required to take an active part in the system.
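As a rough sketch of why Lamport timestamps would pull the consumers into the protocol: the `LamportClock` class below is a generic textbook version, not anything thorn provides, and the two-update scenario is the hypothetical a -> b, b -> c case from the question.

```python
class LamportClock:
    """Generic Lamport logical clock (illustrative, not part of thorn)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event, e.g. the sender is about to dispatch a webhook.
        self.time += 1
        return self.time

    def receive(self, remote_time):
        # The receiver must merge the sender's timestamp into its own clock.
        self.time = max(self.time, remote_time) + 1
        return self.time


sender = LamportClock()
consumer = LamportClock()

t1 = sender.tick()  # timestamp attached to the a -> b webhook
t2 = sender.tick()  # timestamp attached to the b -> c webhook

# The webhooks arrive in reverse order; the consumer can only detect this
# because it reads the timestamps and maintains a clock of its own.
consumer.receive(t2)
consumer.receive(t1)
```

Every consumer would have to read, compare, and merge these timestamps, which is the "active part" mentioned above.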
RabbitMQ has some ordering guarantees: if you send two messages from the same connection, they will be received in the same order. But that falls apart if a consumer rejects the message, dies in the middle of processing, or if there's a partition in an HA setup.
So messages coming from multiple clients are impossible to order, but if you consider the data in the database to be the consistent state, then you could work around this by requiring webhook consumers to refetch the data whenever they receive a webhook.
For example when you receive the message:
{"event": "user.changed",
"data": {"username": "george", ...},
"ref": "http://example.com/user/3124/"}
You make a request to http://example.com/user/3124/ to get the canonical version of the data.
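A minimal sketch of such a consumer, assuming the payload shape shown above. `handle_webhook` and `fetch_json` are hypothetical names, and the `fetch` parameter exists only so the example can run without network access.

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    # Default fetcher: GET the URL and decode the JSON body.
    with urlopen(url) as resp:
        return json.load(resp)

def handle_webhook(body, fetch=fetch_json):
    event = json.loads(body)
    # Treat the webhook only as a hint that something changed: ignore the
    # embedded "data" and refetch the canonical version from "ref".
    return fetch(event["ref"])

# Usage with a stubbed fetcher standing in for the real service.
payload = json.dumps({
    "event": "user.changed",
    "data": {"username": "george"},
    "ref": "http://example.com/user/3124/",
})
canonical = handle_webhook(payload, fetch=lambda url: {"url": url, "username": "harold"})
print(canonical["username"])  # value from the canonical source, not the payload
```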
Even in this case the data may change between making the request and receiving the HTTP response. This should illustrate how hopelessly difficult it is to keep data consistent in a distributed environment. Instead of thinking about ordering guarantees, you should consider how you can make state updates idempotent.
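One possible way to get idempotent updates, sketched under a big assumption: that each payload carries a monotonically increasing `version` for the record (for example, assigned by the database that owns it). No such field appears in the example message above, and `apply_update` and `store` are hypothetical names.

```python
def apply_update(store, user_id, username, version):
    """Apply an update only if it is newer than what we already have.

    `version` is an assumed field, not something the example payload contains.
    Returns True if the store changed, False if the update was stale or a repeat.
    """
    current = store.get(user_id)
    if current is not None and current["version"] >= version:
        return False  # stale or duplicate delivery: safe to ignore
    store[user_id] = {"username": username, "version": version}
    return True

store = {}
# Deliveries arrive out of order, and one is repeated.
results = [
    apply_update(store, 3124, "c", version=2),
    apply_update(store, 3124, "b", version=1),  # late arrival, dropped
    apply_update(store, 3124, "c", version=2),  # retry, dropped
]
print(store[3124]["username"])
```

With this shape, redelivery and reordering both converge on the same final state.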
What will the endpoint use the data for? If it associates data with the username in an internal database, then make sure usernames cannot be reused and that historical usernames point to the new name.
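A small in-memory sketch of that idea on the consumer side. `UsernameIndex` is a hypothetical helper, and a real consumer would presumably keep this in its database rather than in dictionaries.

```python
class UsernameIndex:
    """Tracks every username a user has ever had (illustrative only)."""

    def __init__(self):
        self._owner = {}    # any username ever used -> user id
        self._current = {}  # user id -> current username

    def rename(self, user_id, new_name):
        owner = self._owner.get(new_name)
        if owner is not None and owner != user_id:
            # Usernames are never reused by a different user.
            raise ValueError(f"{new_name!r} already belonged to user {owner}")
        self._owner[new_name] = user_id
        self._current[user_id] = new_name

    def resolve(self, name):
        # Historical names resolve to the user's current name.
        return self._current[self._owner[name]]


index = UsernameIndex()
for name in ("a", "b", "c"):
    index.rename(3124, name)

print(index.resolve("a"))  # the old name points at the current one
```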
Thanks for the elaborate response! 👍
I completely agree with you regarding the ordering of commands (e.g. update username): it's practically impossible to synchronise two mouse clicks.
On the other hand there are a few ways for consumers to process the notifications in-order:
If the subscribers are not processing notifications in-order, how would they handle the following stock market scenario:
Order state is updated and the following web-hooks are published after an IOC order is executed by the exchange:
The subscriber received the web-hooks in reverse order:
How will a subscriber handle this scenario? (assuming that fetching the canonical version will not provide enough context)
I'm facing these challenges myself, so I'm eager to hear your opinion!
Thanks, Erik.
Bump!
Hi, Great stuff! 👍
I didn't see any info in the documentation regarding ordering guarantees of posting callbacks (in success and failure scenarios).
Could you please elaborate?
Thanks! Erik.