Open AlexGilleran opened 6 years ago
I think it's a very good idea +1
SPIKE to figure out the best architecture approach
Makes sense to me. I was wondering, though: why create a separate row for each event/subscription pair? Instead of the registry's current approach of associating a "last event ID seen" with each subscription? There's a good chance you explained why and I just missed it.
@kring So you can have retries (although I might not have thought this through enough)
Say you have n events and two instances of the same minion consuming the event stream:

1. Minion 1 starts up and grabs events 1-10; they're set to pending for that subscription.
2. Minion 2 starts up and grabs events 11-20; they're set to pending for that subscription.
3. Minion 1 goes down (maybe the node failed or something).
4. Minion 1 starts up again and grabs events 21-30, because those are the next non-pending, non-completed events. Events 1-10 are still in state pending.
5. Minion 2 completes its page, sets 11-20 to done, and grabs 31-40.
6. Some time passes, etc, etc.
7. Minion 1 grabs the next page of events, but enough time has passed that the lastmodified timestamp on events 1-10 is greater than the retry threshold. So rather than getting the next page of events (41-50 or greater), it gets 1-10; their lastmodified (maybe this should be called something different) is set to now() and they remain in state pending.
8. Minion 1 doesn't crash this time and completes events 1-10, so they're set to done.
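The walkthrough above can be sketched as a tiny in-memory model of the proposed per-subscription event-status table. All names here (`EventStatusTable`, `claim_page`, `RETRY_THRESHOLD`) are illustrative, not the actual Magda schema:

```python
RETRY_THRESHOLD = 60.0  # assumed: seconds before a "pending" page is retried


class EventStatusTable:
    """One row per event/subscription pair: (status, last_modified)."""

    def __init__(self, num_events: int):
        self.rows = {eid: ("new", 0.0) for eid in range(1, num_events + 1)}

    def claim_page(self, page_size: int, now: float) -> list:
        """Stale pending events are retried before new ones are handed out."""
        stale = [eid for eid, (st, ts) in sorted(self.rows.items())
                 if st == "pending" and now - ts > RETRY_THRESHOLD]
        fresh = [eid for eid, (st, _) in sorted(self.rows.items())
                 if st == "new"]
        page = (stale or fresh)[:page_size]
        for eid in page:
            self.rows[eid] = ("pending", now)  # bump lastmodified
        return page

    def complete(self, event_ids: list) -> None:
        for eid in event_ids:
            self.rows[eid] = ("done", self.rows[eid][1])


# Replaying the scenario above:
table = EventStatusTable(50)
page1 = table.claim_page(10, now=0.0)    # minion 1: events 1-10, now pending
page2 = table.claim_page(10, now=0.0)    # minion 2: events 11-20
# minion 1 crashes, restarts, and gets the next *unclaimed* page:
page3 = table.claim_page(10, now=1.0)    # events 21-30; 1-10 still pending
table.complete(page2)                    # minion 2 finishes 11-20
# much later, 1-10 exceed the retry threshold and are handed out again:
page4 = table.claim_page(10, now=120.0)  # events 1-10 again
table.complete(page4)
```

In a real database the claim step would need to be atomic across competing minions (e.g. a single UPDATE ... RETURNING, or row locking), which this sketch glosses over.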
Just keeping track of the last event id doesn't allow for that retry logic... but doing it this way means that events won't necessarily be processed in order. I think that's OK, though - it just becomes incumbent upon the consumers of the events not to trust that the data in the event represents the most up-to-date state of the system... which they shouldn't be doing anyway, because the event might be 3 months old for all they know.
> Just keeping track of the last event id doesn't allow for that retry logic
Ah ok, that makes sense.
I think you could achieve that retry logic without duplicating every event for every subscription, though. I'm sure I haven't thought this through enough either, but I think you could store the ranges of pending events in a separate table.
So each subscription would have a lastmodified timestamp for the ones that are pending. So when a minion asks for the next page:

When events are done being processed:
I guess that's a lot more complicated, so maybe it's not worth it. Storing multiple copies of every event makes me nervous, though. There are lots of events.
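A rough sketch of this range-based alternative, as I understand it: each subscription tracks the highest event id it has handed out, plus a small table of pending ranges that can be retried when they go stale. All names are illustrative, and a real version would also need to cap ranges at the current max event id:

```python
RETRY_THRESHOLD = 60.0  # assumed: seconds before a pending range is retried


class Subscription:
    def __init__(self):
        self.last_event_id = 0  # highest event id ever handed out
        self.pending = []       # rows of (first, last, last_modified)

    def next_page(self, page_size: int, now: float) -> tuple:
        # Retry a stale pending range before handing out new events
        for i, (first, last, ts) in enumerate(self.pending):
            if now - ts > RETRY_THRESHOLD:
                self.pending[i] = (first, last, now)  # bump lastmodified
                return (first, last)
        first = self.last_event_id + 1
        last = first + page_size - 1
        self.last_event_id = last
        self.pending.append((first, last, now))
        return (first, last)

    def complete(self, first: int, last: int) -> None:
        # Done ranges are simply deleted - nothing per-event is stored
        self.pending = [r for r in self.pending if r[:2] != (first, last)]


sub = Subscription()
a = sub.next_page(10, now=0.0)    # (1, 10)
b = sub.next_page(10, now=0.0)    # (11, 20)
sub.complete(*b)                  # range 11-20 is gone entirely
c = sub.next_page(10, now=120.0)  # (1, 10) again: the range went stale
```

The storage cost here is proportional to the number of in-flight pages per subscription, not the number of events.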
> Storing multiple copies of every event makes me nervous, though. There are lots of events.
🤔 yeah. Although hopefully either way (whether keeping a row per event or a row per range) it's safe to delete the ones that are done. Although that poses the question of how you can be absolutely sure that you're creating a subscriptionevent (or subscriptioneventrange) for each subscription every time there's a new event 😬
True, it's safe to delete the ones that are done. But still, if there is an event for every subscription, and there are 10 subscriptions, handling a new event (which connectors can generate, I dunno, tens to hundreds per second, maybe more) requires writing 10 times as much stuff.
In the scheme I outlined above, I don't think it's necessary to change the subscriptioneventrange table for new events. It only needs to be modified when a minion asks for events to process or when it marks them done. This is good because it makes inactive subscriptions (where no one is processing the events) basically free. If a new event is created per subscription, inactive subscriptions are super expensive because their events never get deleted.
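As a back-of-the-envelope comparison, using the (illustrative) figures mentioned above:

```python
events_per_second = 100  # assumed connector event rate
subscriptions = 10

# Row per event/subscription pair: every new event fans out to every
# subscription, active or not.
per_pair_writes = events_per_second * subscriptions  # 1000 rows/sec

# Range scheme: new events only append to the event table; the
# subscriptioneventrange table changes only on claim/complete, so
# inactive subscriptions add no per-event writes at all.
range_scheme_writes = events_per_second  # 100 rows/sec
```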
Good job! @AlexGilleran
Comprehensive & Very impressive design 👍
I like it -- especially the sending events to broker part.
Just a few questions regarding the event DB structure design:

Considering most event-related use cases are expected to process events in order, would it make sense to leave out the "processing events in parallel" feature? That way we could remove the status on the event records and convert it to a much simpler first-in-first-out event queue structure in the database (which could potentially operate faster). i.e.:

- Keeping only a createTime on events (for event expiring) & no other management fields in the event table probably will help to make the DB run faster
- We probably can maintain a separate event log facility for the catch-ups feature, considering it might not be a frequent operation
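The simpler FIFO structure being suggested could look something like this sketch: events carry only an id and a createTime (for expiry), ordering comes from the id, and each subscription just tracks the last id it has processed. Names (`FifoEventQueue`, `poll`, `ack`) are illustrative, not the actual Magda schema:

```python
EXPIRY_SECONDS = 7 * 24 * 3600  # assumed retention window


class FifoEventQueue:
    def __init__(self):
        self.events = []   # (id, create_time, payload), kept in id order
        self.next_id = 1
        self.offsets = {}  # subscription name -> last processed event id

    def append(self, payload, now: float) -> int:
        eid = self.next_id
        self.next_id += 1
        self.events.append((eid, now, payload))
        return eid

    def poll(self, subscription: str, limit: int) -> list:
        """Strictly in-order delivery: no status column, just an offset."""
        last = self.offsets.get(subscription, 0)
        return [e for e in self.events if e[0] > last][:limit]

    def ack(self, subscription: str, event_id: int) -> None:
        self.offsets[subscription] = event_id

    def expire(self, now: float) -> None:
        # createTime is the only management field needed for cleanup
        self.events = [e for e in self.events
                       if now - e[1] < EXPIRY_SECONDS]


q = FifoEventQueue()
for n in range(5):
    q.append({"n": n}, now=0.0)
batch = q.poll("indexer", limit=3)   # events 1-3, in order
q.ack("indexer", batch[-1][0])
batch2 = q.poll("indexer", limit=3)  # events 4-5
```

The trade-off versus the status-column design above is that a slow or crashed consumer blocks everything behind its offset, since there is no per-event retry state.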
Intro
Currently Magda uses event sourcing, but it’s entirely limited to the registry - to work with event sourcing, data has to live in the registry and be changed by the registry. It’d be good to have this separate so we could use it for data outside the registry - e.g. when something in the Content API changes, we could notify other services that use it to update their cached value.
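The kind of decoupling described here can be illustrated with a minimal broker sketch: a service outside the registry (here a hypothetical Content API) publishes a change event to a shared broker, and any subscriber can invalidate its cached value. None of these names come from Magda's actual code:

```python
from collections import defaultdict


class EventBroker:
    """Toy in-process stand-in for the event broker described below."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self.subscribers[topic]:
            handler(event)


broker = EventBroker()
cache = {"header-logo": "old-value"}  # a consumer's cached Content API value

# The consumer drops its cached value whenever the content changes
broker.subscribe("content.changed",
                 lambda ev: cache.pop(ev["contentId"], None))

# The Content API publishes an event when something changes
broker.publish("content.changed", {"contentId": "header-logo"})
# cache entry for "header-logo" is now gone; the consumer re-fetches on demand
```

A real broker would of course be a separate service with durable storage, which is what the sections below go on to design.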
Rationale
Design
Event Broker
Getting Events from the Broker
Sending Events to the Broker
Catching Up
External Webhooks
Diagram