Revision guard store horizontal scaling

thenativeweb / node-cqrs-eventdenormalizer

Node-cqrs-eventdenormalizer is a node.js module that implements the cqrs pattern. It can be very useful as eventdenormalizer component if you work with (d)ddd, cqrs, domain, host, etc.

http://cqrs.js.org/pages/eventdenormalizer.html

MIT License

Revision guard store horizontal scaling #82

Open tommiii opened 5 years ago

tommiii commented 5 years ago

Is it possible to change the db for the revision guard store? I see it saves the events in memory, but I have some concerns about horizontal scaling. Are there any limits? What happens if I'm missing a huge number of events?

nanov commented 5 years ago

  // optional, default is in-memory
  // currently supports: mongodb, redis, tingodb, dynamodb and inmemory
  // hint settings like: [eventstore](https://github.com/adrai/node-eventstore#provide-implementation-for-storage)
  revisionGuard: {
    queueTimeout: 1000,                         // optional, timeout for non-handled events in the internal in-memory queue
    queueTimeoutMaxLoops: 3,                    // optional, maximal loop count for non-handled event in the internal in-memory queue
    startRevisionNumber: 1,                     // optional, if defined the denormalizer waits for an event with that revision to be used as first event

    type: 'redis',
    host: 'localhost',                          // optional
    port: 6379,                                 // optional
    db: 0,                                      // optional
    prefix: 'readmodel_revision',               // optional
    timeout: 10000                              // optional
    // password: 'secret'                          // optional
  }

Source: Readme.

tommiii commented 5 years ago

From the code you just posted, the comments on queueTimeout and queueTimeoutMaxLoops say that non-handled events sit in an internal in-memory queue. I'm wondering what the limits of this 'in-memory queue' are.

nanov commented 5 years ago

Your original question was about different db implementations and horizontal scaling.

In order to achieve true horizontal scaling, you'll need to use some sort of db (most probably redis); the quoted section explains how to configure it.

I am not sure what you mean by 'limits' of the in-memory implementation, but I wouldn't suggest it for production use, mainly because it has no persistence whatsoever (hence the name), which could lead to unwanted results on app restart.

alemhnan commented 5 years ago

Hi @nanov, I'll chime in.

We use a db for the revision guard store (mongodb in our case). We did a few tests to get a better understanding of the behavior of the guard store. We noticed that when events for a specific aggregate arrive in the order 1-3-4-5-2, the guardstore holds events 3-4-5 until 2 is received and then runs 3-4-5. We were not able to observe those events in the datastore itself. The collection did not have those events stored (only the 'LAST_SEEN_EVENT' and the last revision seen for that aggregate).

We therefore assumed that those events are stored in-memory (by a combo of misunderstanding the doc and our observations). And of course I was a bit concerned about the in-memory situation.

Do you know if what we observed is actually the case? Or do you know why we did not see those events stored in the mongodb collection? I'm going to retry those cases anyway to see if we made a mistake (most likely).

nanov commented 5 years ago

It's hard to tell without some actual test case.

The guardstore does not store events, but rather revisions (of aggregates). When a missing event is detected (based on revisionNumber), it keeps the event that arrived early in memory and waits a preconfigured amount of time for the missing events to arrive. If they arrive, they are applied in order and then the stored event is applied as well. If they don't arrive, a missingEvent callback is fired, which allows you to fetch and apply the missing events yourself.
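
To make the behavior described above concrete, here is a minimal, self-contained sketch of the idea. It is not the library's implementation: the class name `TinyRevisionGuard`, its options `apply` and `onMissing`, and the event fields `aggregateId` and `revision` are all assumptions chosen for illustration. Only the last applied revision per aggregate is the kind of state a guard store would persist; early events wait in process memory.

```javascript
// Illustrative sketch only; names and shapes are hypothetical, not the library's API.
class TinyRevisionGuard {
  constructor({ queueTimeout, apply, onMissing }) {
    this.queueTimeout = queueTimeout; // ms to wait for a missing revision
    this.apply = apply;               // called for each event, in revision order
    this.onMissing = onMissing;       // called when the gap is not filled in time
    this.lastRevision = new Map();    // aggregateId -> last applied revision (what a store would persist)
    this.pending = new Map();         // aggregateId -> Map(revision -> { evt, timer }), memory only
  }

  handle(evt) {
    const last = this.lastRevision.get(evt.aggregateId) || 0;
    if (evt.revision <= last) return; // already applied, ignore
    if (evt.revision === last + 1) {
      this.applyAndDrain(evt);
      return;
    }
    // Gap detected: keep the early event in memory and start a timeout.
    let queue = this.pending.get(evt.aggregateId);
    if (!queue) {
      queue = new Map();
      this.pending.set(evt.aggregateId, queue);
    }
    if (queue.has(evt.revision)) return;
    const timer = setTimeout(() => {
      const guardRevision = this.lastRevision.get(evt.aggregateId) || 0;
      this.onMissing({ aggregateId: evt.aggregateId, guardRevision }, evt);
    }, this.queueTimeout);
    queue.set(evt.revision, { evt, timer });
  }

  // Apply the event, then any directly following revisions that were waiting.
  applyAndDrain(evt) {
    const queue = this.pending.get(evt.aggregateId);
    let next = evt;
    while (next) {
      this.apply(next);
      this.lastRevision.set(next.aggregateId, next.revision);
      const entry = queue && queue.get(next.revision + 1);
      if (entry) {
        clearTimeout(entry.timer);
        queue.delete(next.revision + 1);
        next = entry.evt;
      } else {
        next = null;
      }
    }
  }
}

// The 1-3-4-5-2 sequence from the earlier comment.
const appliedRevisions = [];
const guard = new TinyRevisionGuard({
  queueTimeout: 1000,
  apply: (evt) => appliedRevisions.push(evt.revision),
  onMissing: (info) => console.log('missing events after revision', info.guardRevision),
});
for (const revision of [1, 3, 4, 5, 2]) {
  guard.handle({ aggregateId: 'a1', revision });
}
console.log(appliedRevisions); // [ 1, 2, 3, 4, 5 ]
```

In this sketch, events 3-4-5 never reach any store while they wait, which would be consistent with seeing only the last revision in the collection.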

alemhnan commented 5 years ago

Ok, thanks for the explanation. That matches our observed behavior.

But out of curiosity, how do you scale the denormalizers horizontally? Since some events might be stored in memory, you keep state in a non-shared area, so you can't plug in another set of denormalizers in parallel. Would it be possible, and would it make sense, to store those temporary events in the same db (where, for instance, LAST_SEEN_EVENT is stored)?

nanov commented 5 years ago

Well, as events need to be processed in order (that is what the revision guard store ensures), horizontal scaling of a single denormalizer instance is not trivial. Running the exact same set in parallel wouldn't do it.

You could, though, run two different sets in the same (clustered) process in order to optimize resource usage.

One way to achieve this is with the right queue management on the message bus side, something like assigning a queue to each different set (each with its own guardstore); then you should be able to run those in parallel and still process events in order.

As missing-event handling happens only on the current node, the in-memory implementation shouldn't be a problem (i.e. the next event is taken from the queue only when the current one is processed, so the guardstore should have no waiting time, as no new events are received until the current one is marked as handled).
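
A rough sketch of the queue-per-set idea, assuming an in-process stand-in for the message bus. `SerialQueue`, `publishToAllSets`, and the set names are hypothetical; a real setup would use the broker's own queues and acknowledgements. Each set gets its own queue, and a queue hands over the next event only after the current one has been marked as handled.

```javascript
// Illustrative sketch only; an in-memory stand-in for one bus queue per denormalizer set.
class SerialQueue {
  constructor(handler) {
    this.handler = handler; // (evt, done) => void, call done() once the event is handled
    this.items = [];
    this.busy = false;
  }

  push(evt) {
    this.items.push(evt);
    this.next();
  }

  // Deliver the next event only if the previous one has been marked as handled.
  next() {
    if (this.busy || this.items.length === 0) return;
    this.busy = true;
    const evt = this.items.shift();
    this.handler(evt, () => {
      this.busy = false;
      this.next();
    });
  }
}

const handledBySetA = [];
const handledBySetB = [];
let finishCurrentA = null; // lets the example control when set A finishes an event

const setQueues = {
  // Set A is "slow": it records the event and finishes later.
  setA: new SerialQueue((evt, done) => {
    handledBySetA.push(evt.revision);
    finishCurrentA = done;
  }),
  // Set B finishes immediately.
  setB: new SerialQueue((evt, done) => {
    handledBySetB.push(evt.revision);
    done();
  }),
};

// Every set receives every event through its own queue.
function publishToAllSets(evt) {
  for (const queue of Object.values(setQueues)) queue.push(evt);
}

publishToAllSets({ aggregateId: 'a1', revision: 1 });
publishToAllSets({ aggregateId: 'a1', revision: 2 });

const setAWhileBusy = handledBySetA.slice(); // set A has only seen revision 1 so far
finishCurrentA();                            // revision 1 marked as handled, revision 2 is delivered

console.log(setAWhileBusy, handledBySetA, handledBySetB);
```

Because each set consumes at its own pace and never receives an event before the previous one is done, the two sets can run in parallel without either one observing a gap.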