kgrzybek / modular-monolith-with-ddd

Full Modular Monolith application with Domain-Driven Design approach.
MIT License
11.12k stars · 1.75k forks

Unable to scale. #72

Open Stalso opened 4 years ago

Stalso commented 4 years ago

Hello from Belarus! First of all, thank you for your project, and I also want to be friends with you!)

I have one question. In your blog you say that one of the disadvantages of a monolith is that horizontal scaling requires deploying the whole system - but at least the whole system can be deployed and scaled that way. However, if we look at your current implementation, it cannot be scaled at all because of the background processors (ProcessOutboxJob, for example). Am I right?

If I am right, do you have a proposal for how scaling could be achieved?

For example, we could move ProcessOutboxJob to a separate process. It could poll the database and read the outbox as it does now, and then we could scale the rest of the app. But when ProcessOutboxJob publishes events to some bus (RabbitMQ) and multiple instances of the app consume these messages, we will lose message ordering, which is critical. So this is a bad solution, and to be honest, I do not know what to do. I really have no idea how to scale apps that use the Outbox pattern - or the read sides of event-sourced systems, for example.

kgrzybek commented 4 years ago

Hi @Stalso !

There are two perspectives to consider - producers and consumers.

  1. Producer

In the Outbox pattern, the producer is called the Message Relay. You can find more information about this pattern at https://www.enterpriseintegrationpatterns.com/patterns/conversation/Relay.html and https://microservices.io/patterns/data/transactional-outbox.html

You want to have only one producer for a given message (to avoid duplicates), so two solutions are available:

a) You have multiple Message Relays, but then you need to lock the outbox table rows during processing. When you select a particular row, you must lock it (using SELECT ... FOR UPDATE, a query hint, or another mechanism, depending on the database).

b) You have only one Message Relay, as you described.

Answering your question: the current implementation in this repository doesn't support horizontal scaling - it needs a) or b) to be implemented.
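The core idea of option a) can be sketched roughly like this. This repository is .NET with SQL Server (where you would use the UPDLOCK/READPAST table hints, or SELECT ... FOR UPDATE on other engines); the sketch below uses Python with an in-memory SQLite database as a stand-in, so an atomic "claim" UPDATE simulates the row lock. The table and column names are illustrative, not the repository's actual schema.

```python
import sqlite3

# Illustrative outbox table; "claimed_by" plays the role of the row lock
# (SELECT ... FOR UPDATE / UPDLOCK in a real database).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY,
    payload TEXT NOT NULL,
    claimed_by TEXT,            -- NULL = available to any relay instance
    processed INTEGER DEFAULT 0)""")
conn.executemany("INSERT INTO outbox (payload) VALUES (?)",
                 [("OrderPlaced",), ("OrderPaid",), ("OrderShipped",)])
conn.commit()

def claim_next_message(conn, relay_id):
    """Atomically claim the oldest unclaimed message for this relay.

    The single UPDATE is the 'lock': two relay instances can never both
    claim the same row, so each message gets exactly one producer.
    """
    cur = conn.execute(
        """UPDATE outbox SET claimed_by = ?
           WHERE id = (SELECT MIN(id) FROM outbox
                       WHERE claimed_by IS NULL AND processed = 0)
             AND claimed_by IS NULL""",
        (relay_id,))
    conn.commit()
    if cur.rowcount == 0:
        return None  # nothing left to claim
    return conn.execute(
        "SELECT id, payload FROM outbox WHERE claimed_by = ? AND processed = 0",
        (relay_id,)).fetchone()

def mark_processed(conn, msg_id):
    conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (msg_id,))
    conn.commit()

# Two relay instances competing: each message is claimed exactly once.
published = []
for relay in ("relay-A", "relay-B", "relay-A", "relay-B"):
    msg = claim_next_message(conn, relay)
    if msg:
        published.append((relay, msg[1]))  # "publish to the bus" here
        mark_processed(conn, msg[0])

print(published)  # each payload appears exactly once
```

The design point is that the claim and the lock are one atomic statement, so no two relays can publish the same outbox row.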

Of course, even if you apply one of the solutions mentioned above, duplicates may still appear, because the Outbox pattern guarantees at-least-once delivery. You can find more information in my article here: http://www.kamilgrzybek.com/design/the-outbox-pattern/

This leads us to consumers...

  2. Consumer

Even if you have only one consumer, as I wrote above, you can still receive duplicates, so your consumer should be idempotent: https://www.enterpriseintegrationpatterns.com/patterns/messaging/IdempotentReceiver.html
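A minimal sketch of an idempotent receiver, assuming each message carries a unique id (the class and names below are illustrative, not from this repository): the consumer records processed ids and treats a redelivery as a no-op.

```python
class IdempotentConsumer:
    """Processes each message id at most once, even under at-least-once delivery."""

    def __init__(self):
        self._processed_ids = set()   # in a real system: a table, committed
                                      # in the same transaction as the handler
        self.handled = []

    def handle(self, message_id, payload):
        if message_id in self._processed_ids:
            return False              # duplicate delivery: ignore it
        self.handled.append(payload)  # the actual business logic goes here
        self._processed_ids.add(message_id)
        return True

consumer = IdempotentConsumer()
consumer.handle("msg-1", "OrderPlaced")
consumer.handle("msg-1", "OrderPlaced")  # redelivered duplicate, ignored
consumer.handle("msg-2", "OrderPaid")
print(consumer.handled)  # the duplicate had no effect
```

The important detail in a real implementation is that the processed-id record and the handler's state changes must commit in the same transaction, otherwise a crash between them reintroduces the duplicate.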

If you have multiple physical consumers that are part of one logical consumer (like 3 instances of the same service), you of course want to process each message only once. That scenario is called Competing Consumers: https://docs.microsoft.com/pl-pl/azure/architecture/patterns/competing-consumers. You are right that in this case the order of messages is not guaranteed. How do you solve it? It depends on your domain. Sometimes the order of events is important, sometimes not. Each case should be considered separately.

The classic example is when you receive SomethingDeleted before SomethingCreated. If you expect that this situation can happen, you should store the information about when something was deleted and check that date when SomethingCreated appears - if the creation happened earlier than the deletion, do not create the entity.

I will only add that the problems we are discussing occur in both monolithic and microservices architectures, regardless of what we scale.

I hope I helped :)

Stalso commented 4 years ago

Thanks a lot, @kgrzybek! Another idea I have is to use the clustering capabilities of Quartz.NET or Hangfire. With Quartz.NET, for example, we can use the [DisallowConcurrentExecution] attribute so that a job does not run concurrently, even if we have multiple instances of the same service.
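Clustered schedulers coordinate through lock rows in a shared database; the first-writer-wins idea behind that can be sketched roughly like this (a Python/SQLite illustration with a hypothetical job_runs table, not Quartz.NET's actual internals):

```python
import sqlite3

# Shared table: one row per (job, scheduled fire time). The composite
# primary key makes the INSERT a race that exactly one instance wins.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE job_runs (
    job_name TEXT,
    fire_time TEXT,
    owner TEXT,
    PRIMARY KEY (job_name, fire_time))""")
conn.commit()

def try_claim_tick(conn, job_name, fire_time, instance_id):
    """Return True only for the instance whose INSERT succeeds for this tick."""
    try:
        conn.execute(
            "INSERT INTO job_runs (job_name, fire_time, owner) VALUES (?, ?, ?)",
            (job_name, fire_time, instance_id))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()     # another instance already claimed this tick
        return False

# Both instances wake up for the same scheduled tick; only one runs the job.
runs = []
for instance in ("instance-A", "instance-B"):
    if try_claim_tick(conn, "ProcessOutboxJob", "2020-01-01T12:00", instance):
        runs.append(instance)

print(runs)  # only one instance ran the job for this tick
```

Quartz.NET's clustered job store does this kind of coordination (plus failover) for you, so a scheduled job such as ProcessOutboxJob fires on only one node per trigger.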

But in general, is it OK to have so many polling processes? In fact, for each module we can have an outbox, an inbox, and an internal commands processor. Don't you think that will stress the database? And all of this does not scale well.