Particular / NServiceBus

Build, version, and monitor better microservices with the most powerful service platform for .NET
https://particular.net/nservicebus/

Support dedicated Outbox database #4665

Open timbussmann opened 7 years ago

timbussmann commented 7 years ago

Currently the Outbox requires business data to be stored in the same database as the Outbox data in order to provide the stated guarantees.

There seems to be a use case where you want to separate the Outbox data from the business data (even using different databases) and use DTC to keep the two databases in sync. This setup can make sense when the transport does not support DTC but using DTC is otherwise not an issue. See this conversation for more detail: https://github.com/Particular/docs.particular.net/issues/2483

It seems this scenario is currently not supported, as there is no way to open a TransactionScope at the right point.

On first thought, something like the following steps would enable the described scenario:

  1. receive incoming message
  2. start transaction scope
  3. start outbox storage transaction
  4. invoke handlers (and use business data storage transactions)
  5. commit transaction scope (after committing the outbox and business data storage transactions)
  6. dispatch messages
  7. mark messages as dispatched
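
The ordering of the steps above can be modelled with a minimal sketch. This is a toy simulation in Python, not NServiceBus code: `Store`, `TransactionScope`, and `process` are hypothetical names invented for illustration, and the toy scope simply commits or rolls back both stores together, whereas a real DTC transaction would use two-phase commit.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical stand-in for a database enlisted in a distributed transaction."""
    committed: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


class TransactionScope:
    """Toy ambient transaction: commits every enlisted store, or none of them."""

    def __init__(self, *stores):
        self.stores = stores

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        for store in self.stores:
            if exc_type is None:
                store.commit()
            else:
                store.rollback()
        return False  # re-raise handler errors so the message can be retried


def process(message_id, handler, outbox_db, business_db, dispatch):
    # 1. Receive the incoming message and look for an existing outbox record.
    record = outbox_db.committed.get(message_id)
    if record is None:
        # 2. + 3. Open a scope spanning the outbox and business databases.
        with TransactionScope(outbox_db, business_db):
            # 4. Invoke the handler; it writes business data and returns
            #    the outgoing messages to store in the outbox.
            outgoing = handler(business_db)
            outbox_db.write(message_id, {"messages": outgoing, "dispatched": False})
        # 5. Leaving the scope without an error committed both stores.
        record = outbox_db.committed[message_id]
    if not record["dispatched"]:
        # 6. Dispatch outside of the transaction scope.
        for message in record["messages"]:
            dispatch(message)
        # 7. Mark the messages as dispatched.
        outbox_db.committed[message_id]["dispatched"] = True
```

Calling `process` twice with the same `message_id` runs the handler and dispatches only once; if the handler raises, neither store keeps its pending writes.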

Thoughts @SzymonPobiega @Particular/nservicebus-maintainers cc @Niklas-Peter

yvesgoeleven commented 7 years ago

Given that the outbox feature was created to be able to move away from DTC, and that we heavily rely on this fact for environments where DTC cannot be used (e.g. Azure), we must make sure that adding this feature does not prevent endpoints from running in these environments.

Furthermore, it should be tested with all technology that actively prevents resources from enlisting in transactions (like the ASB transport).

andreasohlund commented 7 years ago

> We must make sure that adding this feature does not prevent endpoints from running in these environments.

One potential solution is to add the needed extensibility (new pipeline stage) to the core but have relevant persisters provide this mode (Sql, NH, Raven).

yvesgoeleven commented 7 years ago

Also, make it opt-in. Relying on DTC is never a good idea as it reaches its limits quite quickly. 2PC, the protocol behind DTC, is an exponential communication protocol. Each resource manager involved in the transaction has to communicate successfully with all other resource managers twice. So that means that 1 transaction, spanning 2 db servers and a queue, will cause 8 communication messages behind the scenes. If one of these fails (e.g. because of a network partition), then the transaction will go in doubt and never complete or time out. The more resources involved and the bigger the odds of network partitions (huge odds in Azure/Amazon), the less likely DTC will work properly.

This is a big reason (but not the only one) why most modern brokers and even databases (like SQL Database in Azure) do not support DTC.

SzymonPobiega commented 7 years ago

Generally I am 👎 for the separation of outbox and business data. The only argument for having separation is to not store excessive amounts of outbox-related data in the business database (outbox can grow quite big in a high-volume endpoint).

This problem can be addressed by using the Outbox-Inbox combo which I proposed some time ago for SQL persistence: https://github.com/Particular/NServiceBus.Persistence.Sql/pull/58. TL;DR the deduplication data is kept separate (and referred to as Inbox). The Outbox is very small and limited to stuff that is actually not yet dispatched (which means <= MaxConcurrency). Furthermore, the Inbox can be located in a different storage without the need to have DTC between these storages. Outbox data (small and bounded) is still kept in the same store as the business data.

What's more, in this design the outbox can be built into sagas (or any business entity for that matter) as there is only one batch of undispatched messages to be stored there, so the impact on max entity size (important for Azure) is pretty small.

In a multi-endpoint system it might actually make sense to have a shared (and large, fixed-size) inbox and outboxes in each endpoint's DB. This way the fixed-size character of the inbox becomes an advantage: the average total throughput is stable and the size of the inbox can be calculated easily.
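
To make the Inbox/Outbox split easier to discuss, here is a rough Python model of how such a design could behave. It is only a sketch under assumptions: the class names, the step ordering in `process`, and the oldest-first eviction of the fixed-size inbox are all hypothetical and are not taken from the linked pull request.

```python
from collections import OrderedDict


class Inbox:
    """Fixed-size deduplication store; may live in a different database (assumed design)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.seen = OrderedDict()

    def contains(self, message_id):
        return message_id in self.seen

    def add(self, message_id):
        self.seen[message_id] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # evict the oldest ID to stay fixed-size


class BusinessStore:
    """Business data plus the small outbox, written together in one local transaction."""

    def __init__(self):
        self.data = {}
        self.outbox = {}  # message_id -> outgoing messages not yet dispatched

    def commit(self, message_id, changes, outgoing):
        # Models a single atomic local transaction; no DTC involved.
        self.data.update(changes)
        self.outbox[message_id] = list(outgoing)


def process(message_id, handler, inbox, store, dispatch):
    if inbox.contains(message_id):
        return  # duplicate of a fully processed message
    if message_id not in store.outbox:
        # First attempt: run the handler and store its results atomically.
        changes, outgoing = handler()
        store.commit(message_id, changes, outgoing)
    # Either a first attempt or a retry after a crash: (re)dispatch.
    for message in store.outbox[message_id]:
        dispatch(message)
    # Record the ID for deduplication before dropping the outbox entry,
    # so one of the two stores always knows about the message.
    inbox.add(message_id)
    del store.outbox[message_id]
```

In this model the outbox only ever holds entries for messages that are in flight, while the inbox bounds how far back duplicates can be detected: an ID evicted from the inbox would be processed again if it were redelivered.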

timbussmann commented 6 years ago

We decided to move this into the platform dev repository as this feature affects more than just core. I'm closing this issue as prioritization will be handled by platform development. For discussions and thoughts please feel free to continue commenting on this issue as pdev is currently not publicly available.

kbaley commented 1 year ago

To keep the discussion public, we're re-opening this. It's still a platform-level issue, but the initial analysis can be done with NServiceBus and we can see where it goes from there.

SzymonPobiega commented 1 year ago

@kbaley if we are making it public (👍 💯 ) I think we should quote here all the facts we learned about the subject to provide the context for everyone to discuss.