Closed by EtherZa 3 months ago
Thanks for looking into this.
Yes, I believe the outbox has deserved a performance improvement pass since its inception.
I wanted to improve the performance of the outbox, especially when it grows into millions of messages. In another project that uses SMB we saw database contention and some potential for improvement, so this should align with that.
The batch send was also something on the roadmap, so it's another good avenue.
Let me take the time to review it in depth and align the implementation.
I would also love to add some performance tests to be able to baseline before and after.
@EtherZa you're definitely on the right track. I've reviewed the PR and provided some suggestions to consider.
As a side note, the bulk produce interface might be leveraged down the road in v3 to send a batch of messages, or when (in future releases?) we bring in auto batching of single message publishes.
I would love to see some benchmark test (as an integration test) using the existing Outbox infrastructure on Azure. I can take that on in a second iteration of this feature.
> I would also love to add some performance tests to be able to baseline before and after.
I hear you with this, but TBH I don't know how/what to benchmark. The gains come primarily from reducing the number of calls over the network and the cumulative latency between those calls. Mocking out the calls would nullify the benchmark, while leaving them in weights the result towards latency, which depends on available bandwidth and round-trip time.
> I would love to see some benchmark test (as an integration test) using the existing Outbox infrastructure on Azure. I can take that on in a second iteration of this feature.
For my own testing, I used the `Given_CommandHandlerInTransaction_When_ExceptionThrownDuringHandlingRaisedAtTheEnd_Then_TransactionIsRolledBack_And_NoDataSaved_And_NoEventRaised` test and manually dumped several thousand messages into a local containerised db. The difference is chalk and cheese.
> As a side note, the bulk produce interface might be leveraged down the road in v3 to send a batch of messages, or when (in future releases?) we bring in auto batching of single message publishes.
Great stuff. There are further improvements that could be made with the outbox model too:

- `OutboxSendingTask` can vastly increase the polling interval if the `OutboxForwardingPublishInterceptor` were to emit an event after the "outboxing" transaction completes. Each balanced app instance could immediately pick up that it had placed a message in the outbox, with others checking in periodically. This could be extended with support for asynchronous notifications à la postgres `NOTIFY`.
- The `LockExpirationBuffer` window. The `LockExpirationBuffer` approach is risky if one part of the operation (publish or SQL sent status) is partially complete.

Battles need to be picked though :)
I have rebased from master and run the first set of benchmarking tests that @zarusz added - thank you for adding them.
The results are skewed, as the outbox is unfairly advantaged by having vastly lower latency and higher bandwidth available to it than the ASB, but they do show an improvement.
Each set of results is from three consecutive runs on a release build using a local SQL instance (containerised) with an Azure Service Bus (Standard tier) in the Australia East region. I have ±78 ms latency to the Australia East data centre (https://www.azurespeed.com/Azure/Latency).
Before the changes:

```
direct
Message Publish took : 00:00:05.1472734 00:00:04.8023864 00:00:04.8906686
Message Consume took : 00:00:08.2347485 00:00:07.9732388 00:00:07.1775174

with outbox
Message Publish took : 00:00:01.1531716 00:00:00.7929830 00:00:00.7882242
Outbox Publish took : 00:01:14.1196937 00:01:14.7346352 00:01:15.8530969
Message Consume took : 00:00:06.9526368 00:00:06.3923101 00:00:06.6098189
```
After the changes:

```
direct
Message Publish took : 00:00:04.8164905 00:00:04.6085346 00:00:05.1897898
Message Consume took : 00:00:06.8671870 00:00:06.6343886 00:00:07.3392817

with outbox
Message Publish took : 00:00:01.3155528 00:00:00.7840798 00:00:00.8276036
Outbox Publish took : 00:00:04.0035331 00:00:04.6453286 00:00:04.5108678
Message Consume took : 00:00:06.6543769 00:00:06.9201661 00:00:06.8933468
```
There are more changes needed before this will be ready to merge. I am busy implementing the three items above and will push ASAP.
Out of interest, I also ran single batches of 10k and 100k messages for the "with outbox" test only.
10k messages:

```
Message Publish took : 00:00:08.2749961
Outbox Publish took : 00:00:18.5741174
Message Consume took : 00:01:07.0192878
```

100k messages:

```
Message Publish took : 00:01:20.7375655
Outbox Publish took : 00:02:59.2045206
Message Consume took : 00:11:00.8795735
```
@EtherZa It is exciting to see way better numbers after your changes, good job!
A couple of thoughts:
Ping me here once you feel you're done with changes for this iteration and I can review once more.
@zarusz I have pushed a commit for the three items above. Please let me know if you are happy with the change, and I will rebase and sign. If you would prefer to defer it to another PR in order to limit scope/provide an alternate solution, that's fine too - I can remove the commit easily from here.
Latest change includes:

- `IMessageBusBulkProducer`. `OutboxSendingTask` can now be understood at a glance :)
- `LockExpirationBuffer`
- `ServiceBusException` (`ServiceBusFailureReason.ServiceBusy`). This should probably sit behind a gate to ensure all processes adhere to the cool off, but for now only the current execution path is affected.

> It might be good to capture the numbers (from your local and/or from the github actions) and to make a note on the Outbox docs - so that folks have a feel about the overhead.
I think my timings are probably a little too biased. While there is a substantial performance improvement, mine are definitely stacked against the real world, with the database being local and the SVB being remote. GitHub Actions would be better as there is no favour and the runs are repeatable/independent... the only problem is that the timings are not being published (no logs for successful tests).
I might consider refactoring and segmenting the bulk vs non-bulk capable transports (low on my priority list). In the future (perhaps v3), elevating the core interfaces to be able to produce a batch of messages would go hand in hand with this PR.
Supporting batch uploads to SVB/SQL would be a nice to have and would definitely be useful but probably not used too often.
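For illustration, a bulk produce surface along the lines discussed might look roughly like this. This is a sketch only: the actual `IMessageBusBulkProducer` shape is defined in the PR, and the member and type names below (`OutboxBulkMessage`, `ProduceToTransportBulk`) are assumptions, not the real API.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Sketch of a message envelope for bulk delivery (names are illustrative).
public record OutboxBulkMessage(
    object Message,
    IDictionary<string, object> Headers);

public interface IMessageBusBulkProducer
{
    // Deliver a batch of messages destined for the same bus/path in one
    // transport round trip, returning how many were accepted so the caller
    // can mark the corresponding outbox rows as sent.
    Task<int> ProduceToTransportBulk(
        IReadOnlyCollection<OutboxBulkMessage> envelopes,
        string path,
        CancellationToken cancellationToken);
}
```

The point of the batch shape is that the per-call network latency is paid once per batch rather than once per message, which is where the measured gains above come from.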
In terms of performance, the impact from the boxing looks negligible IMO. It is too closely aligned with IO to make any tangible difference.
Before - Message Publish took : 00:00:05.1472734 00:00:04.8023864 00:00:04.8906686
After - Message Publish took : 00:00:04.8164905 00:00:04.6085346 00:00:05.1897898
Looking at the consumer times for 10k vs 100k for ASB, it would require adding partitioning and more concurrent consumers to let it scale better.
As long as a single service can publish more messages from the outbox than that service can consume, I can't see an issue. It could become a problem if order is important though, as it would kill scaling out in its current form. In that case, it would be better to lock by bus/path or a custom strategy, so that multiple instances could each deliver different streams.
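The lock-by-bus/path idea could be sketched as a pluggable strategy like the one below. Hypothetical only: SMB's outbox locking does not expose such an interface today, and all names here are invented for illustration.

```csharp
// Hypothetical strategy: rows that share a lock key are delivered by a single
// instance (preserving order within that key), while other instances are free
// to claim different keys and deliver other streams concurrently.
public interface IOutboxLockStrategy
{
    string GetLockKey(string busName, string path);
}

// Lock per bus/path: order is kept per destination, and scale-out works
// across destinations.
public class BusPathLockStrategy : IOutboxLockStrategy
{
    public string GetLockKey(string busName, string path) => $"{busName}:{path}";
}

// Lock the whole outbox: strict global ordering, but only one instance can
// deliver at a time (the current behaviour when order matters).
public class GlobalLockStrategy : IOutboxLockStrategy
{
    public string GetLockKey(string busName, string path) => "global";
}
```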
Please let me know your thoughts and I'll take it from there.
@zarusz Comments resolved, test added and rebased
Thanks for this massive PR @EtherZa!
@zarusz It's a pleasure. Thanks for the help and library :)
This comes as (another) unsolicited/undiscussed PR after investigating some performance issues that I was experiencing. Please feel free to be absolutely brutal with it and/or any aspect that is not to your liking.
The most jarring change is probably going to be the addition of `IMessageBusBulkProducer` and the required changes to `MessageBusBase.ProduceToTransport` to allow batching and access from `OutboxSendingTask`.

Changes:

- `OutboxSettings.MaintainSequence = false` will allow load balanced instances to process the outbox together (default is true to keep the same behaviour).
- `MessageBusBase.ProduceToTransport` signature has been changed to accept multiple messages for the same bus/path. Serialization has also moved into the method to allow the same.
- `OutboxForwardingPublishInterceptor` is ordered with max int to ensure that it is the last interceptor to run. Publishing of messages from the outbox does not run through the interceptor pipeline (already done before being placed in the outbox).

The submission is lacking additional unit tests at the moment (can be added once direction is determined), but existing tests should cover most scenarios as the flow should not change. One test was removed as it was no longer relevant.
Again, I completely understand that this is undiscussed. It was thrown together while looking at what could be done to improve the implementation, so please feel free to fob it off as heading in the wrong direction, or suggest any changes that would provide something more to your liking. - Richard