Open lucavice opened 4 months ago
total execution for a similar range as previous test that includes the test is only 42 seconds, instead of 4 hours.
that is pretty amazing
shouldn't I have seen improvements for the Akka v1.5 tests with Akka.Persistence.SqlServer if the Akka.Persistence.Query improvements are separate from the plugin?
yes you would have, so I suspect this has more to do with the way Akka.Persistence.Sql reads the data than it does with the number of queries being executed.
I wonder how much faster this would be if you used the tag table
I think the big impact might be how the Akka.Persistence.Sql implementation is bounding its queries more carefully:
[r].[Ordering] > @Offset AND
[r].[Ordering] <= @MaxOffset
versus
WHERE e.Ordering > @Ordering AND e.Tags LIKE @Tag_sized
The TAKE
parameter is supposed to cap the number of results retrieved, but this subtle change in how that result set is bounded in the first place can have a big impact on the query execution plan.
I think the big impact might be how the Akka.Persistence.Sql implementation is bounding its queries more carefully:
agreed, also as a 'yes and' I will note is in Akka.Persistence.Sql we prefer pulling rows into batches with binary data to deserialize in another stage, versus Akka.Persistence.Sql.Common
's behavior (at least last I knew) of deserializing right from the DbDataReader
. The akka streams implementation gets us lots of code and data cache locality benefits, at the expense of memory alloc.
Additionally, there's the side-benefit that due to the 'copy-to-buffer-object' behavior that the query itself is open shorter, easing stress on connection pools and overall network traffic.
Well, well done @to11mtm . It's crazy that the impact is so big, and it's not even an explicitly documented improvement, just a side effect of coding things in a different, more efficient way.
Hi all,
as promised here, I'm sharing a bit of results of load testing on our system with the aim of comparing performance on our migration from the old
Akka.Persistence.SqlServer
plugin toAkka.Persistence.Sql
.We performed the testing on
Akka v1.5.26
on Azure using Azure Sql Database (set to 50 DTUs, which is fairly low in the scale). We did not use the optional Tag Table performance improvement in this test. The test involves writing into the journal about 2000 events in a few seconds, which then need to be read by about 35 different event processors (which are mostly projections in our event-sourced system), which useEventsByTag
to get access to the stream of events.Run 1:
Akka.Persistence.SqlServer
In this run using Akka.Persistence.SqlServer, the database capacity was saturated for almost 10 minutes. Even though the actual writing of the events was over within a few seconds, the journal reading seemed to take a tremendous toll on the resouces. All the projections were significantly delayed.
By looking at Azure Performance Insights, we can get some hints in what was the most expensive query. We can see that during the test, Azure marked query 25925 aggregate execution time for almost 4 hours. Although the insights functionality does not let me narrow down exactly to only the test, we can see from the graph how most of the 4 hours execution was spent during the test duration.
Query 25925 is the following:
So we see that this query is the one generated by
EventsByTag
for scanning the journal by tag.Run 2:
Akka.Persistence.Sql
We repeated the same test, using Akka.Persistence.Sql instead. The outcome is completely different, as resources briefly peak during the minute the test executes, but then goes back to normal.
Digging into the performance insights, this time we see query 43928 being the most resource-intensive, but total execution for a similar range as previous test that includes the test is only 42 seconds, instead of 4 hours. We also see how there is a bit of a peak of usage during the test, but overall not much more than the baseline when the system is idle (the system still polls the journal every second for new events, even if there are none).
Query 43928 is the following:
The query looks fairly similar to the previous run and I can't quite tell why it is so much more performant than the previous one. The number of executions seem to be comparable. I'm starting to suspect this is due to the Akka.Persistence.Query backpressure improvements in 1.5, rather than this plugin in itself. But I'm not sure, shouldn't I have seen improvements for the Akka v1.5 tests with
Akka.Persistence.SqlServer
if the Akka.Persistence.Query improvements are separate from the plugin?Conclusion
The performance benefit for our reads is just ludicrous, a 350x reduction in aggregated execution time. Perhaps the way we coded our event processors using
EventsByTag
wasn't ideal to start with, but it seems that our use case was just perfect for the type of improvements that this new plugin brings (or the Akka.Persistence.Query improvements). Luckily, we don't normally write thousands of events in short burst in production, so that wasn't an issue in our platform, but this definitely gives us peace of mind that we have a much bigger room for volume growth than before. We originally planned for enabling the Tag Table, but at the moment the new performance seem to be more than adeguate even without that feature.