akkadotnet / akka.net

Canonical actor model implementation for .NET with local + distributed actors in C# and F#.
http://getakka.net

BatchingSqlJournal prematurely sends WriteMessageSuccess before the transaction is committed #4941

Closed: Arkatufus closed this issue 3 years ago

Arkatufus commented 3 years ago

In the current BatchingSqlJournal implementation, a batch chunk write is processed inside the ExecuteChunk method, which reads each entry inside the chunk and applies a handler method to each of the entries. For the WriteMessages event, this handler method is HandleWriteMessage which, unfortunately, treats each write request inside the WriteMessages event as a single write request and fires the WriteMessageSuccess event for each one, even though the transaction itself has not been committed yet.

HandleWriteMessage should be changed so that it does an "all or nothing" write of all of the requests. Any status events should also be fired in an "all or nothing" fashion at the end of the operation, and any exception should be re-thrown to the calling ExecuteChunk method so that the transaction can be rolled back properly.
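
Roughly, the shape I have in mind is something like this (just a sketch, not the actual journal code; `HandleWriteMessagesAsync`, `AddInsertParameters`, and the reply buffering are illustrative names/helpers, and the real protocol types may differ slightly between versions):

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Data.Common;
using System.Linq;
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Persistence;

// Sketch only: buffer the success replies instead of sending them per write.
// ExecuteChunk stays responsible for committing the transaction and only
// flushes the buffered replies after the commit succeeds; any exception
// bubbles up so the whole chunk can be rolled back.
public static class WriteHandlingSketch
{
    public static async Task<List<(IActorRef Recipient, object Reply)>> HandleWriteMessagesAsync(
        WriteMessages req, DbCommand command)
    {
        var pendingReplies = new List<(IActorRef Recipient, object Reply)>();

        foreach (var atomicWrite in req.Messages.OfType<AtomicWrite>())
        {
            var payload = (IImmutableList<IPersistentRepresentation>)atomicWrite.Payload;
            foreach (var persistent in payload)
            {
                // hypothetical helper: adds the INSERT for this event to the batched command
                AddInsertParameters(command, persistent);

                // buffered, NOT sent yet - this is the fix for the premature WriteMessageSuccess
                pendingReplies.Add((req.PersistentActor,
                    new WriteMessageSuccess(persistent, req.ActorInstanceId)));
            }
        }

        // Runs inside the transaction owned by ExecuteChunk. If this throws,
        // the caller rolls back and none of the buffered replies are ever sent.
        await command.ExecuteNonQueryAsync();

        return pendingReplies;
    }

    private static void AddInsertParameters(DbCommand command, IPersistentRepresentation persistent)
    {
        // placeholder: serialization and parameter binding elided
    }
}
```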

ismaelhamed commented 3 years ago

@Arkatufus, also https://github.com/akkadotnet/Akka.Persistence.SqlServer/issues/114#issuecomment-496951433

Arkatufus commented 3 years ago

@ismaelhamed yes, I think I know why, I'm still investigating it. I'll open a separate issue once I'm sure I've pinpointed the problem.

Arkatufus commented 3 years ago

@Aaronontheweb, @ismaelhamed We need to draft the fix for this issue; some open questions:

  1. What is a batch? Is it just a group of similar operations that is executed by a single command, or does it have to be atomic?
    • If it has to be atomic, then the transaction should be applied to the whole group, and all operations should fail/succeed at the same time.
    • If it is just a group of operations, then the transaction should be applied per operation, and each success/failure should also be batch-sent to the interested parties at the end of the batch.
  2. If it is an atomic group operation:
    • How do we report failure? How do we mark which operation caused the failure? This would need a special batch failure message that pinpoints the bad messages so that only the responsible actors need to respond to it (see the sketch after this list).
    • Do we aggregate everything (run all operations even after a failure has happened) and report each failing operation (pretty much the same as non-atomic batching), or do we fail early and only report the first failing operation?
    • If we aggregate everything, there would be a big performance cost early on, because we already know the batch failed, but we process everything anyway.
    • If we bail out early, there is a chance that other failure-causing operations exist after the first one. This can create a really bad failure loop where the system keeps sending and failing the same batch over and over, one after the other.
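
For the "special batch failure message" in point 2, I'm imagining something along these lines (purely hypothetical shape, nothing like it exists in the journal protocol today):

```csharp
using System;
using System.Collections.Generic;
using Akka.Persistence;

// Hypothetical message shape only - not part of the current journal protocol.
// Carries the whole rolled-back batch plus the subset of writes that actually
// caused the failure, so only the responsible persistent actors need to react.
public sealed class BatchWriteFailed
{
    public BatchWriteFailed(
        IReadOnlyList<AtomicWrite> batch,
        IReadOnlyList<AtomicWrite> offendingWrites,
        Exception cause)
    {
        Batch = batch;
        OffendingWrites = offendingWrites;
        Cause = cause;
    }

    /// <summary>All writes that were part of the rolled-back batch.</summary>
    public IReadOnlyList<AtomicWrite> Batch { get; }

    /// <summary>The writes identified as the cause of the failure.</summary>
    public IReadOnlyList<AtomicWrite> OffendingWrites { get; }

    /// <summary>The underlying database exception.</summary>
    public Exception Cause { get; }
}
```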

Arkatufus commented 3 years ago

The problem with the current implementation: it does the batching as an atomic operation, but does not treat success reporting as an atomic operation, and it does not report any failures.

Aaronontheweb commented 3 years ago

What is a batch? Is it just a group of similar operations that is executed by a single command, or does it have to be atomic?

So the batching journal cheats, and lumps together multiple operations from different persistent actors into the same "atomic batch" in order to achieve higher overall throughput.

What I think is kind of garbage about the BatchingSqlJournal is how it groups operations together:

https://github.com/akkadotnet/akka.net/blob/1800e38870f2ef98272613327aefdbd9bf14766f/src/contrib/persistence/Akka.Persistence.Sql.Common/Journal/BatchingSqlJournal.cs#L884-L950

From looking at this, it's not clear whether deletes, reads, and writes can all be included in the same SQL transaction. It certainly looks possible based on how that syntax is written, but whether or not those types of operations would actually appear inside the same chunk together isn't clear. If it's possible that reads and writes can all appear in the same transaction, then that is total garbage. Those operations should at least be decoupled by operation type. There's no reason a recovery should fail because someone violated a unique key constraint on a write.
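
By "decoupled" I mean something along the lines of splitting the buffered requests per operation type before a chunk is built, roughly like this (sketch only; assumes the journal protocol request types from Akka.Persistence, and the real buffering code is more involved):

```csharp
using System.Collections.Generic;
using System.Linq;
using Akka.Persistence;

// Sketch only: group pending journal requests by operation type so that
// replays, deletes, and writes never share a chunk (and therefore never
// share a transaction). A failed write can then no longer fail a recovery.
public static class ChunkGroupingSketch
{
    public static IEnumerable<IJournalRequest[]> GroupByOperationType(
        IEnumerable<IJournalRequest> buffer) =>
        buffer
            .GroupBy(req => req switch
            {
                WriteMessages _ => "write",
                DeleteMessagesTo _ => "delete",
                _ => "read" // ReplayMessages and read-side queries
            })
            .Select(group => group.ToArray());
}
```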

As for grouping "like kinded" operations together:

If it is just a group of operations, then the transaction should be applied per operation, and each success/failure should also be batch-sent to the interested parties at the end of the batch.

Everything in the transaction should succeed or fail together. Otherwise there's no point in having "batching" in the first place: the vanilla SQL journal already does the "right" thing in terms of transaction isolation. The point of the BatchingSqlJournal is to break that to try to get better performance. You're trading isolation guarantees in exchange for fewer total outstanding transactions per ActorSystem and better throughput.

So I'd modify the BatchingSqlJournal to break the different operation types out from each other, because the way it's designed now is nonsense if it's possible for reads, writes, and deletes to be commingled. But I'd make sure that all of the individual AtomicWrites that are lumped into a single "batch" transaction get failed / rejected / succeeded as a group.
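
Concretely, the tail end of the chunk execution would need to look something like this (sketch only; it assumes the write handler buffers its replies instead of sending them eagerly, and the exact failure reply type/constructor depends on the journal protocol version):

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;
using Akka.Actor;
using Akka.Persistence;

// Sketch only: every AtomicWrite lumped into the chunk succeeds or fails
// together with the transaction, and replies are sent only once the outcome
// is actually known. Assumes the handlers have already executed their
// statements inside the transaction and buffered their replies.
public static class ChunkCompletionSketch
{
    public static void CompleteChunk(
        DbTransaction tx,
        IReadOnlyList<(IActorRef Recipient, object Reply)> pendingReplies,
        IReadOnlyList<IActorRef> persistentActorsInChunk)
    {
        try
        {
            tx.Commit();

            // the whole batch committed: only now flush the buffered successes
            foreach (var (recipient, reply) in pendingReplies)
                recipient.Tell(reply);
        }
        catch (Exception cause)
        {
            tx.Rollback();

            // the whole batch failed together: every persistent actor in the
            // chunk hears about it, so nobody is left believing its write was
            // persisted (exact failure reply/ctor shape may differ by version)
            foreach (var actor in persistentActorsInChunk)
                actor.Tell(new WriteMessagesFailed(cause));
        }
    }
}
```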

Arkatufus commented 3 years ago

Yes, in the current implementation, write, read (replay / query) and delete can happen in a single transaction.

Aaronontheweb commented 3 years ago

Yes, in the current implementation, write, read (replay / query) and delete can happen in a single transaction.

Yeah, that's crazy and won't even help with performance. Batching reads and writes together into the same transaction where the same table is being modified seems goofy at best.

Aaronontheweb commented 3 years ago

Batching reads and writes together into the same transaction where the same table is being modified seems goofy at best.

Although I should note: the benchmarks may surprise me here and bear out something different, but I don't think reads and writes should be coupled together like this.

to11mtm commented 3 years ago

Yes, in the current implementation, write, read (replay / query) and delete can happen in a single transaction.

Yeah, that's crazy and won't even help with performance. Batching reads and writes together into the same transaction where the same table is being modified seems goofy at best.

Oh, if you start running persistent actors with aggressive journal sweeping (think an AtLeastOnceDelivery actor where you're snapshotting every 50 messages and only keeping 2 snapshots' worth of journal entries), things get nasty pretty quickly on SQL Server. The risk of deadlock contention when a delete fires goes up under heavy writes, and with the batching journal this leads to writes failing; combined with the premature WriteMessageSuccess in this issue's title, lots of haywire journal states can happen.
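
For context, the access pattern I'm describing is roughly this (a minimal sketch using a plain ReceivePersistentActor rather than a full AtLeastOnceDelivery actor; the names and retention math are illustrative):

```csharp
using Akka.Persistence;

// Rough sketch of the "aggressive journal sweeping" pattern: snapshot every
// 50 persisted events and delete journal entries older than roughly two
// snapshots' worth of history.
public class AggressiveSweeper : ReceivePersistentActor
{
    private const int SnapshotInterval = 50;
    private const int SnapshotsToKeep = 2;
    private int _state;

    public override string PersistenceId => "aggressive-sweeper";

    public AggressiveSweeper()
    {
        Recover<int>(evt => _state += evt);
        Recover<SnapshotOffer>(offer => _state = (int)offer.Snapshot);

        Command<int>(evt => Persist(evt, e =>
        {
            _state += e;
            if (LastSequenceNr % SnapshotInterval == 0)
                SaveSnapshot(_state);
        }));

        // Keep roughly two snapshots' worth of journal history. This is the
        // frequent DeleteMessages traffic that ends up competing with writes
        // inside the batching journal's shared transactions.
        Command<SaveSnapshotSuccess>(success =>
        {
            var deleteUpTo = success.Metadata.SequenceNr -
                             SnapshotInterval * (SnapshotsToKeep - 1);
            if (deleteUpTo > 0)
                DeleteMessages(deleteUpTo);
        });
    }
}
```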

But agreed, Akka.Persistence.Linq2Db only batches writes (well, it technically batches reads as well, but given the way SqlJournal does reads it doesn't really need to as badly), and that's the sane thing to do.

I'm guessing you'll see a perf boost on recovery with such a change: at minimum some improvement from not opening transactions, and at best the parallel recovery tests will see some gains as well (the batching journal does suffer on those). As for the premature WriteMessageSuccess: when that is fixed, I expect the single-Persist numbers of BatchingSqlJournal to drop a good bit as a result, multi-Persist possibly not quite so much.

Aaronontheweb commented 3 years ago

This was fixed in v1.4.19