Closed andreimatei closed 2 years ago
@andreimatei and I have been discussing this in private over the past few days, but it's worth keeping the conversation public.
I believe that currently this behavior cannot result in very bad outcomes (in particular, atomicity violations), since the txn's writes only become visible to others once the txn recovery protocol successfully moves the txn record from the STAGING to the COMMITTED state.
This is also my understanding because we currently have asymmetric visibility into the state of a transaction commit between the transaction coordinator itself and onlookers. The coordinator itself will consider a transaction committed when it reaches the implicit commit state. An onlooker (pusher) will only consider it committed after it has recovered the transaction record and moved it into an explicit commit state. Similarly, a transaction coordinator will only be able to begin resolving its intents once its transaction record is in an explicit commit state. So this protects us from any data corruption due to atomicity violations.
However, I do think it's possible that this could lead to this confusing (soft) assertion: https://github.com/cockroachdb/cockroach/blob/96f8f33dceeec0ad6168e2dffa98b952892bdf7b/pkg/kv/kvserver/batcheval/cmd_recover_txn.go#L106-L120
We don't seem to have any reports of that, but it's possible.
We should fix it by preventing transitions from STAGING->ABORTED if the implicit commit condition is satisfied. This would probably entail having an EndTxn(Rollback) that encounters a STAGING record kick off txn recovery (unless the client knows that it will fail). And maybe also the client shouldn't send a rollback in the first place if the status is ambiguous; the client could kick off recovery itself.
This is pretty much my thinking, although I'd make it more general so we can also fix https://github.com/cockroachdb/cockroach/issues/48302 at the same time. Let's update the txnCommitter
to keep track of when a transaction has been committed (implicitly, explicitly, or ambiguously). This should be easy to do here:
https://github.com/cockroachdb/cockroach/blob/96f8f33dceeec0ad6168e2dffa98b952892bdf7b/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go#L209
We can then reject any other requests that pass through it.
And just to be explicit here, I think any solution needs to incorporate cooperation from the client. In fact, I'd prefer solving this in the client entirely. We're not operating in a byzantine model where we need to protect against clients going crazy and violating atomicity intentionally. So the simplest fix is to block out incorrect rollbacks right in the TxnCoordSender
(or its interceptor stack).
However, for the sake of argument, if we didn't want to solve this in the client, I think the solution here would be to force any STAGING -> ABORTED
transition to go through the txn recovery procedure. That solution seems incorrect to me though. If we do that, then why aren't we forcing STAGING->COMMITTED
transitions to do the same thing? Couldn't the client have gotten that wrong as well?
A transaction is implicitly committed if it has a STAGING txn record, and all its in-flight writes were successfully written (at copacetic timestamps). We like to think of a transaction that satisfies this implicit-commit condition as committed. Surprisingly, it turns out that that coordinator can rollback an implicitly-committed txn. If it sends a rollback, nothing will stand in its way and the transaction will be cleaned up. This is definitely not what we've intended. I believe that currently this behavior cannot result in very bad outcomes (in particular, atomicity violations), since the txn's writes only become visible to others once the txn recovery protocol successfully moves the txn record from the
STAGING
to theCOMMITTED
state. The recovery protocol will error out with an assertion failure if it loses a races to a rollback.Even though currently atomicity is preserved and I guess the only side-effect of this rollback is that assertion error being encountered by concurrent recoverers, we don't like it that resolving implicitly-committed intents need to wait for the replication of the
COMMITTED
transaction. What we'd want would be to allow both the coordinator and other pushers to resolve intents as soon at the implicit commit condition is proved to be satisfied - so for example we'd like to run theRecoverTxn
request in anAsyncConsensus
batch. We cannot currently do that because we'd also need to ensure that multipleRecoverTxns
all succeed even if some intents have been resolved. In order for that to be the case, we need to store the txn id in committed values (talked about around #26915). Anyway, that's the forward-looking plan, and this rollback behavior is a major fly in the ointment because, at that point, it would result in atomicity failures.We should fix it by preventing transitions from STAGING->ABORTED if the implicit commit condition is satisfied. This would probably entail having an
EndTxn(Rollback)
that encounters aSTAGING
record kick off txn recovery (unless the client knows that it will fail). And maybe also the client shouldn't send a rollback in the first place if the status is ambiguous; the client could kick off recovery itself.