tikv / sig-transaction

Resources for the transaction SIG
63 stars 13 forks source link

Prove correctness of current implementation of handling non-globally-unique commit_ts #46

Closed MyonKeminta closed 4 years ago

MyonKeminta commented 4 years ago

To implement async commit, we need to support using a timestamp that's not a globally-unique tso as a commit_ts. The hardest problem we meet is that the key collision in write cf. This problem was solved in this PR. What it does is:

  1. For all operations that need to write a rollback record, check write cf before writing it. If found another transaction's write record who has the same key and commit_ts with the current rollback operation, then add a has_overlapped_rollback flag to it if this is a protected rollback, or do nothing if not protected.
  2. For all commit operations, we found that we can avoid all possibilities that a commit overwrites an already-existed rollback record. So we don't need to do any additional checks.
  3. Also note that that PR allows prewritting on a key where there's already a commit record with commit_ts equals to the current start_ts, as long as the record is not a rollback record, nor does it have its overlapped_rollback flag set. Previously in this case it reports WriteConflict.

The problem is, I think, that we need to prove the second point above in a more strict way (like TLA+). For reference, here follows how we came into the conclusion that commit operations doesn't need to check overlapping rollbacks.


Consider we are performing a commit operation for transaction T2 on a key, and another transaction T1 also affects this key but rolled back.

First of all, we can push the max_read_ts before performing rollbacks, so prewrites after the rollback will surely have its min_commit_ts greater than T1.start_ts. So we just need to consider what will happen if on one of the keys that are affected by both transactions, T1's rollback operation happens between prewriting and committing of T2.

  1. If T1 is an optimistic transaction, then its rollback record can be safely overwritten by T2's commit, since it can still cause WriteConflict to T1's later-coming prewrite requests.[1]
  2. If T1 is a pessimistic transaction, then it need to write rollback records only when it has entered the prewrite phase, which means, it should have already successfully acquired all pessimisitc locks. So there's two cases about this, according to if T1 needs to acquire a pessimistic lock on the current key:
    • 2.1. If T1 needs acquiring pessimistic lock on the key, then the rollback must happen after acquiring pessimistic lock on the key. The pessimistic lock cannot be acquired after T2 prewriting, since T2 prewriting writes a lock on the key. If the pessimistic lock is acquired before T2 pewriting, and T2's prewriting cannot succeed before T1 rolling back. So in this case it's impossible that T1's rollback happens between T2's prewrite and commit.
    • 2.2. If T1 doesn't need the pessimistic lock, then it must be an index key, therefore T1 and T2 must have another key in common that needs acquiring pessimistic lock. Since the index key should not be T1's primary key, so if T1 is a non-async-commit transaction, we don't need to care about if its rollback record on secondaries is overwritten. However if T1 is a async-commit transaction, it may need to write protected rollback record on secondaries when someone calls check_secondary_locks on it. (WIP here...)[2]



[1]: This is currently broken by allowing optimistic-prewriting on a key where there's already a commit record whose commit_ts equals to current start_ts, as long as the commit record is not rollback (here). This should be fixed later but is relatively easy, hopefully.

[2]: I just found this case seems to be incorrect when I was drafting this issue. We need to confirm and find someway to fix it if necessary.

sticnarf commented 4 years ago

In case 2.1, after T2's rollback and T1's commit, the stale pessimistic lock request of T2 can succeed, and T2 may commit successfully. I think this can also be a source of incorrectness.

MyonKeminta commented 4 years ago

We don't need to go further now. There's a simple solution that solves all possibility that commit may overwrites rollback, thanks for @youjiali1995 : when protected rollback happens, if there's another transaction's lock on the key, put the current ts into the lock. Then when committing, if the commit_ts is contained in the lock, mark the overlapped_rollback flag on the commit record.