apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

Abort conflicting transactions before commit #3290

Open sfc-gh-anoyes opened 4 years ago

sfc-gh-anoyes commented 4 years ago

Transactions that conflict waste server resources. Maybe you do a bunch of reads and send a bunch of mutations to a proxy, only to have to repeat all that work. Or maybe your write conflict ranges even end up in a resolver's memory if you're running multiple resolvers, causing subsequent transactions to abort unnecessarily.

Proposal: Add a transaction option that changes the behavior of non-snapshot reads. When this option is enabled and the storage server can tell that a read request would have a different result at a later version, that read fails with not_committed.
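
Purely for illustration, here's a client-side sketch of what this might look like with the Python bindings. The option name set_fail_on_conflicting_read is made up for this sketch (no such option exists today); the retry loop and error handling are the standard bindings pattern.

```python
import fdb

fdb.api_version(630)
db = fdb.open()

tr = db.create_transaction()
while True:
    try:
        # Hypothetical option from this proposal -- not a real API today.
        tr.options.set_fail_on_conflicting_read()
        balance = tr[b'account/alice']          # non-snapshot read
        tr[b'account/alice'] = b'new-balance'
        tr.commit().wait()
        break
    except fdb.FDBError as e:
        # With the option enabled, a read whose result would differ at a
        # later version fails here with not_committed (1020) instead of
        # at commit time; the conflict range would also show up under the
        # \xff\xff/transaction/conflicting_keys/ special key space (note 3
        # below).
        tr.on_error(e).wait()
```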

Notes:

  1. Normally conflict detection only depends on read/write conflict ranges, which are not plumbed to storage servers. Storage servers would effectively have to infer write conflict ranges from mutations. This makes it behave differently from normal conflict detection, but I think this is OK since this behavior would be opt-in, and the resolver will still keep everything serializable.
  2. I think this could be implemented on the storage server by obtaining a view at the latest version, and then checking the insert version of each ValueOrClearToRef returned. I don't think the latest version used here needs to be committed, but it needs to be at least the read version supplied in the read request. (I'm probably missing all kinds of subtle things here, but I think that's the gist - a rough sketch follows this list.)
  3. In addition to throwing not_committed, the client should add the conflict range to the special key space \xff\xff/transaction/conflicting_keys/
  4. This would also effectively improve latency, since retries would happen earlier rather than waiting for the commit to fail.
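
As a rough sketch of the check in note 2 (all names here are invented stand-ins for the storage server's internals, not real FoundationDB code):

```python
class NotCommitted(Exception):
    """Stand-in for failing the read with not_committed (error 1020)."""
    def __init__(self, key):
        self.conflicting_key = key  # surfaced via conflicting_keys (note 3)

def checked_read(storage, key, read_version):
    # Obtain a view at the latest version the storage server knows about.
    # It need not be committed, but must be at least the read version
    # supplied in the read request.
    latest = storage.latest_version()
    assert latest >= read_version
    entry = storage.read_at(key, latest)  # analogue of ValueOrClearToRef
    # If the entry was written after the transaction's read version, the
    # read would have a different result at a later version, so the
    # transaction is doomed to conflict at the resolver: fail it now.
    if entry.insert_version > read_version:
        raise NotCommitted(key)
    return entry.value
```
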
xumengpanda commented 4 years ago

Follow-up on note 1: To get the performance benefit, it seems that the time for mutations in the committed transaction to reach the storage server (SS) must be shorter than the time for the get() in the conflicted transaction to reach the same SS. How often is that?

Another thought: can we measure how much potential benefit this would bring? For example, in a txn we can record the time t1 when the first get() is ready and the time t2 when the txn commit fails, and measure (t2 - t1) as an upper bound on the perf benefit.
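
A sketch of that measurement using the Python bindings (the transaction body is a placeholder; 1020 is the not_committed error code):

```python
import time
import fdb

fdb.api_version(630)
db = fdb.open()

def run_measuring_waste(tr):
    t1 = None
    while True:
        try:
            _ = tr[b'some_key'].wait()   # placeholder first get()
            if t1 is None:
                t1 = time.monotonic()    # first read result is ready
            tr[b'other_key'] = b'v'      # placeholder write
            tr.commit().wait()
            return
        except fdb.FDBError as e:
            if e.code == 1020 and t1 is not None:  # not_committed
                t2 = time.monotonic()
                # Upper bound on what failing the read early could save.
                print("wasted %.1f ms" % ((t2 - t1) * 1000.0))
            tr.on_error(e).wait()
            t1 = None

run_measuring_waste(db.create_transaction())
```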

sfc-gh-anoyes commented 4 years ago

To get the performance benefit, it seems that the time for mutations in the committed transaction to reach the storage server (SS) must be shorter than the time for the get() in the conflicted transaction to reach the same SS. How often is that?

Mutations need to reach storage servers quickly (in a healthy cluster) for transactions to be able to read at their read versions. I see your point though - this probably doesn't have much benefit for transactions that take only a few milliseconds. It can have a large benefit for long-running transactions or for applications that re-use read versions.

Can we measure how much potential benefit this would bring? For example, in a txn we can record the time t1 when the first get() is ready and the time t2 when the txn commit fails, and measure (t2 - t1) as an upper bound on the perf benefit.

Worst case is 5 seconds, the length of the MVCC window. Any longer and the transaction would fail to commit with transaction_too_old anyway.