apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0
14.42k stars 1.31k forks source link

Reduce conflict rates via reordering transactions within/across batches. #2056

Open alexmiller-apple opened 5 years ago

alexmiller-apple commented 5 years ago

To record some of the conversation that @xumengpanda and I had at VLDB about Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering and the overall topic of transaction reordering...

FDB already batches transactions and is optimistic, so we already fall into the space where these techniques are applicable. The described algorithms could be implemented in the resolver, and the proxy<->resolver protocol changed to allow the resolver to reorder transactions. However, we were concerned about the additional CPU usage on resolvers, since transaction reordering across two resolvers would probably have a higher conflict rate than no reordering on one resolver.

However, reordering transactions to maximally avoid intra-batch conflicts can be done on the proxy before even sending the transactions to the resolver. Any transactions which the proxy marks as aborted during this rearrangement don't even need to be sent to the resolver, which would actually be a direct speedup of the resolver. I'm far more comfortable with the idea of pushing the CPU cost of commits a big higher and requiring an additional proxy if that means fewer commits.

I think we also have an opportunity to reorder transactions across batches. Resolvers need to wait for each version's batch in version order, so there will be times when a resolver will have a queue of batches to process. Currently, we'd evaluate this queue of requests in order and independently. I think it'd be fine for us to move transactions across version batches as long as we communicate the final, correct commit version to the client. This would definitely have to be done on the resolver and cost resolver CPU, and I'm not sure how much we'd actually gain from doing so.

xumengpanda commented 5 years ago

A concern about avoiding (intra-batch) conflicts is how much benefit it can bring to us in a heavy-conflict scenario and how much overhead it causes in a light-conflict scenario.