We can do other things! Reducing contention is the cheapest experiment, and if we do it we can still add better alternatives later.
Here are some alternatives, in order of increasing implementation difficulty.
Specific to merges: every time a merge needs to be retried, its source will typically be further away from the destination. After all, some other concurrent merge advanced the destination, and if the user is not wasting too much effort then different merges do not perform the same changes.
So writing the merged metarange requires more effort on every retry. This is the opposite of fair! Whenever a merge retries, it takes longer to write its new metarange, while newer merges can start from a closer source and have less work to do.
Rather than abandon the metarange from the failed attempt, we could reuse it: retry the merge with the previous attempt's merged metarange as the source, so that only the changes that landed on the destination since that attempt need to be merged in.
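To make the reuse idea concrete, here is a minimal Go sketch. `Merger` and `HeadUpdater` are hypothetical stand-ins for lakeFS internals, and the three-way merge base is omitted for brevity; the point is only that a retry merges from the previous attempt's metarange rather than restarting from the original source.

```go
package mergeretry

import "context"

// MetarangeID identifies a written metarange.
type MetarangeID string

// Merger writes a metarange combining ours and theirs and returns its ID
// (hypothetical stand-in; a real merge is three-way, against a base).
type Merger func(ctx context.Context, ours, theirs MetarangeID) (MetarangeID, error)

// HeadUpdater atomically points the branch HEAD at next if it still equals
// expected; it returns the current HEAD and whether the swap succeeded.
type HeadUpdater func(ctx context.Context, expected, next MetarangeID) (MetarangeID, bool, error)

// MergeWithReuse retries a merge, but instead of abandoning the metarange
// written by a failed attempt, it uses that metarange as "ours" on the next
// attempt. Each retry therefore folds in only the destination changes that
// landed since the previous attempt, keeping retries roughly constant cost
// instead of growing.
func MergeWithReuse(ctx context.Context, merge Merger, updateHead HeadUpdater, source, destHead MetarangeID) (MetarangeID, error) {
	ours := source
	for {
		merged, err := merge(ctx, ours, destHead)
		if err != nil {
			return "", err
		}
		head, ok, err := updateHead(ctx, destHead, merged)
		if err != nil {
			return "", err
		}
		if ok {
			return merged, nil
		}
		// Lost the race: keep our merged metarange and merge in only the
		// destination changes that landed since the last attempt.
		ours, destHead = merged, head
	}
}
```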
We can additionally require some fairness. For instance, we might roughly queue branch updates. This would cap the extra time spent by the updates that currently take longest; I would expect it to improve p90 or p95 latency more than the average time to wait.
This includes the proposed experiment and more. So it is more work.
Clearly doing this is harder than merely reducing contention. As an example, see the difference in implementation complexity between a mutex and a fair mutex.
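For a concrete sense of that gap: Go's `sync.Mutex` makes no ordering promises, while a fair lock must track its waiters explicitly. A toy FIFO handoff lock might look like this (illustrative only, not lakeFS code):

```go
package fairlock

import "sync"

// FairMutex grants the lock to waiters in strict arrival order.
type FairMutex struct {
	mu      sync.Mutex
	locked  bool
	waiters []chan struct{} // FIFO queue of blocked goroutines
}

func (f *FairMutex) Lock() {
	f.mu.Lock()
	if !f.locked {
		f.locked = true
		f.mu.Unlock()
		return
	}
	ch := make(chan struct{})
	f.waiters = append(f.waiters, ch)
	f.mu.Unlock()
	<-ch // block until Unlock hands the lock to us
}

func (f *FairMutex) Unlock() {
	f.mu.Lock()
	defer f.mu.Unlock()
	if len(f.waiters) == 0 {
		f.locked = false
		return
	}
	// Hand off directly to the oldest waiter; the lock stays held.
	ch := f.waiters[0]
	f.waiters = f.waiters[1:]
	close(ch)
}
```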
Fairness implies some collection of pending branch update operations, typically a queue or something with similar characteristics. Rather than merely blocking one another, operations advance through the queue. In this version, branch updates are queued up; workers scan these queues and perform the operations queued on each branch, in order. Contention is reduced because a worker can hold onto a branch for a long time and perform multiple operations without blocking anything. The web service threads only enqueue work and wait for it to be done.
This version has very convenient operational characteristics. Only branch update workers race to handle a branch, and it does not matter which one wins. Once a worker starts handling a branch it can keep going until it runs out of work or decides to handle another branch, so fairness is simple. Adding branch update workers causes some contention on the queues, but not on the operations.
This includes the proposed experiment and more. So it is more work.
We need an additional queuing system, and possibly a mailbox so the operation can block until a worker handles its request.
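A minimal in-process sketch of this design, assuming one worker goroutine per branch and a Go channel as the mailbox (a real deployment would need a shared, multi-process queue):

```go
package branchqueue

import "context"

// Op is one queued branch update.
type Op struct {
	Apply func(ctx context.Context) error // the branch update itself
	Done  chan error                      // mailbox: the enqueuer blocks on this
}

// BranchQueue serializes all updates to a single branch through one worker.
type BranchQueue struct {
	ops chan *Op
}

func NewBranchQueue(ctx context.Context) *BranchQueue {
	q := &BranchQueue{ops: make(chan *Op, 128)}
	go q.worker(ctx)
	return q
}

// Submit enqueues op and waits for the worker to run it.
func (q *BranchQueue) Submit(ctx context.Context, op *Op) error {
	op.Done = make(chan error, 1) // buffered: worker never blocks on reply
	select {
	case q.ops <- op:
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case err := <-op.Done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// worker performs queued operations one at a time, in arrival order:
// fairness falls out of the FIFO channel, and the worker keeps the branch
// "hot" across many operations without anyone else contending for it.
func (q *BranchQueue) worker(ctx context.Context) {
	for {
		select {
		case op := <-q.ops:
			op.Done <- op.Apply(ctx)
		case <-ctx.Done():
			return
		}
	}
}
```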
What
Under heavy load, lakeFS merges to the same branch slow each other down and can fail. One manifestation is merges and other commit operations failing as "Locked".
Experiment with reducing contention on branch HEADs to improve performance and reduce error rates.
Branch updates
Any operation that changes the branch HEAD is an update. These include merges, commits, reverts, and cherry-picks. The KV implementation of safe branch updates is actually simple:

1. Read the branch HEAD (and its version) from the KV store.
2. Perform the update, writing any new ranges and metaranges to the backing store.
3. Write the new HEAD conditionally, succeeding only if the HEAD has not changed since step 1; otherwise fail the operation as "Locked".
Doing this ensures that the update is correct: it explicitly serializes all updates to the branch HEAD. However, it also means that if more than one thread tries to update the same branch concurrently, only one will succeed, and the work of all the other threads is wasted. That waste includes the resources these operations consume: access to the KV store and the backing store, local disk IOPS, and network traffic.
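A sketch of this loop in Go, with simplified stand-ins for the lakeFS KV interface; `SetIf` is assumed to be an atomic compare-and-swap on the branch key:

```go
package branchupdate

import (
	"context"
	"errors"
)

var ErrLocked = errors.New("branch is locked: a concurrent update won the race")

// Branch is a simplified branch record.
type Branch struct {
	CommitID string
}

// KV is a minimal key-value interface with a conditional write.
type KV interface {
	Get(ctx context.Context, key string) (Branch, int64, error) // value + version
	SetIf(ctx context.Context, key string, value Branch, ifVersion int64) error
}

// UpdateBranch reads the branch HEAD, applies update (which may write new
// ranges and metaranges), and writes the new HEAD only if nobody else
// changed it in the meantime. Exactly one of N concurrent callers succeeds;
// the rest get ErrLocked and all their preparatory work is wasted.
func UpdateBranch(ctx context.Context, kv KV, key string, update func(Branch) (Branch, error)) error {
	old, version, err := kv.Get(ctx, key)
	if err != nil {
		return err
	}
	updated, err := update(old) // expensive: writes metaranges to the backing store
	if err != nil {
		return err
	}
	if err := kv.SetIf(ctx, key, updated, version); err != nil {
		// A real implementation would distinguish predicate failures from
		// other KV errors; here any SetIf failure is treated as contention.
		return ErrLocked
	}
	return nil
}
```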
Reducing contention
Large-scale concurrent updates are mostly created by automated processes (humans cannot create a large number of concurrent operations), and therefore it is reasonable to expect most of them to succeed. This experiment is to reduce contention and measure by how much doing so improves merge performance.
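The issue leaves the exact mechanism open. Purely as an illustration, one cheap and standard way to reduce contention is to retry the conditional HEAD update with randomized exponential backoff, so concurrent updaters stop colliding at the same instant; the error value and timing constants below are assumptions, not lakeFS code.

```go
package contention

import (
	"context"
	"errors"
	"math/rand"
	"time"
)

var ErrLocked = errors.New("Locked")

// RetryLocked retries fn on "Locked" failures with randomized exponential
// backoff. Full jitter spreads the retries of concurrent updaters apart so
// they stop colliding on the same branch HEAD at the same moment.
func RetryLocked(ctx context.Context, attempts int, fn func(context.Context) error) error {
	backoff := 10 * time.Millisecond
	for i := 0; i < attempts; i++ {
		err := fn(ctx)
		if !errors.Is(err, ErrLocked) {
			return err // success, or a non-contention failure
		}
		sleep := time.Duration(rand.Int63n(int64(backoff))) // full jitter in [0, backoff)
		select {
		case <-time.After(sleep):
		case <-ctx.Done():
			return ctx.Err()
		}
		backoff *= 2
	}
	return ErrLocked
}
```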