Open dlmarion opened 2 years ago
@keith-turner @cshannon - Is this still an issue after the recent Fate changes? Can multiple Fate transactions run concurrently on different parts of the table? I'm thinking the answer is yes with the new operation_id column in the tablet metadata in v4.0.
Yeah, I would say that the opid column is the table range lock, as it allows locking tablets for Fate operations. Some Fate operations still require locking the entire table (like delete table, clone, etc.), but there is no getting around that. I can let @keith-turner comment too, but I believe operations like split, merge, etc. that set that operation id should be able to work concurrently.
I looked into removing the ZooKeeper table locks and ran into two problems.
The split code in Accumulo 4 avoids using table locks, but that required adding complex code to coordinate with the table state stored in ZooKeeper.
The biggest advantage of giving table locks a range would be allowing merge to run on one part of a table while bulk imports and compactions continue to run on another part. But now that merge operations are much faster, this use case is not as compelling. Compactions and bulk imports get read locks on tables, so they can run concurrently with each other.
Looking through all of the table operations that get write locks, I found the following.
Merge seems to be the only operation that would benefit from a range on the table lock. Create, delete, clone, etc. would all lock the full range of the table if locks had a range. Since merge is faster now and does not need the ranges, maybe there is no good use case for this feature.
There is a caveat though. Even though merge is faster, once a merge is initiated it will wait for running bulk imports and table compactions to complete. While it's waiting, it will prevent new ones from starting, so it could disrupt throughput.
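To make the caveat concrete, here is a minimal single-threaded sketch of a FIFO read/write lock queue. It is a toy model, not Accumulo's lock implementation: the class `QueuedTableLock` and its methods are hypothetical names, and it assumes the table lock grants requests strictly in queue order, so a read request queued behind a pending write has to wait.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a FIFO read/write lock queue. Hypothetical, not Accumulo code. */
public class QueuedTableLock {
    public enum Kind { READ, WRITE }

    private static final class Entry {
        final long id;
        final Kind kind;
        final String op;

        Entry(long id, Kind kind, String op) {
            this.id = id;
            this.kind = kind;
            this.op = op;
        }
    }

    private final List<Entry> queue = new ArrayList<>();
    private long nextId = 0;

    /** Adds a lock request to the back of the queue and returns its ticket id. */
    public long request(Kind kind, String op) {
        long id = nextId++;
        queue.add(new Entry(id, kind, op));
        return id;
    }

    /** A READ holds the lock if no WRITE is queued ahead of it; a WRITE only at the head. */
    public boolean holdsLock(long id) {
        for (int i = 0; i < queue.size(); i++) {
            Entry e = queue.get(i);
            if (e.id == id) {
                return e.kind == Kind.READ || i == 0;
            }
            if (e.kind == Kind.WRITE) {
                return false; // a writer ahead in the queue blocks everything behind it
            }
        }
        return false; // unknown ticket
    }

    /** Removes a request from the queue, whether or not it currently holds the lock. */
    public void release(long id) {
        queue.removeIf(e -> e.id == id);
    }

    public static void main(String[] args) {
        QueuedTableLock lock = new QueuedTableLock();
        long bulk1 = lock.request(Kind.READ, "bulk import 1");
        long merge = lock.request(Kind.WRITE, "merge");
        long bulk2 = lock.request(Kind.READ, "bulk import 2");

        System.out.println(lock.holdsLock(bulk1)); // true: running bulk import keeps going
        System.out.println(lock.holdsLock(merge)); // false: merge waits for bulk import 1
        System.out.println(lock.holdsLock(bulk2)); // false: new bulk import waits behind merge

        lock.release(bulk1);
        System.out.println(lock.holdsLock(merge)); // true: merge can now run
        lock.release(merge);
        System.out.println(lock.holdsLock(bulk2)); // true: bulk import 2 finally starts
    }
}
```

In this model the merge itself may be fast, but the second bulk import is stalled for as long as the first bulk import takes to finish, which is the throughput disturbance described above.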
Is your feature request related to a problem?
The current table locks do not allow FaTE transactions operating on different parts of a table to run concurrently.
Describe the solution you'd like
Allow different FaTE transactions to operate on a table concurrently when they affect different ranges.
User feedback for this issue captured at https://github.com/apache/accumulo/pull/2467#discussion_r801868052
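As a rough illustration of the requested behavior, the sketch below shows an in-memory lock table that admits a transaction only if its row range does not overlap a range that is already locked. Everything here is hypothetical (`RangeLockTable`, `RowRange`, `tryLock`, the transaction ids); it assumes ranges of the form (start, end] with `null` meaning unbounded, and it ignores the read/write distinction and any ZooKeeper storage.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy range-aware table lock. Hypothetical, not an Accumulo API. */
public class RangeLockTable {

    /** Row range (startExclusive, endInclusive]; null means unbounded on that side. */
    public static final class RowRange {
        final String startExclusive;
        final String endInclusive;

        public RowRange(String startExclusive, String endInclusive) {
            this.startExclusive = startExclusive;
            this.endInclusive = endInclusive;
        }

        /** (a, b] and (c, d] overlap exactly when a < d and c < b. */
        boolean overlaps(RowRange other) {
            return startsBeforeEnd(this.startExclusive, other.endInclusive)
                && startsBeforeEnd(other.startExclusive, this.endInclusive);
        }

        private static boolean startsBeforeEnd(String start, String end) {
            return start == null || end == null || start.compareTo(end) < 0;
        }
    }

    private final Map<String, RowRange> held = new HashMap<>();

    /** Grants the lock only if the range overlaps no range that is already held. */
    public synchronized boolean tryLock(String txId, RowRange range) {
        for (RowRange r : held.values()) {
            if (r.overlaps(range)) {
                return false;
            }
        }
        held.put(txId, range);
        return true;
    }

    /** Releases whatever range the given transaction holds, if any. */
    public synchronized void unlock(String txId) {
        held.remove(txId);
    }

    public static void main(String[] args) {
        RangeLockTable table = new RangeLockTable();
        // Merge on the first half and bulk import on the second half do not overlap.
        System.out.println(table.tryLock("merge", new RowRange(null, "m")));  // true
        System.out.println(table.tryLock("bulk", new RowRange("m", null)));   // true
        // Delete table needs the full range, so it conflicts with both.
        System.out.println(table.tryLock("delete", new RowRange(null, null))); // false
        table.unlock("merge");
        // Once merge is done, a compaction inside the first half can proceed.
        System.out.println(table.tryLock("compact", new RowRange("a", "g"))); // true
    }
}
```

Operations such as create, delete, and clone would always request the unbounded range in a scheme like this, so only operations confined to part of the table would gain any concurrency.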