Open jerryshao opened 3 months ago
CC @xloya @yuqi1129, please take a look when you have time.
Changing everything at once is a lot of work, so we can split it into small tasks and iterate on them one by one. Currently there is no functional issue here, since we use locks to guard against concurrent access.
I have no clear answer for now. I think we can run some tests to see whether transactions are OK or not. For JOIN, though, I think we can already make some changes. For example, a delete operation can use a JOIN and check the return value. You can check my code in #4019, where I use JOIN to handle some cases. @xloya
What would you like to be improved?
In the current implementation of xxxMetaMapper and xxxMetaService, we seldom use JOIN and transactions, for example:
The list catalog operation issues two SQL queries: the first gets the metalake id, and the second uses this id to get the catalog list.
Instead of issuing two queries, we can JOIN the catalog and metalake tables to get the catalog list in one query. This saves a round trip to the database and avoids the inconsistency that can arise when the data changes between the two queries.
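As a rough illustration of the difference, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for the real backend. The table and column names (metalake_meta, catalog_meta, metalake_id, and so on) and the function names are assumptions made for this example, not the project's actual schema or mapper code.

```python
import sqlite3

# Hypothetical, simplified schema; the real tables are assumed to have more columns.
SCHEMA = """
CREATE TABLE metalake_meta (
    metalake_id   INTEGER PRIMARY KEY,
    metalake_name TEXT UNIQUE
);
CREATE TABLE catalog_meta (
    catalog_id   INTEGER PRIMARY KEY,
    catalog_name TEXT,
    metalake_id  INTEGER
);
"""


def list_catalogs_two_queries(conn, metalake_name):
    """Current style: look up the metalake id, then list catalogs by that id."""
    row = conn.execute(
        "SELECT metalake_id FROM metalake_meta WHERE metalake_name = ?",
        (metalake_name,),
    ).fetchone()
    if row is None:
        return []
    # Another writer could change the tables between the two statements.
    rows = conn.execute(
        "SELECT catalog_name FROM catalog_meta "
        "WHERE metalake_id = ? ORDER BY catalog_name",
        (row[0],),
    ).fetchall()
    return [r[0] for r in rows]


def list_catalogs_join(conn, metalake_name):
    """Proposed style: one statement that joins catalog and metalake tables."""
    rows = conn.execute(
        """
        SELECT c.catalog_name
        FROM catalog_meta c
        JOIN metalake_meta m ON c.metalake_id = m.metalake_id
        WHERE m.metalake_name = ?
        ORDER BY c.catalog_name
        """,
        (metalake_name,),
    ).fetchall()
    return [r[0] for r in rows]


def make_demo_db():
    """Build a small in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO metalake_meta VALUES (?, ?)",
        [(1, "lake_a"), (2, "lake_b")],
    )
    conn.executemany(
        "INSERT INTO catalog_meta VALUES (?, ?, ?)",
        [(10, "hive", 1), (11, "iceberg", 1), (12, "jdbc", 2)],
    )
    conn.commit()
    return conn
```

One thing to keep in mind with the JOIN form: an unknown metalake and a metalake with no catalogs both return an empty list, so if the caller needs to tell them apart, that check has to be handled separately.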
Also, for updateCatalog, we have several steps:
These 4 steps do not run in a transaction, which can lead to inconsistent state, and we rely heavily on locks to avoid concurrency problems. A better solution is to put these 4 steps into a single transaction.
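To make the transaction idea concrete, here is a minimal sketch, again using Python's built-in sqlite3 as a stand-in. It does not reproduce the actual four steps of updateCatalog; the lookup, version bump, and rename below, along with the schema and all names, are hypothetical and only show how several statements can commit or roll back together.

```python
import sqlite3

# Hypothetical, simplified schema used only for this sketch.
TXN_SCHEMA = """
CREATE TABLE metalake_meta (
    metalake_id   INTEGER PRIMARY KEY,
    metalake_name TEXT UNIQUE
);
CREATE TABLE catalog_meta (
    catalog_id   INTEGER PRIMARY KEY,
    catalog_name TEXT,
    metalake_id  INTEGER,
    version      INTEGER DEFAULT 1,
    UNIQUE (metalake_id, catalog_name)
);
"""


def make_txn_demo_db():
    """In-memory database; isolation_level=None means we control transactions."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(TXN_SCHEMA)
    conn.execute("INSERT INTO metalake_meta VALUES (1, 'lake_a')")
    conn.executemany(
        "INSERT INTO catalog_meta (catalog_id, catalog_name, metalake_id) "
        "VALUES (?, ?, ?)",
        [(10, "hive", 1), (11, "iceberg", 1)],
    )
    return conn


def rename_catalog(conn, metalake_name, old_name, new_name):
    """Run the read and both writes in one transaction (illustrative steps)."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Illustrative step: resolve the catalog id with a JOIN.
        row = conn.execute(
            """
            SELECT c.catalog_id
            FROM catalog_meta c
            JOIN metalake_meta m ON c.metalake_id = m.metalake_id
            WHERE m.metalake_name = ? AND c.catalog_name = ?
            """,
            (metalake_name, old_name),
        ).fetchone()
        if row is None:
            raise LookupError(f"catalog {old_name!r} not found in {metalake_name!r}")
        catalog_id = row[0]
        # Illustrative step: bump the version.
        conn.execute(
            "UPDATE catalog_meta SET version = version + 1 WHERE catalog_id = ?",
            (catalog_id,),
        )
        # Illustrative step: rename; fails if the new name already exists.
        conn.execute(
            "UPDATE catalog_meta SET catalog_name = ? WHERE catalog_id = ?",
            (new_name, catalog_id),
        )
        conn.execute("COMMIT")
    except Exception:
        # Any failure undoes every write made since BEGIN.
        conn.execute("ROLLBACK")
        raise
```

If the rename hits the unique constraint, the earlier version bump is rolled back as well, so readers never observe a half-applied update. Whether this can replace the existing locks depends on the isolation level of the real backend, which this sketch does not address.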
How should we improve?
So basically, we can improve the current SQL to: