Closed yuqi1129 closed 10 months ago
Overall Project | 59.1% -0.29% |
:green_circle: |
---|---|---|
Files changed | 12.77% | :red_circle: |
Module | Coverage | |
---|---|---|
core | 66.18% -0.93% |
:red_circle: |
@jerryshao Please help to give your advice about this PR. Only when everything is confirmed can we continue on this issue. Related problems that puzzle me have been listed, Your suggestion is helpful and valuable.
@jerryshao Please take time to see if this PR has any problem. I would add a new PR that depends on this PR
As I mentioned before, I would suggest you give an overall interface design before I can tell whether it is good or not. Currently, what I can see is only the id name mapping service and id generator. I would like to see how you design EntityStore and BinaryIdentifer interfaces.
You don't have to achieve the logic, all I want to see is the code organization and interface design.
As I mentioned before, I would suggest you give an overall interface design before I can tell whether it is good or not. Currently, what I can see is only the id name mapping service and id generator. I would like to see how you design EntityStore and BinaryIdentifer interfaces.
You don't have to achieve the logic, all I want to see is the code organization and interface design.
I see.
As I mentioned before, I would suggest you give an overall interface design before I can tell whether it is good or not. Currently, what I can see is only the id name mapping service and id generator. I would like to see how you design EntityStore and BinaryIdentifer interfaces.
You don't have to achieve the logic, all I want to see is the code organization and interface design.
I have checked the code several times, Refactoring id name mapping and id generator is indeed necessary and important. As referred to EntityStore
, The problem you care about is that some information has been passed to the EntityStore
which makes the interface messy and not very elegant. I have some ideas as follows:
EnityStore
is a generic interface and should not be associated with a specific storage kind like key-value store. So, I would not advise changing EnityStore
or HasIdentifier
directly to adapt to key-value store. What's why I'm unwilling to add a binaryIdentierName
method in HasIdentifier
;
Redundant messages like EntityType
, In my opinion, we can get rid of this dependency by modifying the key-value encode plan. Currently, The key of entities is encoded as the following format:
ml_{metalake_id}
ca_{metalake_id}_{catalog_id}
sc_{metalake_id}_{catalog_id}_{scheam_id}
sc_{metalake_id}_{catalog_id}_{scheam_id}_{table_id}
If we can assure that different entities will exactly have different name identifiers (namespace and name), we can freely remove the prefix that represents the type of entities. And we can adjust the key like:
1_{metalake_id}
2_{metalake_id}_{catalog_id}
3_{metalake_id}_{catalog_id}_{scheam_id}
4_{metalake_id}_{catalog_id}_{scheam_id}_{table_id}
Value 1
, 2
and other numeric values are the length of the name identifier. This kind of key-encoding solution can support point and range queries. In this way, we would not rely on EntityType
for key encodings.
Moreover, I think class type information passed to interface EntityStore
is also redundant if we view EntityType
as redundancy
Please help to evaluate the feasibility of this change. If this is OK, I would work on this problem soon;
EntityStore
seems to be tedious and not extensible. Adding an intermediate layer between EntityStore
and KvBackend
is weird. @jerryshao
Let me clarify the current requirements, basically the current implementation mixed several things together, let's break down the things:
Please think of breaking things into these components and to see how to organize the classes.
Let me clarify the current requirements, basically the current implementation mixed several things together, let's break down the things:
- Id generator. This is used to generate unique ID for every entity.
- Name-id relation maintainer. This component is used to maintain the relationship between id and name,and should be transactional.
- NameIdentifer/Namespace to encoded key mapping component. This component is to map the name identifier to binaries and vice versa (if required).
- kv store. kv store should support to use the encoded key to put/get entities from the kv storage. kv store can be embedded in the entity store as one implementation.
Please think of breaking things into these components and to see how to organize the classes.
Got it
Let me clarify the current requirements, basically the current implementation mixed several things together, let's break down the things:
- Id generator. This is used to generate unique ID for every entity.
- Name-id relation maintainer. This component is used to maintain the relationship between id and name,and should be transactional.
- NameIdentifer/Namespace to encoded key mapping component. This component is to map the name identifier to binaries and vice versa (if required).
- kv store. kv store should support to use the encoded key to put/get entities from the kv storage. kv store can be embedded in the entity store as one implementation.
Please think of breaking things into these components and to see how to organize the classes.
Hi, This PR contains the first 2 of the 4 points you mentioned above, and one already exists before and do not need big changes about it. For the last one, After the first three issues have been resolved, All we need to do is just assemble them together.
Please view it again and If you have any thoughts or ideas, please let me know. Thank you
I'm thinking that IdGenerator
and NameMappingService
are not specific to storage package, is it better to move package util
? This is a minor issue, others look good to me. What's your opinion?
I'm thinking that
IdGenerator
andNameMappingService
are not specific to storage package, is it better to move packageutil
? This is a minor issue, others look good to me. What's your opinion?
It seems to be very weird for me to put interface in package util
as package util
mostly contains utility class(class that holds static functions generally). So I do not see this point is reasonable. Maybe we can move to other packages or introduce a new one.
package util
is just an example, my point is that these two interfaces and implementations are not storage specific, what do you think?
package
util
is just an example, my point is that these two interfaces and implementations are not storage specific, what do you think?
These two Interfaces indeed are storage independent and can be moved to another package. But related implementation may be involved in storage. E.g. We would use a key-value backend to maintain name mappings say KvNameMappingService
Alright, let's put them here for now.
What changes were proposed in this pull request?
Add new auto id generator and id to name mappings interfaces. for more please refer to #228
Why are the changes needed?
Fix: #228
Does this PR introduce any user-facing change?
No
How was this patch tested?
No, No real change to the real code path.