apache / incubator-graphar

An open source, standard data file format for graph data storage and retrieval.
https://graphar.apache.org/
Apache License 2.0

[Feat] Support multi-labels for a single vertex/edge #96

Open acezen opened 1 year ago

acezen commented 1 year ago

Is your feature request related to a problem? Please describe.

In graph databases such as Neo4j or NebulaGraph, a single vertex or edge can have multiple labels. For example, a vertex in a Neo4j graph could be labeled as a person as well as a student, so it has two labels: person and student. Currently, in GraphAr, a vertex or an edge can have only one label. GraphAr needs to support multiple labels to align with Neo4j and NebulaGraph.

Describe the solution you'd like

More detail: using Parquet as an example, we can store each label as a separate column and use run-length encoding (RLE) for the label columns. To check whether a vertex has the label person, just check whether its value in the person column is 0 or not. This makes filtering vertices by a specific label convenient and fast.
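To make the proposal concrete, here is a minimal pure-Python sketch of the idea: one 0/1 column per label, run-length encoded, with a filter that scans runs instead of individual rows. It does not read or write Parquet, and it is not GraphAr code. The sample data, the function names, and the convention that 1 means "has the label" are all assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical sample data: the label set of each vertex, indexed by vertex id.
vertices = [
    {"person", "student"},
    {"person"},
    {"person"},
    {"company"},
]

def label_columns(vertex_labels):
    """Build one 0/1 column per label; row i is 1 if vertex i has the label.

    Assumption: 1 means "has the label", 0 means "does not".
    """
    all_labels = sorted(set().union(*vertex_labels))
    return {
        label: [1 if label in labels else 0 for labels in vertex_labels]
        for label in all_labels
    }

def rle_encode(column):
    """Run-length encode a column into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def vertices_with_label(runs):
    """Return the ids of vertices with a non-zero value, skipping whole zero runs."""
    ids, start = [], 0
    for value, length in runs:
        if value != 0:
            ids.extend(range(start, start + length))
        start += length
    return ids

columns = label_columns(vertices)
encoded = {label: rle_encode(col) for label, col in columns.items()}

# Filtering by label only touches that label's column.
person_ids = vertices_with_label(encoded["person"])
```

Because vertices with the same label tend to sit next to each other in long runs, each column compresses well, and a label filter never has to look at the other label columns.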

Describe alternatives you've considered

Storing a label list (a complex Array type) as a property on vertices/edges.
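For comparison, a rough sketch of this alternative: each vertex carries its labels as a list-valued property, so filtering by one label has to inspect every vertex's list. The row layout and the `filter_by_label` helper are hypothetical and only illustrate the access pattern.

```python
# Hypothetical sample data: labels stored as a list property on each vertex.
vertex_rows = [
    {"id": 0, "labels": ["person", "student"]},
    {"id": 1, "labels": ["person"]},
    {"id": 2, "labels": ["person"]},
    {"id": 3, "labels": ["company"]},
]

def filter_by_label(rows, label):
    """Return ids of vertices whose label list contains `label`.

    Every row's list is read and searched, regardless of which label is asked for.
    """
    return [row["id"] for row in rows if label in row["labels"]]

person_ids = filter_by_label(vertex_rows, "person")
```

The result is the same, but the whole list column must be decoded for any label query, which is the cost the column-per-label layout avoids.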

yixinglu commented 1 year ago

I'd prefer storing one label per column: scanning vertices/edges by a specific label should perform better that way.

lixueclaire commented 1 year ago

@freshyl @KateHed Could you help with this issue?

spmallette commented 4 months ago

We seem to be reaching a point where more and more graphs are supporting multiple labels. Amazon Neptune would be another one that has this feature. I think TinkerPop will likely support this feature in the future given that there are so many graphs that feature it.

Elssky commented 1 month ago

claim this task🙋🏻‍♂️