vesoft-inc / nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability
https://nebula-graph.io
Apache License 2.0
10.82k stars 1.2k forks

Support in/out degree on vertices #193

Open sherman-the-tank opened 5 years ago

sherman-the-tank commented 5 years ago

On each vertex, we need to record in/out degree for each edge type

ayyt commented 5 years ago

Is anyone working on this issue?
If not, I would like to take it on after finishing the TTL feature.

sherman-the-tank commented 5 years ago

Sure. We want to include this in the beta release.

@dangleptr and I had an offline discussion about this. I think we are pretty clear on how to achieve this. Please talk to me or @dangleptr.

ayyt commented 5 years ago

👍 I will get in touch with you or @dangleptr in the next few days, thanks.

ayyt commented 5 years ago

@sherman-the-tank Yes, after talking with you offline, I understand the principle of the implementation.

There is one question I need to confirm with you. The in-degree and out-degree of each vertex are to be maintained in real time. When a large amount of data is being loaded, will performance degrade significantly?
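To make the concern concrete, here is a rough cost model for a bulk load. It is only a sketch under stated assumptions: every edge insert costs `writes_per_edge` key-value writes on its own, and maintaining degrees in real time adds one read-modify-write on the source vertex's out-degree counter and one on the destination vertex's in-degree counter. The function name and parameters are hypothetical and do not come from the Nebula code base.

```python
def kv_ops_for_load(num_edges: int, maintain_degrees: bool, writes_per_edge: int = 1) -> dict:
    """Count key-value reads and writes needed to load `num_edges` edges.

    Assumption: without degree maintenance, an edge insert is a blind write.
    With degree maintenance, each edge also updates two counters, and each
    counter update is a read followed by a write.
    """
    reads = 0
    writes = num_edges * writes_per_edge
    if maintain_degrees:
        counter_updates = 2 * num_edges  # out-degree of source + in-degree of destination
        reads += counter_updates
        writes += counter_updates
    return {"reads": reads, "writes": writes}


if __name__ == "__main__":
    print(kv_ops_for_load(1_000_000, maintain_degrees=False))
    print(kv_ops_for_load(1_000_000, maintain_degrees=True))
```

Under these assumptions the load goes from write-only to a mix of reads and writes, with three times as many writes when `writes_per_edge` is 1. The real overhead would depend on batching, caching, and how the storage engine handles counter updates.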

sherman-the-tank commented 5 years ago

@steppenwolfyuetong Very good question. Let's think through this tonight and have a chat tomorrow

ayyt commented 5 years ago

I am very much looking forward to the chat. I am available at any time.

ayyt commented 5 years ago

@sherman-the-tank I have two questions about the in-out degree statistics on vertices:

1) The storage layer stores the keys of the vertex data as follows: partId(int32)_vertexId(int64)_tagId(int32)_tagVersion(int64)

Why don't we append two int64 fields to the key to hold the in-degree and out-degree of the vertex? Wouldn't that save a lot of storage space compared with storing two separate key-value pairs per vertex for the in-degree and out-degree?

2) If the in-degree and out-degree of a vertex are stored separately, should they be stored in the storage layer alongside the vertex data, in the same part as that data? I came up with the following design, but I am not happy with it and would like your opinion:

Key | Value
partId(int32)__degrees__vertexId(int64)_edgeType(int32) | out_degree(int64)
partId(int32)__degrees__vertexId(int64)_edgeType(int32) | in_degree(int64)

The __degrees__ marker distinguishes in/out degree keys from vertex data keys.
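For readers following the two layouts in this comment, here is a minimal sketch of how they might be packed into bytes. It assumes big-endian fixed-width integers and treats the single underscores in the key descriptions as visual separators rather than stored bytes. The function names `vertex_key`, `degree_key`, and `degree_value` are hypothetical and are not taken from the Nebula code base.

```python
import struct

# Marker from the proposal, used to tell degree keys apart from vertex keys.
DEGREE_MARKER = b"__degrees__"


def vertex_key(part_id: int, vertex_id: int, tag_id: int, tag_version: int) -> bytes:
    """partId(int32) + vertexId(int64) + tagId(int32) + tagVersion(int64)."""
    return struct.pack(">iqiq", part_id, vertex_id, tag_id, tag_version)


def degree_key(part_id: int, vertex_id: int, edge_type: int) -> bytes:
    """partId(int32) + marker + vertexId(int64) + edgeType(int32)."""
    return (
        struct.pack(">i", part_id)
        + DEGREE_MARKER
        + struct.pack(">qi", vertex_id, edge_type)
    )


def degree_value(degree: int) -> bytes:
    """The counter itself is stored as a single int64 value."""
    return struct.pack(">q", degree)


if __name__ == "__main__":
    print(len(vertex_key(1, 100, 2, 0)))  # 4 + 8 + 4 + 8 bytes
    print(len(degree_key(1, 100, 3)))     # 4 + 11 + 8 + 4 bytes
```

With this encoding a vertex key is 24 bytes and a degree key is 27 bytes. If the two counters were instead appended to the vertex key, as in question 1, the key bytes would change on every degree update, so an exact-key lookup of the vertex would no longer work.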

ayyt commented 5 years ago

Already communicated offline.

whitewum commented 5 years ago

Already communicated offline.

Is there a summary of the discussion so the rest of us can catch up?

ayyt commented 5 years ago

Summarized as follows:

1) To keep the vertex key stable, the vertex's in-degree and out-degree are stored separately, each under its own key.

2) The keys for the vertex's in/out degree are currently designed as follows:

partid + vertexid + edgetype + “in”
partid + vertexid + edgetype + “out”

@whitewum That is the current design. I will build a demo first to test it.
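The summarized key design could be sketched roughly as follows. This is an illustration, not the demo mentioned in the comment: the integer widths are borrowed from the earlier vertex key description, the in-memory dict stands in for the storage engine, and the modulo partitioning in `part_of` is purely an assumption. The names `degree_key` and `DegreeStore` are hypothetical.

```python
import struct


def degree_key(part_id: int, vertex_id: int, edge_type: int, direction: str) -> bytes:
    """partid + vertexid + edgetype + "in" / "out"."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return struct.pack(">iqi", part_id, vertex_id, edge_type) + direction.encode("ascii")


class DegreeStore:
    """Toy in-memory stand-in for the storage layer (assumption, not Nebula's API)."""

    def __init__(self, num_parts: int) -> None:
        self.num_parts = num_parts
        self.kv: dict[bytes, int] = {}

    def part_of(self, vertex_id: int) -> int:
        # Hypothetical partitioning rule, used only for this sketch.
        return vertex_id % self.num_parts + 1

    def _bump(self, vertex_id: int, edge_type: int, direction: str) -> None:
        key = degree_key(self.part_of(vertex_id), vertex_id, edge_type, direction)
        self.kv[key] = self.kv.get(key, 0) + 1

    def add_edge(self, src: int, dst: int, edge_type: int) -> None:
        # One edge increments the source's out-degree and the destination's in-degree.
        self._bump(src, edge_type, "out")
        self._bump(dst, edge_type, "in")

    def degree(self, vertex_id: int, edge_type: int, direction: str) -> int:
        key = degree_key(self.part_of(vertex_id), vertex_id, edge_type, direction)
        return self.kv.get(key, 0)


if __name__ == "__main__":
    store = DegreeStore(num_parts=4)
    store.add_edge(1, 2, edge_type=7)
    store.add_edge(1, 3, edge_type=7)
    print(store.degree(1, 7, "out"), store.degree(2, 7, "in"))
```

Because the counters live under their own keys, the vertex key never changes when a degree changes. The cost is two extra counter updates per inserted edge.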

sherman-the-tank commented 5 years ago

After discussing with @dangleptr yesterday, we agreed that we might need to put the brakes on this issue. The current in/out degree design conflicts with the TTL feature (or at least adds a lot of complexity to it).