apache / incubator-hugegraph

A graph database that supports more than 100+ billion data, high performance and scalability (Include OLTP Engine & REST-API & Backends)
https://hugegraph.apache.org
Apache License 2.0
2.62k stars 517 forks source link

[Feature] The gaph in-memory projection and and its corresponding algorithm #2359

Open rewangz opened 9 months ago

rewangz commented 9 months ago

Feature Description (功能描述)

问题描述

我司决定采用HugeGraph作为数据库,并在使用过程中需要频繁调用s-t最短路径,但是我发现最短路径算法性能在大图中很慢几乎无法使用。为了优化算法,我参考了Neo4j GDS,并在内存中实现了针对s-t边的持久化投影以重新实现了源到目标的迪杰斯特拉算法,同时还进行了一些基本的优化,包括优化的优先队列等。这些改进使得投影后的源到目标迪杰斯特拉算法的执行时间稳定在毫秒级。

我希望将我的代码贡献到HugeGraph中,以便更多用户受益于这些性能优化。

目前的问题

目前,我的代码(包括图的投影和基于投影的迪杰斯特拉算法)都存放在 hugegraph-core/src/main/java/org/apache/hugegraph/traversal 目录下。然而,我认为这不够合理,至少对于图的投影而言,因为可以优化更多的算法,应该将其放置在一个更大的包中。

提议

我希望将我的代码重新组织,并将图的投影放置在一个更合适的大包中。然而,我对于在HugeGraph中的代码组织结构并不了解,所以我需要一些建议。

具体来说,我想知道:

  1. 在HugeGraph中,应该将图在内存中持久化投影的代码放在哪个目录?
  2. 是否有任何特定的代码组织规范或命名约定我需要遵循?
  3. 是否需要对HugeGraph的核心代码进行修改以适应我的贡献? 感谢您的指导和建议,我期待将这些性能优化的代码成功融入HugeGraph中。
imbajin commented 9 months ago

Thank you for your willingness to use and contribute. The community welcomes everyone to join and provides relevant contextual references

Some refer:

rewangz commented 9 months ago

Thank you for your willingness to use and contribute. The community welcomes everyone to join and provides relevant contextual references

Some refer:

似乎computer是基于分布式实现的(这会使更新每个被扩展的节点),而s-t最短路径一般情况下并不会实际上扩展如此多的节点。而分布式对于实现全部节点的单源最短路径更加适合(neo4j的gds中s-t也只有单线程实现,并且有非常好的效果)。我的方法是优化数据结构来扩展速度而并非分布式计算来加速,而computer是基于分布式框架的,因此我想知道我应该如何选择:

  1. 我应该在computer中实现单机算法
  2. 将我到代码提交到server
  3. 利用数据结构优化computer已经提交的分布式算法(https://github.com/apache/incubator-hugegraph-computer/issues/283

这也是我这几天一直在纠结的问题。

simon824 commented 9 months ago

Hi @rewangz , I think you can first commit your code according to your own mind. During the code review, the reviewers will provide more reasonable suggestions and then you can improve it based on that feedback. It will be more convenient for everyone. As for the computer side, you can optimize it after completing this server pr if you are interested in it.

By the way, it is recommended to communicate in English in Apache projects and thank you for your contribution.

simon824 commented 9 months ago

Hi @rewangz , how's it going? if u need any help, just contact me via WeChat: shirning

ZeeJJ123 commented 5 months ago

@dosu-bot Could you please share the location of the file containing the shortest path algorithm and walk me through its implementation?