KnightKingWalk / KnightKing

A general-purpose, distributed graph random walk engine.
Other
110 stars 30 forks source link

Gemini evaluation code #9

Closed lushl9301 closed 4 years ago

lushl9301 commented 4 years ago

Hi, will you opensource / share the code you modified in Gemini for reference?

Thanks

ykwd commented 4 years ago

Sorry it's not available now.

As I remembered, the major modification on Gemini's core/graph.hpp includes:

1). Force to use sparse mode. The dense mode doesn't work because random walk messages cannot be combined. 2). Change the size of message buffer, because the local walkers may be more than local vertices.

On top of it is the random walk logic. As the edges of one vertex in gemini are scattered at different nodes, so two layer sample is necessary, i.e., first sample which node to walk, then sample a certain edge on that node.

A tricky part is the implementation of node2vec. A naive implementation would incur huge amount of network bandwidth and local memory consumption to compute the transition probability, if the adjacency list of the previous vertex needs to be pulled from other nodes. But thanks to Gemini's edge placement policy, where each undirected edge is stored 4 times on 2 different nodes, there is a much more efficient implementation method: For a current vertex v, previous vertex u, and an edge (v, x), if there is an undirected edge between u and x, then that edge must can be found at the same node where (v, x) is, so there is no need to copy entire adjacency lists between nodes.

lushl9301 commented 4 years ago

Thanks for the detailed reply.

I was trying to use Gemini for another algorithm, sort of like Personalized pagerank. i was wondering if I could take your implementations as reference.

Yes. your approaches explained here sounds valid to me.. Thanks for that :)

Best Regards

ykwd commented 4 years ago

You are welcome!