Closed: nanyoullm closed this issue 2 years ago
Hi, thanks for your interest in GraphLearn!
Here is an overview of GraphLearn's system modules.
The folders, with their corresponding modules and functions, are as follows.
Storage: `graphlearn/core/graph`, GraphLearn's distributed in-memory graph storage. `storage` is the local in-memory graph store, including `NodeStorage` and `EdgeStorage`, where `EdgeStorage` keeps edges as an adjacency table. `Graph` and `Noder` encapsulate the graph storage, and `GraphStore` is the access portal to all the `Graph`s and `Noder`s of the whole graph.
Operator: `graphlearn/core/operator`, GraphLearn's graph operator implementations, including graph traversal, sampling, negative sampling, graph loading, etc. Operator instances are created and managed through `OpFactory`. Operators access local and remote graph data through `GraphStore`'s interface; the input is an `OpRequest`, and the execution result is encapsulated as an `OpResponse`.
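The factory/request/response pattern above can be sketched in a few lines of Python. The registration decorator and `process` signature are assumptions for illustration; GraphLearn's actual operators are C++ classes.

```python
# Hypothetical sketch of the operator pattern: operators register with a
# factory, take an OpRequest, and wrap results in an OpResponse.
class OpRequest:
    def __init__(self, **params):
        self.params = params

class OpResponse:
    def __init__(self, values):
        self.values = values

class OpFactory:
    _ops = {}

    @classmethod
    def register(cls, name):
        def deco(op_cls):
            cls._ops[name] = op_cls
            return op_cls
        return deco

    @classmethod
    def create(cls, name):
        return cls._ops[name]()

@OpFactory.register("RandomSampler")
class RandomSamplerOp:
    def process(self, req, graph_store):
        # Look up neighbors in the store, truncate to the requested count.
        nbrs = graph_store.get(req.params["src"], [])
        return OpResponse(nbrs[: req.params["count"]])

toy_store = {"u1": ["i1", "i2", "i3"]}
op = OpFactory.create("RandomSampler")
resp = op.process(OpRequest(src="u1", count=2), toy_store)
print(resp.values)  # ['i1', 'i2']
```

Because every operator speaks the same `OpRequest`/`OpResponse` protocol, the runtime can schedule and ship them uniformly, whether the data is local or remote.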
Runner: `graphlearn/core/runner`, the distributed execution runtime of GraphLearn. `DagScheduler` schedules the operators of a graph in topological order, executing concurrently both across operators and across multiple sampling iterations. `DagNodeRunner` constructs the `Operator`, `OpRequest`, and `OpResponse`, and invokes operator execution.
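The "topological order" part of the scheduler can be shown with Kahn's algorithm. This sketch only demonstrates the ordering; the real `DagScheduler` also overlaps operators and iterations concurrently, which is omitted here.

```python
# Hypothetical sketch of DagScheduler's ordering: run a sampling DAG's
# operators in topological order (Kahn's algorithm).
from collections import defaultdict, deque

def topological_order(nodes, edges):
    """edges: list of (upstream, downstream) node-name pairs."""
    indegree = {n: 0 for n in nodes}
    downstream = defaultdict(list)
    for u, v in edges:
        downstream[u].append(v)
        indegree[v] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for v in downstream[n]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)  # all inputs ready: schedulable
    return order

# A tiny query plan: traversal feeds a sampler and a negative sampler,
# both of which feed the output sink.
plan_nodes = ["GetNodes", "Sample", "NegSample", "Sink"]
plan_edges = [("GetNodes", "Sample"), ("GetNodes", "NegSample"),
              ("Sample", "Sink"), ("NegSample", "Sink")]
print(topological_order(plan_nodes, plan_edges))
# ['GetNodes', 'Sample', 'NegSample', 'Sink']
```

Note that `Sample` and `NegSample` have no dependency on each other, which is exactly where the scheduler can run operators concurrently.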
Dag: `graphlearn/core/dag`, GraphLearn's graph sampling interface expressed through the Graph Sampling Language (GSL). A GSL query contains multiple sampling operators; `Dag` is the logical execution plan of the query, `DagNode` is a sampling operator, and `DagEdge` is the input-output relationship between operators.
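The query-to-plan mapping can be sketched as follows: each step of a fluent GSL query becomes a `DagNode`, and the data flow between steps becomes a `DagEdge`. The query string in the comment is only illustrative of GSL's style; check the GSL documentation for exact syntax.

```python
# Hypothetical sketch of how a GSL query maps to a Dag. Class names mirror
# DagNode/DagEdge but this is not GraphLearn's real (C++) implementation.
class DagNode:
    def __init__(self, op_name, params=None):
        self.op_name = op_name
        self.params = params or {}
        self.out_edges = []

class DagEdge:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst
        src.out_edges.append(self)

# Logical plan for a query shaped like: g.V("user").outV("buy").sample(...)
v = DagNode("GetNodes", {"type": "user"})
t = DagNode("Traverse", {"edge_type": "buy"})
s = DagNode("Sample", {"count": 5, "strategy": "random"})
DagEdge(v, t)
DagEdge(t, s)

# Walk the chain from the source node to show the plan.
node, names = v, []
while True:
    names.append(node.op_name)
    if not node.out_edges:
        break
    node = node.out_edges[0].dst
print(names)  # ['GetNodes', 'Traverse', 'Sample']
```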
Partitioner: `graphlearn/core/partition`, GraphLearn's distributed graph partitioning module. GraphLearn divides graph data across multiple GraphLearn Servers according to the strategy of the `Partitioner`. When the data referenced by an `OpRequest` is distributed across multiple GraphLearn Servers, the request is split by the `Partitioner`, executed by sending the pieces to those Servers, and the per-partition results are then merged by the `Stitcher`.
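The split-execute-merge flow can be sketched with a hash partitioner. The modulo placement and the slot bookkeeping are assumptions for illustration; they show why a `Stitcher` is needed to restore the caller's original ordering.

```python
# Hypothetical sketch of the Partitioner/Stitcher flow: split a request's
# ids across servers, "execute" each part, then stitch results back.
NUM_SERVERS = 2

def partition(ids):
    """Split ids into per-server sub-requests, remembering original slots."""
    parts = {s: [] for s in range(NUM_SERVERS)}
    slots = {s: [] for s in range(NUM_SERVERS)}
    for i, node_id in enumerate(ids):
        s = hash(node_id) % NUM_SERVERS  # placement strategy (assumed)
        parts[s].append(node_id)
        slots[s].append(i)
    return parts, slots

def stitch(responses, slots, total):
    """Merge per-server responses back into one ordered response."""
    out = [None] * total
    for s, values in responses.items():
        for slot, v in zip(slots[s], values):
            out[slot] = v
    return out

ids = ["u1", "u2", "u3", "u4"]
parts, slots = partition(ids)
# Pretend each server returns the length of each of its local ids.
responses = {s: [len(i) for i in parts[s]] for s in parts}
print(stitch(responses, slots, len(ids)))  # [2, 2, 2, 2]
```

The key point is that partitioning is invisible to the caller: the stitched response looks exactly as if one server had answered the whole request.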
Service: `graphlearn/service`, RPC access between GraphLearn Servers. `Service` is the RPC base class of the Protobuf framework, and `ServiceImpl` is its subclass, containing a number of concrete `Handler` implementations; the Server side is started through gRPC. `Executor` is the execution body that responds to a request; through it, the Service framework is isolated from the functional modules. `ChannelManager` is responsible for the creation and management of each `Channel`. The Client is the initiator of a request: `InMemoryClient` initiates local requests and `RpcClient` initiates remote requests. `Tensor` is GraphLearn's data structure for storage and transmission; its underlying implementation is protobuf.
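To show the local-versus-remote client split, here is a minimal sketch in which both clients issue the same request, one directly in-process and one through a "channel". The channel is reduced to a dict lookup; real GraphLearn uses gRPC channels managed by `ChannelManager`, and none of these Python signatures are the library's real API.

```python
# Hypothetical sketch of the client side: InMemoryClient runs the executor
# in-process, RpcClient goes through a channel to a "remote" executor.
class InMemoryClient:
    """Runs the executor directly in the local process."""
    def __init__(self, executor):
        self.executor = executor

    def run(self, request):
        return self.executor(request)

class RpcClient:
    """Sends the request over a channel to a remote executor."""
    def __init__(self, channel_manager, server_id):
        self.channel = channel_manager[server_id]  # stand-in for a gRPC channel

    def run(self, request):
        return self.channel(request)  # pretend this crosses the network

def executor(request):
    # The Executor isolates the service framework from functional modules:
    # it is the only thing that knows how to actually handle the request.
    return {"echo": request}

channels = {0: executor}  # ChannelManager stand-in: server_id -> channel
local = InMemoryClient(executor)
remote = RpcClient(channels, 0)
print(local.run("sample"), remote.run("sample"))
```

Because both clients expose the same `run` interface, the caller does not need to care whether the target data lives on this server or another one.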
I'd like to study the source code; where should I start? 1. I've been trying to map the storage, sampling, and operator parts of the paper to the code under `core`, but I still can't connect them. 2. I also don't know how to approach the distributed parts, e.g. how the storage module stores graph data in a distributed environment. The official docs feel hard to relate to the paper, and for the distributed side I've only found one k8s training example. Can anyone offer some guidance on studying the source code?
Happy to discuss~