Open mfbalin opened 4 months ago
@frozenbugs Do you think we can split the different etypes into separate sampled CSCs inside the DGL blocks conversion function, so that our sampling code does not need any loops over etypes?
We can use a batched representation overall throughout sampling code and only convert into dictionaries when absolutely needed.
🔨Work Item
IMPORTANT:
Project tracker: https://github.com/orgs/dmlc/projects/2
Description
If we store seeds or sampled edges in a single tensor along with offsets, we can make our code more performant by avoiding loops over etypes on the Python side.
That would let us almost entirely remove these two functions:
https://github.com/dmlc/dgl/blob/2f585940a80efd39639388dfc206498b3279e58d/python/dgl/graphbolt/impl/fused_csc_sampling_graph.py#L454-L469
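To make the proposed layout concrete, here is a minimal sketch of a batched representation that is split into a per-etype dictionary only at the end. It uses plain Python lists in place of tensors so it runs without any dependencies. The helper name `to_etype_dict`, the etype strings, and the values are hypothetical and are not taken from the GraphBolt code.

```python
from typing import Dict, List


def to_etype_dict(
    values: List[int], offsets: List[int], etypes: List[str]
) -> Dict[str, List[int]]:
    """Split a batched array into per-etype slices (hypothetical helper).

    offsets has len(etypes) + 1 entries; the slice for etypes[i] is
    values[offsets[i]:offsets[i + 1]].
    """
    assert len(offsets) == len(etypes) + 1
    return {
        etype: values[offsets[i]:offsets[i + 1]]
        for i, etype in enumerate(etypes)
    }


# Illustrative data: five sampled edges, three of the first etype and
# two of the second, stored back to back in one array.
etypes = ["user:follows:user", "user:buys:item"]
sampled_edges = [10, 11, 12, 20, 21]
offsets = [0, 3, 5]

# Everything upstream works on (sampled_edges, offsets); the dictionary
# is only built at the boundary where a per-etype view is required.
per_etype = to_etype_dict(sampled_edges, offsets, etypes)
```

The only loop over etypes left is the one in the final conversion, which is the point of deferring it to the blocks conversion step.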
We can use gb.expand_indptr to perform batched computations on the whole tensor in a single call: first broadcast the elements of the operation, then perform the computation.
@frozenbugs @peizhou001
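As a rough illustration of the broadcast-then-compute idea, the sketch below uses a pure-Python stand-in for the kind of expansion gb.expand_indptr performs: turning offsets into one group id per element. The function `expand_indptr_reference` is a hypothetical reference written for this example, not the library's implementation or signature, and the per-etype shift values are made up. The real operator works on tensors in a single vectorized call.

```python
from typing import List


def expand_indptr_reference(indptr: List[int]) -> List[int]:
    """Pure-Python stand-in (assumption): repeat group id i
    indptr[i + 1] - indptr[i] times."""
    out: List[int] = []
    for i in range(len(indptr) - 1):
        out.extend([i] * (indptr[i + 1] - indptr[i]))
    return out


# Batched values for two etypes stored back to back.
values = [10, 11, 12, 20, 21]
offsets = [0, 3, 5]

# Step 1: broadcast. Every element gets the index of its etype.
etype_ids = expand_indptr_reference(offsets)

# Step 2: compute on the whole batch at once. With tensors this would be
# a single indexing and addition op rather than a Python comprehension.
per_etype_shift = [100, 200]  # made-up per-etype operand
shifted = [v + per_etype_shift[t] for v, t in zip(values, etype_ids)]
```

The per-etype operand is looked up through the expanded ids, so no code path needs to branch or loop on the etype itself.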