xxcclong / history-cache


Issue with creation of MFG #4

Open Yash685 opened 3 months ago

Yash685 commented 3 months ago
[Image: MFG with seed node 10; blue labels are renumbered node IDs, red ones are the original IDs]

Consider the MFG in the image above, with node 10 as the seed node (nodes colored blue are renumbered node IDs, red ones are the original IDs). For now, ignore the expansion that has been done. According to neighbor_sample.cu (the neighbor sampling code in CxDNN-DL), we sample and expand the neighborhood at each layer. In the sampling logic (the sample function), line 159 loops through the seed nodes and calls the expand operation for each 'i' in the range (seed_begin, seed_end). Now consider layer 2, where expansion/sampling happens over the range 4-11. Since node 10 is repeated (it was the seed node, i.e., a node found earlier) and its ID is not in that range, no expansion happens for that node. Is this logically correct with respect to GNNs, and how does training happen here (since training proceeds in a bottom-up fashion)?
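To make the behavior being asked about concrete, here is a minimal, hypothetical Python sketch of the layer-wise sampling loop described above (a simplified host-side analogue of the sample/expand loop in neighbor_sample.cu, not the actual CUDA code). Nodes get a renumbered slot the first time they appear; a node that reappears later (like node 10) is deduplicated, so its ID never falls inside the next layer's expansion range and it is not expanded again:

```python
import random

def sample_layers(adj, seeds, fanouts, rng_seed=0):
    # adj maps node -> neighbor list; seeds are the layer-0 nodes.
    rng = random.Random(rng_seed)
    order = list(seeds)              # renumbered node list, seeds first
    seen = set(seeds)
    seed_begin, seed_end = 0, len(order)
    for fanout in fanouts:
        # Expand only the nodes that first appeared in the previous layer,
        # mirroring the (seed_begin, seed_end) range in the question.
        for i in range(seed_begin, seed_end):
            nbrs = adj[order[i]]
            if len(nbrs) > fanout:
                nbrs = rng.sample(nbrs, fanout)
            for v in nbrs:
                if v not in seen:    # a repeated node gets no new slot,
                    seen.add(v)      # hence no second expansion
                    order.append(v)
        seed_begin, seed_end = seed_end, len(order)
    return order
```

With the seed node's own neighbors pointing back at it (undirected graph), the seed is rediscovered at layer 2 but, being already seen, is skipped rather than re-expanded.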

UtkrishtP commented 3 months ago

@xxcclong Basically, we want to understand why node 10 (or any other node), when it reappears in the MFG, is not added again as part of the CSR format (batch.x & batch.y). Doesn't this make the GNN logically incorrect, or are we missing something? Also, how is training handled in this context, assuming we do not expand the repeated nodes until the last layer?

xxcclong commented 3 months ago

Hi, thank you for your interest in our work. Since we are using undirected graphs (which is common practice in GNN training), the parent node is always a neighbor of the child node in the graph structure. In the subgraph, the parent node is not included as a neighbor of the child node in order to prevent oscillating message transmission between the parent and child nodes. I would like to know why you think including the parent node is necessary.

Yash685 commented 3 months ago

@xxcclong [Screenshot from 2024-05-23: the expand function in neighbor_sample.cu]

Thank you for the explanation. Above is a snapshot of the code in the expand function of neighbor_sample.cu.

We want to clarify the two cases below.

Case 1: In the diagram we shared, take the L1 node 11. Suppose node 11 satisfies the first if condition, fanouts >= num_neighbors; then we take all the neighbors, including the parent node (10). Correct us if we are wrong.

Case 2: Continuing with the same example, suppose node 11 has num_neighbors > fanouts. How do we guarantee that its parent node 10 won't be sampled again?
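The two branches being asked about can be sketched as follows (a hypothetical, simplified Python version of the expand branches described above, not the actual CUDA kernel). In Case 1 every neighbor is kept, parent included; in Case 2 a fanout-sized subset is drawn without replacement, and nothing in the sampling step itself excludes the parent:

```python
import random

def expand(neighbors, fanout, rng):
    # Case 1: fanout >= num_neighbors -> keep every neighbor,
    # which can include the parent node.
    if fanout >= len(neighbors):
        return list(neighbors)
    # Case 2: num_neighbors > fanout -> sample without replacement;
    # the parent may or may not be among the drawn neighbors.
    return rng.sample(neighbors, fanout)
```

So the guarantee is not that the parent is never *sampled* as a neighbor; rather (per the answer below) a repeated node is never *expanded* a second time.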

xxcclong commented 3 months ago

The parent node (10) can be a neighbor of node 11, but it won't be expanded again. When performing graph convolution, the neighbors of node 10 will always be 15, 11, and 12. The two cases are consistent on this point.

UtkrishtP commented 3 months ago

@xxcclong Thanks for the explanation. So in the above case, how will we train using the bottom-up approach? Suppose node 10 reappears in layer 2 and we do not expand it in the final layer. During training, how is message passing done for node 10 in layer 2, given that it has no layer-3 neighbors: are we fetching its node features, or calculating its node embedding somehow? We are having a hard time visualizing this process. Doesn't it conflict with the message-passing framework, or are we missing something?

xxcclong commented 2 months ago

The neighbors of node 10 will always be 15, 11, and 12. The expansion at layer 2 will also regard these three as node 10's neighbors.
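This answer can be illustrated with a minimal, hypothetical message-passing sketch (mean aggregation is assumed for simplicity; the actual model may differ). Once node 10's neighborhood {15, 11, 12} has been sampled, that same fixed neighbor set is reused at every layer where node 10 needs an updated embedding, so no fresh layer-3 expansion is required:

```python
def mfg_message_passing(neighbors, feats, num_layers):
    # neighbors: node -> its fixed, already-sampled neighbor list.
    # feats: node -> scalar layer-0 feature (scalars stand in for vectors).
    h = dict(feats)
    for _ in range(num_layers):
        # Each node averages its own previous-layer embedding with its
        # fixed neighbors' previous-layer embeddings; leaves (empty
        # neighbor lists) simply carry their own value forward.
        h = {u: (h[u] + sum(h[v] for v in nbrs)) / (1 + len(nbrs))
             for u, nbrs in neighbors.items()}
    return h
```

At layer 2, node 10 aggregates the layer-1 embeddings of 15, 11, and 12 (themselves computed from fetched features), which is why the reappearing node needs no further expansion for training to proceed bottom-up.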