Closed rusty1s closed 2 years ago
Hi @rusty1s, I was about to open a new discussion and just realized you are already on this. Just commenting to share my interest in this feature. Cheers.
@michalisfrangos are you interested in contributing this feature?
Pinging @mananshah99 and @sdulloor here who shared interest in contributing this feature as well.
It might be useful to implement a utils.subgraph_bipartite(subset: Tuple[torch.Tensor, torch.Tensor], ...) or add support to utils.subgraph for bipartite graphs. I prefer adding a new function over modifying the existing one to keep the code cleaner. That way HeteroData.subgraph() would make multiple calls to subgraph_bipartite. Something like:
def subgraph(self, node_mask_dict):
    ...
    for edge_type in self.edge_types:
        # edge_type is (src_node_type, relation, dst_node_type)
        if edge_type[0] in node_mask_dict and ...:
            new_edge, _, _ = utils.subgraph_bipartite(
                (node_mask_dict[edge_type[0]], node_mask_dict[edge_type[-1]]))
WDYT?
Yes, this looks good to me. Although we already overload a lot of functionality with bipartite graph support (by passing tuples instead of single tensors), I agree that adding this directly to subgraph might make the code overly complex. bipartite_subgraph is a good alternative that we do not even have to expose.
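To make the bipartite idea discussed above concrete, here is a minimal, library-independent sketch in plain Python. It is not the PyG implementation: the name bipartite_subgraph_sketch, the use of Python lists instead of tensors, and the relabel_nodes flag are all assumptions made for illustration. The key point it shows is that source and destination nodes are filtered and relabeled in separate index spaces.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (source index, destination index)


def bipartite_subgraph_sketch(
    src_mask: List[bool],
    dst_mask: List[bool],
    edges: List[Edge],
    relabel_nodes: bool = True,
) -> List[Edge]:
    """Hypothetical helper: keep edges whose source AND destination are selected."""
    # Source and destination nodes live in different index spaces,
    # so each side gets its own old-index -> new-index mapping.
    src_kept = [i for i, keep in enumerate(src_mask) if keep]
    dst_kept = [i for i, keep in enumerate(dst_mask) if keep]
    src_map = {old: new for new, old in enumerate(src_kept)}
    dst_map = {old: new for new, old in enumerate(dst_kept)}

    # Drop every edge that touches a node that was masked out.
    kept = [(s, d) for s, d in edges if src_mask[s] and dst_mask[d]]

    if relabel_nodes:
        # Re-index so that node ids are consecutive within the subgraph.
        kept = [(src_map[s], dst_map[d]) for s, d in kept]
    return kept


edges = [(0, 0), (0, 2), (1, 1), (2, 2)]
src_mask = [True, False, True]   # keep source nodes 0 and 2
dst_mask = [False, True, True]   # keep destination nodes 1 and 2

print(bipartite_subgraph_sketch(src_mask, dst_mask, edges))
print(bipartite_subgraph_sketch(src_mask, dst_mask, edges, relabel_nodes=False))
```

A heterogeneous subgraph method could then call such a helper once per edge type, passing the masks of that edge type's source and destination node types.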
How would this be different from just sampling a heterogeneous graph with large neighbourhoods to get different node types in the newly sampled bipartite graph?
Not sure I understand. Can you clarify? The subgraph() method might be useful to gather subgraphs prior to any training or sampling, e.g. for obtaining inductive subgraphs based on a pre-defined split.
So I have a transductive problem (for now), and for hetero GNN classification I am planning to just use the HGTLoader to get smaller batches for a list of nodes to train my model. Does that setup seem correct? I'm not sure how/if I should be using something like the subgraph() method (whenever it's implemented).
It depends on which data you want to train on. If you want to shrink the data prior to training, then HeteroData.subgraph would be applicable to create a smaller subgraph from your original graph. If you just want to operate on smaller batches during training, then you may want to adjust the batch_size argument of a loader.
Let me know if that makes sense to you.
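As a rough illustration of the "shrink the data prior to training" case, the sketch below filters a toy heterogeneous graph with a dict of node masks. It is plain Python, not PyG code: hetero_subgraph_sketch, the dict-of-lists graph layout, and the example node and edge types are all hypothetical. It also assumes that node types without an entry in the mask dict are kept in full, which is a design choice of this sketch rather than something settled in this thread.

```python
from typing import Dict, List, Tuple

EdgeType = Tuple[str, str, str]  # (source node type, relation, destination node type)
Edge = Tuple[int, int]


def hetero_subgraph_sketch(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeType, List[Edge]],
    node_mask_dict: Dict[str, List[bool]],
) -> Tuple[Dict[str, int], Dict[EdgeType, List[Edge]]]:
    """Hypothetical helper: restrict a toy heterogeneous graph to masked nodes."""
    # Assumption: a node type without a mask keeps all of its nodes.
    masks = {t: node_mask_dict.get(t, [True] * n) for t, n in num_nodes.items()}

    # Per node type, map old node indices to new consecutive indices.
    index_map = {
        t: {old: new for new, old in enumerate(i for i, keep in enumerate(m) if keep)}
        for t, m in masks.items()
    }
    new_num_nodes = {t: len(mapping) for t, mapping in index_map.items()}

    new_edges: Dict[EdgeType, List[Edge]] = {}
    for edge_type, pairs in edges.items():
        src, dst = edge_type[0], edge_type[-1]
        # Each edge type is a bipartite graph between its source and
        # destination node types: keep an edge only if both ends survive.
        new_edges[edge_type] = [
            (index_map[src][s], index_map[dst][d])
            for s, d in pairs
            if masks[src][s] and masks[dst][d]
        ]
    return new_num_nodes, new_edges


num_nodes = {"paper": 4, "author": 3}
edges = {
    ("author", "writes", "paper"): [(0, 0), (0, 1), (1, 2), (2, 3)],
    ("paper", "cites", "paper"): [(0, 1), (1, 2), (2, 3)],
}
# For example, keep only the papers belonging to a pre-defined training split.
train_mask = {"paper": [True, True, False, False]}

sub_num_nodes, sub_edges = hetero_subgraph_sketch(num_nodes, edges, train_mask)
print(sub_num_nodes)
print(sub_edges)
```

The resulting smaller graph can then be handed to whatever loader is used for mini-batching, so the two steps (shrinking the data once, batching during training) stay independent.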
🚀 The feature, motivation and pitch
Similar to Data.subgraph(), there should exist a HeteroData.subgraph() method to compute subgraphs in a heterogeneous graph setting, e.g., for obtaining inductive node splits. Here, mask/index should be of type dict, holding masks/indices for each/a subset of node types.
Alternatives
No response
Additional context
No response