Open kkranen opened 1 year ago
Thanks for the issue @kkranen. I think this should be straightforward to get in, since all we need to do is mask/index select edge_index
and any edge features, e.g.:
edge_index = edge_index[:, mask]
edge_attr = edge_attr[mask]
We can either integrate this as a utility function or in Data
/HeteroData
directly. What do you prefer?
I think it's likely better as a member function of Data/HeteroData. On a slightly other note, it may also be good to split any other user-included features other than edge_attr (like categorical data). I know that this is slightly dangerous in the sense that other attributes are not guaranteed to be Tensors, but what I have as a fix on my side looks like this:
def edge_subgraph_pyg(graph, edge_ids):
"""
graph: Target graph to create edge subgraph of
edge_ids: Edge ids (indices or boolean tensor) to use for the subgraph
"""
output = copy.deepcopy(graph)
for edge, data in edge_ids.items():
for name in output[edge].keys():
if isinstance(output[edge][name],Tensor):
output[edge][name] = output[edge][name][...,edge_ids[edge]]
return output
We usually do this in PyG via the data.is_edge_attr
function, so it would become something like:
def edge_subgraph(self, mask):
out = copy.copy(self)
for key, value in self.items():
if self.is_edge_attr(key):
out[key] = self[key][mask]
return out
🚀 The feature, motivation and pitch
I'm working on use cases where I would like my train/valid/test dataloaders to have a different set of available edges base don what has already been seen (ie, train edges are seen in validation set, validation edges and train edges are seen in test set).
In order to do this, I need to have support for inducing edge subgraphs, similar to DGL's edge subgraph: https://docs.dgl.ai/en/0.9.x/generated/dgl.edge_subgraph.html#dgl.edge_subgraph
Alternatives
No response
Additional context
No response