Open thu-wangz17 opened 4 years ago
Please have a look at the AMiner dataset, which introduces the first heterogeneous graph in PyG. In general, we save heterogeneous features and connectivity in x_dict and edge_index_dict so that they do not collide with the formulation of homogeneous graphs. Batching is currently not supported, though, and I will work on it.
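As a rough illustration of the layout described above, the sketch below keeps node features in an x_dict keyed by node type and connectivity in an edge_index_dict keyed by (source type, relation, destination type). Plain Python lists stand in for tensors so that it runs without PyG installed, and check_hetero_graph is a hypothetical helper name, not part of the library.

```python
# Node features per node type: each row is one node's feature vector.
x_dict = {
    'A': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],  # 3 nodes of type A
    'B': [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]],                   # 2 nodes of type B
}

# Connectivity per edge type in COO form: [source indices, destination indices].
edge_index_dict = {
    ('A', 'A->A', 'A'): [[0, 1, 1, 2], [1, 0, 2, 1]],
    ('A', 'A->B', 'B'): [[0, 1, 2], [0, 1, 0]],
    ('B', 'B->A', 'A'): [[0, 1, 0], [0, 1, 2]],
    ('B', 'B->B', 'B'): [[0, 1], [1, 0]],
}


def check_hetero_graph(x_dict, edge_index_dict):
    """Hypothetical helper: True if every edge points at an existing node of the right type."""
    for (src_type, _, dst_type), (src, dst) in edge_index_dict.items():
        if len(src) != len(dst):
            return False
        if any(not 0 <= i < len(x_dict[src_type]) for i in src):
            return False
        if any(not 0 <= j < len(x_dict[dst_type]) for j in dst):
            return False
    return True


print(check_hetero_graph(x_dict, edge_index_dict))
```

Because indices are local to each node type, node 0 of type A and node 0 of type B are different nodes. This is why the two dictionaries cannot simply be merged into a single x and edge_index.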
Thanks. That's very helpful, and I'm looking forward to this exciting work!
I tried to follow the AMiner dataset:
import torch
from torch_geometric.data import Data, DataLoader


class HeteroData(Data):
    def __inc__(self, key, value):
        if key == 'A->A':
            return len(self.x_dict['A']['x'])
        elif key == 'B->B':
            return len(self.x_dict['B']['x'])
        elif key == 'A->B':
            return torch.tensor([len(self.x_dict['A']['x']), len(self.x_dict['B']['x'])])
        elif key == 'B->A':
            return torch.tensor([len(self.x_dict['B']['x']), len(self.x_dict['A']['x'])])
        else:
            return 0


data_list = []
for i in range(5):
    node_types = {'A': {'x': torch.randn(3, 3)}, 'B': {'x': torch.randn(2, 3)}}
    edge_types = {
        ('A', 'A->A', 'A'): {
            'edge_index': torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]]),
        },
        ('A', 'A->B', 'B'): {
            'edge_index': torch.tensor([[0, 1, 2], [0, 1, 0]]),
        },
        ('B', 'B->A', 'A'): {
            'edge_index': torch.tensor([[0, 1, 0], [0, 1, 2]]),
        },
        ('B', 'B->B', 'B'): {
            'edge_index': torch.tensor([[0, 1], [1, 0]]),
        },
    }
    data = HeteroData()
    data.edge_index_dict = edge_types
    data.x_dict = node_types
    data_list.append(data)

dataloader = DataLoader(data_list, batch_size=2)
for i in dataloader:
    print(i.edge_index_dict)
    break
It returns:
[{('A', 'A->A', 'A'): {'edge_index': tensor([[0, 1, 1, 2],
                                              [1, 0, 2, 1]])},
  ('A', 'A->B', 'B'): {'edge_index': tensor([[0, 1, 2],
                                              [0, 1, 0]])},
  ('B', 'B->A', 'A'): {'edge_index': tensor([[0, 1, 0],
                                              [0, 1, 2]])},
  ('B', 'B->B', 'B'): {'edge_index': tensor([[0, 1],
                                              [1, 0]])}},
 {('A', 'A->A', 'A'): {'edge_index': tensor([[0, 1, 1, 2],
                                              [1, 0, 2, 1]])},
  ('A', 'A->B', 'B'): {'edge_index': tensor([[0, 1, 2],
                                              [0, 1, 0]])},
  ('B', 'B->A', 'A'): {'edge_index': tensor([[0, 1, 0],
                                              [0, 1, 2]])},
  ('B', 'B->B', 'B'): {'edge_index': tensor([[0, 1],
                                              [1, 0]])}}]
The two dictionaries are simply collected into a list; the edge indices are neither concatenated nor offset. If I want to batch correctly, it seems that I need to rewrite the Batch class?
In my opinion, each kind of meta-path in a heterogeneous graph can be viewed as a homogeneous graph. Thus, I should rewrite the Batch class so that the same meta-path from different heterogeneous graphs is batched into one large disconnected graph. Is that right? Or is there another approach to handle this problem?
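The idea of batching each edge type into one large disconnected graph can be sketched as follows. This is a minimal illustration, not PyG's Batch implementation: batch_hetero is a hypothetical helper, plain Python lists stand in for tensors, and it assumes every graph lists all of its node types in x_dict.

```python
def batch_hetero(graphs):
    """Merge (x_dict, edge_index_dict) pairs into one disconnected graph.

    Hypothetical helper: source indices are shifted by the number of
    source-type nodes already batched, destination indices by the number
    of destination-type nodes already batched.
    """
    x_out, edge_out, batch_out = {}, {}, {}
    for graph_id, (x_dict, edge_index_dict) in enumerate(graphs):
        # Offsets must be read before this graph's nodes are appended.
        offset = {t: len(x_out.get(t, [])) for t in x_dict}
        for (src_t, rel, dst_t), (src, dst) in edge_index_dict.items():
            rows = edge_out.setdefault((src_t, rel, dst_t), [[], []])
            rows[0].extend(i + offset[src_t] for i in src)
            rows[1].extend(j + offset[dst_t] for j in dst)
        for node_type, feats in x_dict.items():
            x_out.setdefault(node_type, []).extend(feats)
            # Record which input graph each node came from.
            batch_out.setdefault(node_type, []).extend([graph_id] * len(feats))
    return x_out, edge_out, batch_out


# Two copies of a small graph with 3 nodes of type A and 2 nodes of type B.
g = (
    {'A': [[0.0]] * 3, 'B': [[1.0]] * 2},
    {('A', 'A->B', 'B'): [[0, 1, 2], [0, 1, 0]],
     ('B', 'B->B', 'B'): [[0, 1], [1, 0]]},
)
x, edges, batch = batch_hetero([g, g])
print(edges[('A', 'A->B', 'B')])  # [[0, 1, 2, 3, 4, 5], [0, 1, 0, 2, 3, 2]]
print(batch['B'])                 # [0, 0, 1, 1]
```

For the A->B edges of the second graph, the source indices are shifted by 3 (the number of A nodes in the first graph) and the destination indices by 2 (the number of B nodes). Each edge type therefore needs its own pair of offsets rather than one shared increment.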
@sakuraiiiii Hi, I have run into the same problem. Have you managed to batch the data correctly?
Hi, I have a question about how to construct a heterogeneous graph. For example, there are two kinds of nodes, A and B, each with some features describing them. I know that implementing a heterogeneous graph requires overriding the __inc__ method in the Data class. However, after that, how does PyG distinguish the different node types, since Data.x only takes a tensor rather than a dict? I found the test_hete.py file in the hete_conv branch. I think the implementation is very clear, and I want to batch the data in this form:
But the above code is wrong because Data.x and Data.edge_index only take a tensor. Could you give me an example of how to construct the heterogeneous graph? Thank you very much.