AntonioLonga / PytorchGeometricTutorial

Pytorch Geometric Tutorials

Tutorial 1 - Cora dataset: unable to understand the masks, and how their counts add up #15

Open mriganktiwari opened 1 year ago

mriganktiwari commented 1 year ago

Hi Antonio or anyone else,

I am trying to understand what the counts 140, 500 and 1000 mean for the train, val and test masks, respectively. I ran:

torch.sum(data.train_mask), torch.sum(data.val_mask), torch.sum(data.test_mask), data

This gives me the following result:

(tensor(140, device='cuda:0'),
 tensor(500, device='cuda:0'),
 tensor(1000, device='cuda:0'),
 Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708]))

Question 1: Does this mean that only 140 nodes are available for training, while val and test get many more, according to the split in this dataset? I come from a non-GNN background, so this caught my eye.

Question 2: 140 + 500 + 1000 = 1640, which does not add up to 2708. Should it not?
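
The arithmetic behind Question 2 can be checked with a small sketch. It uses synthetic boolean masks with the same sizes as in the output above; where the True entries sit, the helper `make_mask`, and the assumption that the three masks do not overlap are all made up for illustration and are not read from the Cora dataset itself. Under those assumptions, some nodes end up in none of the three masks:

```python
# Synthetic stand-ins for the Cora masks. The sizes (140, 500, 1000, 2708)
# come from the output above; the positions of the True entries are an
# assumption made purely for illustration.
num_nodes = 2708


def make_mask(start, count, size=num_nodes):
    """Hypothetical helper: boolean list with `count` True entries from `start`."""
    return [start <= i < start + count for i in range(size)]


train_mask = make_mask(0, 140)
val_mask = make_mask(140, 500)
test_mask = make_mask(1708, 1000)

# A node is "covered" if at least one of the three masks selects it.
covered = [a or b or c for a, b, c in zip(train_mask, val_mask, test_mask)]

# Count nodes selected by more than one mask (booleans sum as 0/1).
overlapping = sum(
    1 for a, b, c in zip(train_mask, val_mask, test_mask) if a + b + c > 1
)

in_split = sum(covered)            # nodes in train, val or test
unassigned = num_nodes - in_split  # nodes in none of the masks

print(in_split, unassigned, overlapping)  # 1640 1068 0
```

With real `torch` boolean tensors, the same check could be written by combining the masks with `|` and negating with `~` before summing. If the masks are disjoint, as assumed here, the 1068 remaining nodes are simply not selected by any of the three masks.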