Hi! I have some queries regarding the way you've split the data into train/valid/test. In the paper, you write: "In all cases, we use 20 labeled nodes per class as the training set, 30 nodes per class as the validation set, and the rest as the test set." Taking the Pubmed dataset as an example, where there are 19717 nodes in total and 3 classes, would the split contain train=60, valid=90, and test=19567 nodes? This split looks very skewed compared to the original Planetoid split (train=60, valid=500, test=1000), hence my confusion.
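For concreteness, here is the arithmetic I am assuming (the node and class counts are the standard Pubmed statistics quoted above):

```python
# Split sizes implied by the paper's per-class strategy for Pubmed
num_nodes, num_classes = 19717, 3
train = 20 * num_classes        # 60
val = 30 * num_classes          # 90
test = num_nodes - train - val  # 19567
print(train, val, test)         # 60 90 19567
```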
Since Pubmed has >19K nodes, it seems strange / arbitrary that the original split only uses 1K nodes as the test set and ignores the other ~18K nodes that are already in the graph.
Our main goal is not to obtain results that are exactly comparable with the Planetoid splits. Instead, we wanted to perform a fair / standardized evaluation using multiple different datasets. Therefore, we picked a train/val/test splitting strategy that we believe is reasonable for all the considered datasets. Of course, based on your use case / availability of labeled data, you may pick a different strategy, which may lead to different results.
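In case it helps, here is a minimal sketch of what such a per-class splitting strategy can look like. This is an illustration, not the repository's actual code; it assumes a dense integer label vector `labels`, and the function and parameter names (`per_class_split`, `train_per_class`, `val_per_class`) are hypothetical:

```python
import numpy as np

def per_class_split(labels, train_per_class=20, val_per_class=30, seed=0):
    """Sample a fixed number of nodes per class for train and val;
    all remaining nodes form the test set."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for c in np.unique(labels):
        # Shuffle the indices of class c, then take the first
        # train_per_class for train and the next val_per_class for val.
        idx = rng.permutation(np.flatnonzero(labels == c))
        train_idx.extend(idx[:train_per_class])
        val_idx.extend(idx[train_per_class:train_per_class + val_per_class])
    train_idx, val_idx = np.array(train_idx), np.array(val_idx)
    test_idx = np.setdiff1d(np.arange(len(labels)),
                            np.concatenate([train_idx, val_idx]))
    return train_idx, val_idx, test_idx

# Pubmed-like example: 19717 nodes, 3 classes (random labels for illustration)
labels = np.random.default_rng(1).integers(0, 3, size=19717)
train, val, test = per_class_split(labels)
print(len(train), len(val), len(test))  # 60 90 19567
```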