shchur / gnn-benchmark

Framework for evaluating Graph Neural Network models on the semi-supervised node classification task
https://arxiv.org/abs/1811.05868
MIT License

Hyper-parameter search for models #3

Closed: soumyasanyal closed this issue 5 years ago

soumyasanyal commented 5 years ago

Hi!

Thanks for the insightful work! I have some questions about how you selected the hyperparameters. In the paper, you write that "For every model, we picked the hyperparameter configuration that achieved the best average accuracy on Cora and CiteSeer datasets (averaged over 100 train/validation/test splits and 20 random initializations for each)". So, for a given seed, you calculate the average over the 100 splits of the data. But when training on each of these splits, do you use the same set of random initializations for the model across all splits, or a different one each time?

shchur commented 5 years ago

I'm not 100% sure I recall correctly, but I believe that we used the same 20 initializations for each of the splits. However, we found that in most cases the train/val/test split has a far larger effect on performance than the random initialization, so I would guess that this choice is not crucial.

soumyasanyal commented 5 years ago

I ask this because if you used different initializations every time, then the experiment essentially boils down to running each setup with 2000 different seeds (100 × 20) and doing a random split every time.

shchur commented 5 years ago

I just double-checked: the same initializations are used for all splits.
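
For concreteness, here is a minimal sketch of the evaluation protocol as described in this thread: 100 random train/val/test splits, with the same fixed set of 20 model-initialization seeds reused for every split. The helper functions (`make_random_split`, `build_model`, `train_and_evaluate`), the seed values, and the variable names are hypothetical placeholders for illustration only, not the repository's actual API.

```python
import numpy as np

NUM_SPLITS = 100
NUM_INITS = 20

# One seed per train/val/test split.
split_seeds = np.arange(NUM_SPLITS)
# A single fixed set of model-initialization seeds, shared by all splits
# (the alternative interpretation would draw a fresh init seed for every run,
# i.e. 100 * 20 = 2000 distinct seeds).
init_seeds = np.arange(1000, 1000 + NUM_INITS)

accuracies = np.zeros((NUM_SPLITS, NUM_INITS))
for i, split_seed in enumerate(split_seeds):
    # Hypothetical helper: draw a random train/val/test split of the dataset.
    train_idx, val_idx, test_idx = make_random_split(dataset, seed=split_seed)
    for j, init_seed in enumerate(init_seeds):
        # Hypothetical helpers: build a model with the given init seed,
        # train it on this split, and return test accuracy.
        model = build_model(hyperparams, seed=init_seed)
        accuracies[i, j] = train_and_evaluate(model, train_idx, val_idx, test_idx)

# Average over all 100 * 20 runs; the paper picks the hyperparameter
# configuration with the best such average on Cora and CiteSeer.
mean_accuracy = accuracies.mean()
```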

soumyasanyal commented 5 years ago

Ok, thank you for verifying.