gasteigerjo / ppnp

PPNP & APPNP models from "Predict then Propagate: Graph Neural Networks meet Personalized PageRank" (ICLR 2019)
https://www.daml.in.tum.de/ppnp
MIT License

Issue with reproducing the results: variance in accuracy #7

Closed samiracs87 closed 4 years ago

samiracs87 commented 4 years ago

Hi there, I tried to run your code and model, but I have not been able to reproduce the results shown in simple_example_tensorflow.ipynb using the same seed value used there. Does the seed set in idx_split_args fix the dataset split for training, early stopping, and validation? With the same seed fixed in idx_split_args, are there any other components that might cause the variance in accuracy (apart from the randomized components in training, such as dropout and weight initialization)?

Thanks!

gasteigerjo commented 4 years ago

As with any GNN I've seen, the performance of PPNP has a rather large variance. Hence it is very unlikely that you will get the exact results reported in simple_example_tensorflow.ipynb from a single run.

Have a look at reproduce_results.ipynb instead, where we run the model 100x with varying seeds to get statistically significant results. Any proper model evaluation should do this.
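To illustrate the idea (not the notebook's actual code): the evaluation loop in reproduce_results.ipynb amounts to something like the sketch below, where `train_and_eval` is a hypothetical stand-in for one full PPNP training/evaluation run. Reporting the mean accuracy with an uncertainty estimate over many seeds is what makes the comparison meaningful.

```python
import numpy as np

# Hypothetical stand-in for one full train/evaluate run. In the actual repo
# this would be the PPNP training loop returning test accuracy for one seed;
# here we just simulate a noisy accuracy so the sketch is self-contained.
def train_and_eval(seed: int) -> float:
    rng = np.random.default_rng(seed)
    return 0.83 + 0.01 * rng.standard_normal()

# Run the model repeatedly with varying seeds (reproduce_results.ipynb uses
# 100 runs; 20 here to keep the sketch quick) and aggregate the results.
accs = np.array([train_and_eval(seed) for seed in range(20)])
mean = accs.mean()
# 95% confidence interval under a normal approximation of the seed-to-seed noise
ci95 = 1.96 * accs.std(ddof=1) / np.sqrt(len(accs))
print(f"accuracy: {mean:.3f} +/- {ci95:.3f}")
```

A single run landing anywhere inside that interval is entirely expected, which is why one run matching the notebook's number exactly would be a coincidence rather than the norm.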

Yes, the seed fixes all parts of the data split, but only that. It does not affect model initialization. This repository is provided so you can check any hypothesis you make. Just look through the code. :)
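As a minimal sketch of that point (illustrative only, not the repo's exact API): a seeded permutation pins down the train / early-stopping / test split deterministically, while model initialization draws from a separate, unseeded source of randomness.

```python
import numpy as np

# Illustrative seeded split into train / early-stopping / test index sets,
# mirroring how a seed in idx_split_args can fix the data split. The sizes
# (2708 nodes, 140 train, 500 stopping) follow the common Cora setup.
def split_indices(n_nodes, n_train, n_stopping, seed):
    rng = np.random.RandomState(seed)
    perm = rng.permutation(n_nodes)
    return (perm[:n_train],
            perm[n_train:n_train + n_stopping],
            perm[n_train + n_stopping:])

a = split_indices(2708, 140, 500, seed=42)
b = split_indices(2708, 140, 500, seed=42)
# Same seed -> identical split on every call ...
assert all((x == y).all() for x, y in zip(a, b))
# ... but anything drawn outside that seeded RandomState (e.g. weight
# initialization, dropout) still varies from run to run.
```

So with the split held fixed, the remaining run-to-run variance comes from the randomized parts of training, exactly as you suspected.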

samiracs87 commented 4 years ago

Thanks for the helpful response! Sure, I'll look into the code and will do more experiments. BTW, nice work! :)