shchur / gnn-benchmark

Framework for evaluating Graph Neural Network models on the semi-supervised node classification task
https://arxiv.org/abs/1811.05868
MIT License

fixed initializer #2

Open ahu-WLL opened 5 years ago

ahu-WLL commented 5 years ago

Hello, author. First of all, thank you very much for your work! In your code, the training parameters are randomly initialized, but I need a fixed initialization. What should I do?

shchur commented 5 years ago

I'm not sure what exactly your use case is, and I'm also not sure if there is an elegant way to do that. One hacky approach that should definitely work, though, is changing the line https://github.com/shchur/gnn-benchmark/blob/master/gnnbench/run_single_job.py#L123 to `tf.set_random_seed(123)`.
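For reference, a minimal TF 1.x sketch of what that seeding achieves; the variable `w` is a hypothetical stand-in for the model's weights, not part of the framework:

```python
import tensorflow as tf

tf.reset_default_graph()
tf.set_random_seed(123)  # the fixed graph-level seed suggested above

# Hypothetical stand-in for a weight matrix created by the model.
w = tf.get_variable("w", shape=[3, 3],
                    initializer=tf.glorot_uniform_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(w))  # identical across reruns of this script
```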

ahu-WLL commented 5 years ago

Thanks for your reply. In your code, two experiments don't produce the same result under the same parameter settings, so I think it is caused by different initializations. I used the method you suggested, but I still can't get the same result. I want to get the same experimental result under the same parameter settings. Can you help me?

shchur commented 5 years ago

Can you please describe what exactly is the experiment that you are trying to do? What are the commands that you are running, and in what cases do you expect to get the same output from the model?

ahu-WLL commented 5 years ago

I ran your code (one split, one initialization) with the GCN model on the Cora dataset twice, using the default parameters. I am sure that the data split and parameter settings are the same between the two experiments, but I got different results. If I want to get the same result, what should I do?

shchur commented 5 years ago

This is not really the use case that we had in mind for this framework, but it should be possible to get it done.

Can you list the commands that you are running and paste the contents of all the YAML config files that you are using?

There are a few things that might be at play here:

  1. TensorFlow behavior on GPU might not be deterministic. However, this should only lead to very minor changes in performance, and it should still be possible to get the same initialization for the weight matrices.
  2. If you set num_inits: 2 in your YAML config file, the model will be run twice using two different initializations. Even though the random seed is fixed, the model will actually call sess.run(init_op) twice, which produces two different initializations. A simple analogy: even with a fixed random seed, calling np.random.rand() twice in one script gives you two different numbers. However, if you rerun the script, the first value you get will always be the same (see the snippet after this list).
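To make point 2 concrete, here is a minimal NumPy illustration of that behavior:

```python
import numpy as np

np.random.seed(123)   # fixed seed: the *sequence* is reproducible
a = np.random.rand()  # first draw: identical across reruns of this script
b = np.random.rand()  # second draw: also reproducible, but differs from `a`
assert a != b
```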

The proper way to achieve the behavior that you want is to set num_inits: 1 and add the same experiment to the DB twice using create_jobs.py.
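That is, the relevant line in the experiment's YAML config would look like this (all other keys omitted, since they are not shown in this thread):

```yaml
num_inits: 1  # run each job with a single initialization
```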

One way to find the source of the problem is to call tf.trainable_variables(), evaluate the variables right after initialization using sess.run(), and save them to disk using np.save(). You should add this code here. Then you can inspect the matrices saved to disk. If they are the same in both cases, your differing results are due to the non-determinism of TensorFlow. If the weight matrices are different, the TF random seed is not being set properly, and we can look into it further.
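A minimal TF 1.x sketch of that debugging step; the toy variable and file names are illustrative, not part of the framework:

```python
import numpy as np
import tensorflow as tf

tf.reset_default_graph()
tf.set_random_seed(123)

# Toy stand-in for the model's weight matrices; in the framework these
# are the variables created before sess.run(init_op).
w = tf.get_variable("w", shape=[4, 2],
                    initializer=tf.glorot_uniform_initializer())
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_op)
    # Evaluate every trainable variable right after initialization and
    # dump it to disk for later comparison.
    for i, var in enumerate(tf.trainable_variables()):
        np.save("init_var_{}.npy".format(i), sess.run(var))

# Compare the saved .npy files from two separate runs with np.allclose()
# to check whether the initializations match.
```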