Open 9310gaurav opened 5 years ago

I want to confirm whether this implementation is inductive or transductive. From the code, it looks like the variable adj_lists contains all the edges in the graph, so test edges are also used during training. However, the results reported in the paper correspond to the inductive setting. Please let me know if I am missing anything.
I also noticed this difference. In fact, in the TensorFlow version accompanying the original paper, train_adj and val_adj are different, so I think this implementation detail may have been missed by the author of the PyTorch version.
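For anyone running into the same issue, here is a minimal sketch of what separating the training and evaluation graphs could look like. Note that build_adj_lists, edges, train_nodes, and all_nodes are hypothetical names for illustration, not identifiers from either repo:

```python
from collections import defaultdict

# Hypothetical helper: restrict the adjacency lists to a node subset so
# that validation/test edges never leak into the training graph.
def build_adj_lists(edges, allowed_nodes):
    adj = defaultdict(set)
    for u, v in edges:
        if u in allowed_nodes and v in allowed_nodes:
            adj[u].add(v)
            adj[v].add(u)
    return adj

# Inductive setup: train on a graph that contains only training nodes,
# and evaluate on the full graph so unseen nodes appear only at test time.
# train_adj = build_adj_lists(edges, train_nodes)
# full_adj  = build_adj_lists(edges, all_nodes)
```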
You are right. Also, among the default arguments, the aggregator's gcn flag is set to False while the encoder's is set to True, which means the aggregator does not consider self loops while aggregating. That is why the accuracy numbers are almost the same as those stated in the paper (they should be higher if test edges were also used during training). I hope the author can fix these two bugs so that other people using the code are aware of them. Thanks.
Thank you for reminding me about the self loops.
@9310gaurav @ShengdingHu Thanks for your comments and for making me aware of this issue. The following are my observations; if I am wrong, please correct me.
In encoders.py, the self loop is combined (the node's own features are concatenated in) when the gcn flag is set to False, while when gcn is set to True the self loop is not combined. Therefore, the author set gcn of the aggregator to False and gcn of the encoder to True, so that the entire model does not consider the self loop at all.
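For concreteness, the relevant branch looks roughly like this (paraphrased from Encoder.forward in encoders.py with the CUDA branch omitted; treat it as a sketch rather than a verbatim quote):

```python
import torch
import torch.nn.functional as F

# Paraphrase of Encoder.forward in encoders.py (CUDA branch omitted):
def forward(self, nodes):
    neigh_feats = self.aggregator.forward(
        nodes, [self.adj_lists[int(node)] for node in nodes], self.num_sample)
    if not self.gcn:
        # gcn=False: the self loop enters here, by concatenating the
        # node's own features with the aggregated neighbor features.
        self_feats = self.features(torch.LongTensor(nodes))
        combined = torch.cat([self_feats, neigh_feats], dim=1)
    else:
        # gcn=True: no self features at this stage; the self loop would
        # have to come from the aggregator instead.
        combined = neigh_feats
    return F.relu(self.weight.mm(combined.t()))
```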
So I think the settings in cora.run are correct (the author did it on purpose). However, the author treats the gcn flag in aggregators.py and encoders.py inconsistently, with different meanings in each file.
@HongxuChenUQ I don't think the author did this on purpose. As mentioned in the paper, there are two ways self loops can be supported in this architecture. The first is the GCN way, where the self loop is used during aggregation. The second is to use it in the encoder, by concatenating the self features with the aggregated neighbour features and then multiplying them by the weights. Setting gcn to True in the encoder and False in the aggregator means using neither of these ways. If you set gcn to True in the aggregator as well (the GCN way), you will get improved accuracy.
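The aggregator side implements the first option. A sketch of the sampling step, paraphrased from MeanAggregator.forward in aggregators.py (wrapped in a standalone function here for illustration, so the signature is not the repo's):

```python
import random

# Paraphrase of the neighbor-sampling step in MeanAggregator.forward:
def sample_neighbors(nodes, to_neighs, num_sample, gcn):
    # Sample up to num_sample neighbors per node.
    samp_neighs = [set(random.sample(list(neigh), num_sample))
                   if len(neigh) >= num_sample else set(neigh)
                   for neigh in to_neighs]
    if gcn:
        # The "GCN way": insert each node into its own sampled
        # neighborhood, so the mean aggregate includes a self loop.
        samp_neighs = [neigh | {nodes[i]}
                       for i, neigh in enumerate(samp_neighs)]
    return samp_neighs
```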
@9310gaurav Thanks, Gaurav. It is clear to me now.
I think this is just a simple implementation of the algorithm, and the datasets the author uses here are different from those used in the paper. Cora and Pubmed are small benchmark datasets that were also used in Thomas Kipf's GCN paper, so by all means this is transductive learning.