Weird training accuracy curve

PetarV- / GAT

Graph Attention Networks (https://arxiv.org/abs/1710.10903)

https://petar-v.com/GAT/

MIT License


Weird training accuracy curve #7

Closed: gongliyu closed this issue 6 years ago

gongliyu commented 6 years ago

Hi,

I observed that the training accuracy stays very low throughout training, at around 50%. Although the validation accuracy reaches around 83%, this still seems weird. I tried Kipf's GCN code, and it reaches above 90% training accuracy, which looks much more normal.

Has anybody else noticed this?

PetarV- commented 6 years ago

Hi Liyu,

This is caused by the strong dropout on the alpha_ij values (the attention coefficients). With dropout active, the graph neighbourhoods get quite variable, so the network mispredicts more often during training; however, this also makes it less prone to overfitting on small training datasets such as Cora.

(If you were to turn off dropout and evaluate the model on the training set at the end of each epoch, you would probably get a much more normal progression of training accuracies.)
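To make the effect concrete, here is a minimal, self-contained sketch in plain Python. It is not code from this repository: the helper names (`aggregate`, `accuracy`), the toy coefficients and features, and the dropout rate of 0.6 are all assumptions chosen for illustration. It scores one node by an attention-weighted sum over three neighbours, once with dropout applied to the coefficients (as during training) and once with dropout off (as during evaluation).

```python
import random


def aggregate(alpha, feats, drop_prob, rng):
    """Attention-weighted sum of neighbour features, with dropout on alpha.

    Hypothetical helper: each coefficient is dropped with probability
    drop_prob and the survivors are rescaled by 1 / (1 - drop_prob).
    """
    keep = 1.0 - drop_prob
    out = [0.0] * len(feats[0])
    for a, h in zip(alpha, feats):
        # Randomly remove this neighbour's contribution for this pass.
        if drop_prob > 0.0 and rng.random() < drop_prob:
            continue
        w = a / keep
        for k, v in enumerate(h):
            out[k] += w * v
    return out


def accuracy(alpha, feats, label, drop_prob, trials, seed=0):
    """Fraction of forward passes whose argmax matches the label."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        scores = aggregate(alpha, feats, drop_prob, rng)
        pred = max(range(len(scores)), key=scores.__getitem__)
        hits += pred == label
    return hits / trials


# Toy data (made up): two neighbours vote for class 0, one for class 1.
ALPHA = [0.4, 0.3, 0.3]
FEATS = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.5]]
LABEL = 0

train_mode_acc = accuracy(ALPHA, FEATS, LABEL, drop_prob=0.6, trials=10000)
eval_mode_acc = accuracy(ALPHA, FEATS, LABEL, drop_prob=0.0, trials=10000)

print("accuracy with dropout on :", train_mode_acc)
print("accuracy with dropout off:", eval_mode_acc)
```

With dropout off, the node is always classified correctly. With dropout on, the prediction flips whenever the class-1 neighbour survives and at least one class-0 neighbour is dropped, so for these made-up numbers the expected accuracy is 1 - 0.4 × (1 - 0.4²) ≈ 0.66, even though the weights are identical in both runs. This is the same gap that shows up between the accuracy logged during training and the accuracy measured on the training set with dropout disabled.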

Hope that helps!

Thanks, Petar

gongliyu commented 6 years ago

Hi Petar,

Got it. Thank you very much for the explanation.

Best, Liyu