kimiyoung / planetoid

Semi-supervised learning with graph embeddings
MIT License
869 stars 296 forks

Better performance than in paper #2

Closed bkj closed 7 years ago

bkj commented 7 years ago

Running test_trans.py and test_ind.py on the CITESEER dataset both yield performance a few percentage points better than reported in the paper. Any idea why that would be? Is the implementation here slightly different, or is it just a function of the train/test split?

EDIT: Same goes for CORA, but I haven't been able to reproduce PUBMED -- it got to ~0.64 and then I killed it, so perhaps I didn't let it run long enough. Accuracy was increasing very slowly at that point, though.

kimiyoung commented 7 years ago

The difference in performance seems to be due to random seeding. Try different seeds here:
https://github.com/kimiyoung/planetoid/blob/master/base_model.py#L32
https://github.com/kimiyoung/planetoid/blob/master/base_model.py#L34

The default hyper-parameters are given for Citeseer and should be tuned for other datasets.
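For anyone else hitting this: a minimal sketch of why the seed matters. The seeds fixed at the lines above determine the random weight initialization (and any other stochastic choices in training), so different seeds land at slightly different final accuracies. The function and shape below are illustrative, not from the repo:

```python
import numpy as np

def init_weights(seed, shape=(3, 3)):
    # base_model.py fixes the RNG seeds before building the model;
    # a different seed gives a different initialization, and hence
    # a slightly different final accuracy after training.
    rng = np.random.RandomState(seed)
    return rng.normal(scale=0.1, size=shape)

w0 = init_weights(0)
w1 = init_weights(1)
assert not np.allclose(w0, w1)           # different seeds -> different init
assert np.allclose(w0, init_weights(0))  # same seed -> reproducible run
```

A common way to report results robustly is to average accuracy over several seeds rather than quote a single run.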

bkj commented 7 years ago

Makes sense, thanks!