Shen-Lab / GraphCL

[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
MIT License

Confused about unsupervised node classification on cora/citeseer #16

Closed: hengruizhang98 closed this issue 3 years ago

hengruizhang98 commented 3 years ago

Dear authors, it seems that you still follow a local-global contrastive scheme when applying the model to node classification tasks (similar to MVGRL, but with different data augmentations). Is there a reason for not directly computing the similarities of node representations within a batch and applying the NT-Xent loss?
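
For concreteness, this is the kind of node-level objective I have in mind (a minimal one-directional sketch in PyTorch; the tensor names are just illustrative, not from your repo):

```python
import torch
import torch.nn.functional as F

def node_nt_xent(z1, z2, temperature=0.5):
    """NT-Xent over node embeddings: z1[i] and z2[i] are two augmented
    views of the same node; every other node in the batch serves as a
    negative. Only the cross-view direction is shown for brevity."""
    z1 = F.normalize(z1, dim=1)           # [N, d] view 1
    z2 = F.normalize(z2, dim=1)           # [N, d] view 2
    sim = z1 @ z2.t() / temperature       # [N, N] scaled cosine similarities
    # Diagonal entries are the positive pairs; off-diagonals are negatives.
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)
```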

yyou1996 commented 3 years ago

Hi @hengruizhang98,

Thanks for your interest in our work. @yongduosui, would you mind commenting on this? Thanks.

yongduosui commented 3 years ago

Please check the paper Deep Graph Infomax [1], which maximizes mutual information between patch representations and corresponding high-level summaries of graphs. We additionally inject augmented-graph information when maximizing mutual information, which is equivalent to optimizing the GraphCL loss. You can compare the theoretical proof in our paper's Appendix with Deep Graph Infomax [1] for more details.

[1] Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, and R. Devon Hjelm. Deep graph infomax. arXiv preprint arXiv:1809.10341, 2018.
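
For reference, the local-global objective described above is, schematically, a bilinear discriminator between patch embeddings and a graph summary (a minimal sketch assuming the negatives come from a corrupted graph, as in DGI; this is not the exact code in this repo):

```python
import torch
import torch.nn as nn

class LocalGlobalDiscriminator(nn.Module):
    """DGI-style scoring of (patch, summary) pairs: patches from the real
    graph are positives, patches from a corrupted graph are negatives."""
    def __init__(self, dim):
        super().__init__()
        self.bilinear = nn.Bilinear(dim, dim, 1)

    def forward(self, h_pos, h_neg, summary):
        # summary: [d] graph-level readout, e.g. sigmoid of the mean patch.
        s = summary.expand_as(h_pos)                  # broadcast to [N, d]
        pos = self.bilinear(h_pos, s).squeeze(-1)     # [N] real-patch logits
        neg = self.bilinear(h_neg, s).squeeze(-1)     # [N] corrupted logits
        logits = torch.cat([pos, neg])
        labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
        return nn.functional.binary_cross_entropy_with_logits(logits, labels)
```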

hengruizhang98 commented 3 years ago

I get what you mean, and I know that the local-global contrastive scheme is effective for self-supervised learning (e.g., DIM and DGI). However, your model borrows the idea of SimCLR, which directly contrasts two augmentations of the same image (global to global). So I think that when applying your model to node classification tasks, you still have to contrast the two representations of the same node (local to local); otherwise your model is just the same as MVGRL [1].

[1] Hassani and Khasahmadi. Contrastive Multi-View Representation Learning on Graphs. In ICML 2020.

yyou1996 commented 3 years ago

@hengruizhang98 This is a good point. My argument is that we actually treat GraphCL as a general framework unifying a broad family of CL methods (see GraphCL supplement, Section F), in which the contrastive views are built with various augmentations. Thus global-local contrast is a special case covered by GraphCL, corresponding to the (none, subgraph) augmentation pair: the "none" view is the full graph summarized by a readout, and the "subgraph" view is a local patch. The reason we implement the last node classification experiment on top of DGI is that we wanted to start from an effective backbone to facilitate our development. Hope my answer helps, thanks!
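
To make the unification concrete, here is a hand-wavy sketch of how global-local contrast reads as the (none, subgraph) augmentation pair (all names below are illustrative placeholders, not code from this repository):

```python
import torch

def global_local_as_graphcl(node_emb: torch.Tensor, subgraph_idx: torch.Tensor):
    """View 1 applies the 'none' augmentation followed by a readout, giving
    the global summary; view 2 samples a subgraph, giving local patches.
    Contrasting these two views is DGI-style global-local contrast,
    expressed as one augmentation pair inside the GraphCL framework."""
    view1 = torch.sigmoid(node_emb.mean(dim=0, keepdim=True))  # [1, d] global summary
    view2 = node_emb[subgraph_idx]                             # [k, d] local patches
    return view1, view2
```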

hengruizhang98 commented 3 years ago

Yes, I agree with you. I have no questions now. Thanks for your response!