Shen-Lab / GraphCL

[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
MIT License

Bug report #4

Closed: flyingtango closed this issue 3 years ago

flyingtango commented 3 years ago

Hi Yuning,

There are some errors in the code.

When I run the mask and edge augmentations in unsupervised_Cora_Citeseer with python -u execute.py --dataset citeseer --aug_type mask --drop_percent 0.20 --seed 39 --save_name cite_best_dgi.pkl --gpu 0, I get an unbound-variable error:

Traceback (most recent call last):
  File "execute.py", line 189, in <module>
    sparse, None, None, None, aug_type=aug_type)
  File "/anaconda3/envs/graph/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/GraphCL/unsupervised_Cora_Citeseer/models/dgi.py", line 51, in forward
    ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2)
UnboundLocalError: local variable 'c_1' referenced before assignment
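
From the traceback it looks like c_1 is only assigned inside certain aug_type branches of forward(), so other augmentation types reach the discriminator call with the variable never bound. A minimal, self-contained illustration of that pattern (hypothetical; the real dgi.py forward() is of course more involved):

# Hypothetical sketch of the UnboundLocalError pattern, not the actual dgi.py code.
def forward(aug_type):
    if aug_type == 'edge':
        c_1 = 'summary of the augmented view'  # c_1 is bound only in this branch
    # no assignment happens for aug_type == 'mask' or 'node'
    return c_1  # raises UnboundLocalError for any other aug_type

forward('mask')  # UnboundLocalError: local variable 'c_1' referenced before assignment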

Here is my environment: python==3.7, torch==1.5.0, torch-geometric==1.5.0. I hope you can take a look and fix this.

yongduosui commented 3 years ago

I have just fixed this bug; you can try again.

flyingtango commented 3 years ago

Thanks! In addition, I have some questions and suggestions.

Q1: When I run ./go.sh 0 NCI1 random2 or ./go.sh 0 NCI1 random3, the loss becomes NaN in unsupervised_TU:

+ for seed in 0 1 2 3 4
+ CUDA_VISIBLE_DEVICES=0
+ python gsimclr.py --DS NCI1 --lr 0.01 --local --num-gc-layers 3 --aug random2 --seed 0
4110
37
================
lr: 0.01
num_features: 37
hidden_dim: 32
num_gc_layers: 3
================
tensor(nan, device='cuda:0', grad_fn=<NegBackward>)
tensor(nan, device='cuda:0', grad_fn=<NegBackward>)

However, when I run ./go.sh 0 NCI1 random4, it goes back to normal. I would like to know why this happens.

Q2: How is the GraphCL loss implemented in unsupervised_Cora_Citeseer? I cannot find the similarity computation between c_1 and c_2.

Q3: What is shuf_fts used for, and what is the role of h_0 and h_2 in Discriminator? In Discriminator2 there is no c_x = c_x.expand_as(h_pl).
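
For reference, this is roughly the pattern I am asking about, a simplified sketch of a standard DGI-style bilinear discriminator (the names c_x, h_pl, h_mi are from the question; the rest is my assumption, not the repo's exact code):

import torch
import torch.nn as nn

class Discriminator(nn.Module):
    # Sketch of a DGI-style bilinear discriminator (assumed, simplified).
    def __init__(self, n_h):
        super().__init__()
        self.f_k = nn.Bilinear(n_h, n_h, 1)

    def forward(self, c, h_pl, h_mi):
        # c: graph-level summary; h_pl: node embeddings of the original graph;
        # h_mi: node embeddings built from shuffled features (negatives).
        c_x = torch.unsqueeze(c, 1)
        c_x = c_x.expand_as(h_pl)                      # broadcast the summary to every node
        sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2)   # scores for positive pairs
        sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2)   # scores for negative pairs
        return torch.cat((sc_1, sc_2), 1)

My understanding is that shuf_fts is the row-shuffled feature matrix used to build the negative embeddings, but please correct me if that is not how it is used here.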

I hope you can give a more detailed comment, thank you!

S1: For unsupervised_TU, downloading the TU datasets works fine; however, in semisupervised_TU the automatic dataset download does not work properly. I found that this is caused by the torch-geometric version: it works fine with 1.1.0 but does not download correctly with 1.5.0. This is the same problem as #1.

S2: The installation of cortex-DIM is omitted from the required environment yaml, and the cortex-DIM folder only exists in semisupervised_TU/finetuning.

yyou1996 commented 3 years ago

Hi @flyingtango,

Thanks for the detailed feedback. I will try to double-check things within this week.

yyou1996 commented 3 years ago

Hi @flyingtango,

Q1. I fixed the bugs. It seems the node-dropping ratio was incorrect previously, which stands out in random2 (which samples only from node dropping and subgraph) compared with random4, leading to value overflow and hence NaN values and gradients.
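
To illustrate what that inversion looks like (a hypothetical sketch with assumed names, not the actual augmentation code):

import numpy as np

def drop_nodes(node_num, drop_percent=0.2, buggy=False):
    # buggy=True treats drop_percent as the *kept* ratio, so 80% of nodes are removed.
    drop_num = int(node_num * (1 - drop_percent)) if buggy else int(node_num * drop_percent)
    dropped = np.random.choice(node_num, drop_num, replace=False)
    return np.setdiff1d(np.arange(node_num), dropped)

print(len(drop_nodes(100, 0.2, buggy=True)))   # 20 nodes survive
print(len(drop_nodes(100, 0.2, buggy=False)))  # 80 nodes survive

With almost all nodes removed, as in random2 (which only samples from node dropping and subgraph), the downstream statistics can degenerate and the contrastive loss ends up as NaN.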

Q2. @yongduosui can you give some comments on this?

Q3. Would you mind pointing to the location of the code? Since the implementation was divided among contributors, I would like to find the right person to address the question.

S1. Yes, that's right. Since we ran experiments in a variety of settings, for each setting I first referred to the SOTA code (see the acknowledgement part of each experiment) and then implemented our version on top of it, so the environment of each experiment is separate. I am sorry for the inconvenience.

S2. Sorry for the mistake. It should be in the unsupervised_TU dir rather than the semisupervised_TU dir. I have already moved it to the right place.

yongduosui commented 3 years ago

@flyingtango Q2. Please check the paper Deep Graph Infomax [1], which maximizes mutual information between patch representations and corresponding high-level summaries of graphs. We additionally incorporate information from the augmented graphs when maximizing mutual information, which is equivalent to optimizing the GraphCL loss. For more details, you can compare the theoretical proof in the Appendix of our paper with Deep Graph Infomax [1].

[1] Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, and R. Devon Hjelm. Deep graph infomax. arXiv preprint arXiv:1809.10341, 2018.
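
As a rough sketch of how the two augmented views can be plugged into a DGI-style objective (a hypothetical simplification; gcn, readout, and disc are assumed modules, and this is not the exact unsupervised_Cora_Citeseer code):

import torch
import torch.nn as nn

def dgi_graphcl_loss(gcn, readout, disc, feats, shuf_feats, adj, aug_adj1, aug_adj2):
    h_0 = gcn(feats, adj)                 # node embeddings of the original graph (positives)
    h_2 = gcn(shuf_feats, adj)            # embeddings from shuffled features (negatives)
    c_1 = readout(gcn(feats, aug_adj1))   # summary of augmented view 1
    c_2 = readout(gcn(feats, aug_adj2))   # summary of augmented view 2

    # Each augmented summary is scored against positive and negative node embeddings.
    logits = torch.cat([disc(c_1, h_0, h_2), disc(c_2, h_0, h_2)], dim=1)
    n = h_0.size(1)
    labels = torch.cat([torch.ones(1, n), torch.zeros(1, n)], dim=1).repeat(1, 2)
    return nn.BCEWithLogitsLoss()(logits, labels.to(logits.device))

Maximizing agreement between each view's summary and the original patch representations in this way is what corresponds to the GraphCL loss in the proof mentioned above.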

ha-lins commented 3 years ago

Hi @yyou1996,

Thanks for your bug fix! I also ran into Q1 previously.

After the update, is the ratio of dropped nodes now 20%? It seems that previously the ratio of retained nodes was 20% by mistake.

yyou1996 commented 3 years ago

@ha-lins Yes, the augmentation ratio is the dropping ratio rather than the retained ratio.