seongjunyun / Graph_Transformer_Networks

Graph Transformer Networks (Authors' PyTorch implementation for the NeurIPS 19 paper)

The attention score in model_sparse #10

Closed ty4b112 closed 4 years ago

ty4b112 commented 4 years ago

To find meta-paths with high attention scores learnt by GTNs, I printed the attention scores in main.py (denoted as `Ws` in line 100) and in main_sparse.py (denoted as `_` in line 127). I ran your code with: `python main_sparse.py --dataset IMDB --num_layers 3 --adaptive_lr true`. Surprisingly, it seems that the model does not train the weight of each GTConv at all: the weights after softmax are always [0.2, 0.2, 0.2, 0.2, 0.2].
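For reference, this is a minimal sketch of how the softmax-normalized GTConv weights can be read off the model directly; the attribute names (`model.layers`, `conv1.weight`) and the weight shape are assumptions based on the repo's model.py and may differ in other versions:

```python
import torch
import torch.nn.functional as F

def print_edge_type_attention(model):
    # In this sketch each GTConv holds a weight of shape
    # (out_channels, num_edge_types, 1, 1); softmax over dim=1 gives the
    # attention over edge types for each output channel.
    for i, layer in enumerate(model.layers):
        with torch.no_grad():
            att = F.softmax(layer.conv1.weight, dim=1).squeeze()
            print(f"layer {i}: attention over edge types =\n{att}")
```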

seongjunyun commented 4 years ago

Hi, we ran our code with the same command you used and uploaded the log here: [training log screenshot]

In this log, the weight of each GTConv keeps changing during training. If you modified something in our code, please let me know what you changed.

ty4b112 commented 4 years ago

Hi, I cloned the latest version of your code and did not change anything except adding `print(_)` after line 119 in main_sparse.py, but the results are still the same. My dependencies are as follows:
CUDA 10.1.168
python 3.7.4
torch 1.4.0
torch-geometric 1.4.2
torch-scatter 2.0.4
torch-sparse 0.5.1
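In case it helps, the versions actually imported at runtime can be confirmed like this (assuming each of these packages exposes a `__version__` attribute):

```python
import torch
import torch_geometric
import torch_scatter
import torch_sparse

# Print the versions that are actually imported, to rule out a mismatch
# between the installed wheels and the active environment.
for pkg in (torch, torch_geometric, torch_scatter, torch_sparse):
    print(pkg.__name__, pkg.__version__)
```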

changym3 commented 4 years ago

> Hi, I cloned the latest version of your code and did not change anything except adding `print(_)` after line 119 in main_sparse.py, but the results are still the same.

Hey! I met this problem too. I found that the tensors returned from spspmm have `grad_fn` set to None. This is due to the version of torch_sparse: the package removed autograd support for spspmm for speed (details here). If you reinstall a torch_sparse release below 0.4.4, you will get the correct attention scores.
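A quick way to confirm this in your own environment (a sketch, not code from the repo; it assumes the older `spspmm(indexA, valueA, indexB, valueB, m, k, n)` call signature):

```python
import torch
from torch_sparse import spspmm

# 2x2 sparse matrix with ones at (0, 1) and (1, 0).
index = torch.tensor([[0, 1], [1, 0]])
value = torch.ones(2, requires_grad=True)

# Multiply the matrix by itself: A (2x2) @ A (2x2).
out_index, out_value = spspmm(index, value, index, value, 2, 2, 2)

# If this prints None, spspmm is not differentiable in the installed
# torch_sparse, so no gradient ever reaches the GTConv weights and the
# softmax stays at its uniform initialization (0.2 over 5 edge types).
print(out_value.grad_fn)
```

With a torch_sparse release that still supports autograd for spspmm, `out_value.grad_fn` should be a backward node rather than None.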