seongjunyun / Graph_Transformer_Networks

Graph Transformer Networks (Authors' PyTorch implementation for the NeurIPS 19 paper)

What's the difference between "X_ = torch.cat((X_,X_tmp), dim=1)" and "X_ = torch.cat([X_,X_tmp], dim=1)" #19

Closed why986 closed 3 years ago

why986 commented 4 years ago

At first I used the latter to train my model and got an F-score of about 88 on the ACM test set. Then I changed it to the former and got about 92. What's the difference?

why986 commented 4 years ago

Also, when I use the former, I sometimes get a NaN loss...

seongjunyun commented 3 years ago

Hi, sorry for the late reply.

As far as I know, there is no difference. The difference in performance is probably due to the random seed. If you get a NaN loss, I suggest changing the learning rate or the weight decay.

Thank you.
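
For anyone who wants to check this themselves, here is a minimal sketch. It is not code from this repository: the helper name `cat_both_ways` and the tensor shapes are made up for illustration. It concatenates the same two tensors once with a tuple and once with a list, then shows that changing only the random seed changes the values.

```python
import torch


def cat_both_ways(seed: int = 0):
    """Hypothetical helper: build two random tensors from a fixed seed and
    concatenate them along dim=1, once via a tuple and once via a list."""
    torch.manual_seed(seed)
    X_ = torch.randn(4, 3)      # illustrative shapes, not the model's real ones
    X_tmp = torch.randn(4, 2)
    from_tuple = torch.cat((X_, X_tmp), dim=1)
    from_list = torch.cat([X_, X_tmp], dim=1)
    return from_tuple, from_list


from_tuple, from_list = cat_both_ways(seed=0)
print(from_tuple.shape)                    # torch.Size([4, 5])
print(torch.equal(from_tuple, from_list))  # True: tuple and list behave the same

# Re-running with the same seed reproduces the values; a different seed does not.
a, _ = cat_both_ways(seed=0)
b, _ = cat_both_ways(seed=1)
same_seed_equal = torch.equal(a, from_tuple)
diff_seed_equal = torch.equal(a, b)
print(same_seed_equal, diff_seed_equal)
```

To compare two training runs fairly, it probably helps to fix the seed (for example with `torch.manual_seed`) before each run, so that any remaining gap is not just run-to-run variance.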