daiquocnguyen / Graph-Transformer

Universal Graph Transformer Self-Attention Networks (TheWebConf WWW 2022) (Pytorch and Tensorflow)
Apache License 2.0

Results decrease after shuffling the dataset #11

Open ChenAris opened 3 years ago

ChenAris commented 3 years ago

Hi,

I have a question about the results. I ran the code (UGformerV1_PyTorch/train_UGformerV1_UnSup.py) on a shuffled dataset, and the result drops sharply compared to the unshuffled dataset (please correct me if I ran it wrongly and the result should stay the same after shuffling). I wonder what the reason is...

I found that when the dataset is not shuffled, the graph order is strongly correlated with the graph labels (e.g., the first half of the dataset has label 0), and so is the global node id. But I don't know where the model (the Transformer or the SampledSoftmax) uses the global node id information...
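To make the question concrete, here is a minimal sketch (with hypothetical placeholder names, not the repo's actual loader) of what shuffling changes: if graphs and labels are shuffled together and global node ids are then assigned consecutively in load order, the id ranges no longer track the label regions.

```python
import random

# Hypothetical placeholders: a tiny ordered dataset where the first half
# has label 0 and the second half label 1, as observed above.
random.seed(0)
graphs = [{"num_nodes": n} for n in (3, 2, 4, 1)]  # stand-in graph objects
labels = [0, 0, 1, 1]

# Shuffle graphs and labels together so order no longer encodes the label.
paired = list(zip(graphs, labels))
random.shuffle(paired)
graphs, labels = map(list, zip(*paired))

# Global node ids assigned consecutively in the new load order now mix
# label regions instead of tracking them.
node_id, id_ranges = 0, []
for g in graphs:
    id_ranges.append(range(node_id, node_id + g["num_nodes"]))
    node_id += g["num_nodes"]
```

So after shuffling, any component that is sensitive to the numeric value of the global node id would see a different id-to-label mapping.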

Thanks

podismine commented 3 years ago

I have found the same problem. I think it is caused by the sampled softmax, which is a biased estimator. The embedded features follow a normal distribution, so nearby features differ little and can still give a good result when the labels are in order.
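One possible source of the bias described above (an assumption, not confirmed from the repo's code): sampled-softmax implementations, including TensorFlow's `tf.nn.sampled_softmax_loss` default, commonly draw negatives from a log-uniform (Zipfian) distribution over ids, which presumes ids are ordered by frequency. If the global node id order correlates with labels, the negatives are then drawn mostly from one label region. A quick sketch of how skewed that sampler is toward low ids:

```python
import numpy as np

# Log-uniform (Zipfian) pmf over N ids, as used by typical candidate samplers:
# P(id) = (log(id + 2) - log(id + 1)) / log(N + 1)
N = 10000
ids = np.arange(N)
p = (np.log(ids + 2) - np.log(ids + 1)) / np.log(N + 1)

# The mass is heavily concentrated on the low-id half of the range,
# so negatives are rarely drawn from high ids.
print(p[: N // 2].sum())  # well above 0.5
```

Under this (assumed) sampler, an unshuffled dataset whose first half shares one label would get negatives dominated by that label, which could explain why the ordered dataset scores differently from the shuffled one.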