Hi, for Mini-ImageNet without the k-NN graph, we use the following hyper-parameters for NodeFormer (different from the ones used with the k-NN graph):
python main.py --dataset mini --metric acc --rand_split --method nodeformer --lr 0.01 \
--weight_decay 5e-3 --num_layers 2 --hidden_channels 64 --num_heads 6 \
--rb_order 0 --rb_trans sigmoid --lamda 0 --M 30 --K 10 --use_bn --use_residual --use_gumbel \
--runs 5 --epochs 250 --device 1
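For intuition on two of these flags: --lamda 0 turns off the edge-level regularization loss and --rb_order 0 turns off the relational bias over the input graph, which together make the model ignore the input k-NN structure. Below is a minimal sketch of how the lamda term plausibly enters the objective; the loss composition and names here are an assumption based on the paper's description, not the repository's exact code.

# Hedged sketch (not the repository's code): how --lamda plausibly
# weights the edge-level regularization term in the objective.
# With lamda = 0, only the supervised cross-entropy remains.
import torch
import torch.nn.functional as F

def total_loss(logits, labels, edge_reg_loss, lamda=0.0):
    ce = F.cross_entropy(logits, labels)  # supervised term on labeled nodes
    return ce + lamda * edge_reg_loss     # lamda = 0 drops the graph-based term

logits = torch.randn(8, 5)                # toy predictions for 8 nodes, 5 classes
labels = torch.randint(0, 5, (8,))
print(total_loss(logits, labels, edge_reg_loss=torch.tensor(0.1), lamda=0.0))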
With these settings you should be able to reproduce the score reported in our paper. Also note that, because random number generation differs across machines, the achieved scores can vary somewhat across different GPUs. For the k-NN graph case, it is natural to achieve a higher score than ours with the same hyper-parameters if you are using a different machine. We used NVIDIA V100 and RTX 2080Ti GPUs for our experiments, and the relative performances were consistent across them.
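In case it helps to bound the run-to-run variance on a single machine, here is a standard PyTorch seeding sketch (not from this repository); even with fixed seeds, different GPU models can still produce different results because some kernels are non-deterministic.

import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    # Fix the Python, NumPy, and PyTorch RNGs; GPU kernels may still be
    # non-deterministic across different hardware.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(42)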
Thanks for your help! After changing the hyper-parameters for Mini-ImageNet without the k-NN graph, I got an accuracy of 87.76 ± 0.75%.
Thank you for your great work. I've noticed that when the input k-NN graph is not used, NodeFormer can yield superior results on Mini-ImageNet. This suggests that the k-NN graphs are not necessarily informative and, moreover, that NodeFormer learns useful latent graph structures from the data. I tried to train NodeFormer on Mini-ImageNet without the input k-NN graph, removing both the edge regularization and the relational bias, but I only achieved an accuracy of 83.76 ± 0.78%. When training NodeFormer on Mini-ImageNet with a k-NN graph (k=5), I was able to achieve an accuracy of 87.01 ± 0.45%. I wonder if I missed some important details during the training process.
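One detail worth checking in the k=5 case is how the k-NN input graph is built, since small differences (distance metric, whether the graph is symmetrized) can shift the score. Here is a minimal sketch of constructing a k=5 graph from node features with scikit-learn; the variable names are illustrative and this is not necessarily how the repository builds its graph.

# Hedged sketch: build a symmetric k-NN graph (k=5) from node features.
import numpy as np
from sklearn.neighbors import kneighbors_graph

features = np.random.randn(100, 16)   # toy stand-in for node features
adj = kneighbors_graph(features, n_neighbors=5,
                       mode='connectivity', include_self=False)
adj = adj.maximum(adj.T)              # symmetrize: keep an edge if either endpoint selects it
edge_index = np.vstack(adj.nonzero()) # (2, num_edges), PyG-style edge list
print(edge_index.shape)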