hehuanma opened this issue 2 years ago (status: Open)
Good question. While we have not tested Graphormer on MoleculeNet by training from scratch, the unsatisfactory performance is expected. Graphormer is built upon a standard Transformer model, which is highly expressive. That expressiveness is valuable on more challenging large-scale datasets, but it hurts performance on small benchmarks because the model overfits severely. Just imagine training ViT or Swin on MNIST or CIFAR-10 (even though Transformer-based models have already become the de facto standard in image processing).
If someone insists on getting good performance on extremely small datasets such as those in MoleculeNet, e.g., fewer than 100K molecules, here are some tips that may be helpful:
Thank you for the information! That makes sense. We did observe severe overfitting on some datasets, and on others the training was quite unstable. By the way, do you plan to upload the pretrained model used in the paper? That way we could apply it directly and save some computational cost. Thanks!
According to our latest plan, all the pre-trained checkpoints will be released together with the new, more efficient Graphormer framework in the next release. Please stay tuned.
Sounds great! Thank you!
Hello, I am trying to use Graphormer on other commonly used datasets from MoleculeNet (https://moleculenet.org/datasets-1), such as BACE and BBBP, to check the performance. I used the default hyperparameters from the molhiv script, but the results are very poor. 1) Have you tried your model on these datasets without a pretrained model? Do you have any suggestions on hyperparameters for these datasets if we want to train from scratch? I am trying to find out why the results are so bad. 2) For molhiv without a pretrained model, I ran the script provided in the examples folder without the "checkpoint_path" argument and trained for 100 epochs. The best validation score is only around 0.763, and the corresponding test score is only 0.636. I don't know what is going wrong. Have you tried Graphormer directly on molhiv without a pretrained model? How does it perform? Thank you.
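For readers comparing numbers like the ones above: molhiv is scored with ROC-AUC, and the reported test score is normally the one taken at the epoch with the best validation score. The snippet below is a minimal sketch of that bookkeeping in plain Python. It is not code from the Graphormer repository; the function names `roc_auc` and `select_best_epoch` are hypothetical, and the per-epoch scores in the usage note are toy values, not results from any run.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC for binary labels (0/1) via the rank-sum formulation.

    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")

    # Sort by score so that position in the list gives the rank.
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the end of the current group of tied scores.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Ranks i+1 .. j (1-based) are shared; use their average.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Mann-Whitney U statistic of the positives, normalized to [0, 1].
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def select_best_epoch(
    val_scores: Sequence[float], test_scores: Sequence[float]
) -> Tuple[int, float, float]:
    """Return (epoch, val score, test score) at the best validation epoch."""
    best = max(range(len(val_scores)), key=lambda epoch: val_scores[epoch])
    return best, val_scores[best], test_scores[best]
```

With toy per-epoch scores `val = [0.60, 0.80, 0.70]` and `test = [0.55, 0.65, 0.68]`, `select_best_epoch(val, test)` picks epoch 1 and reports the test score from that epoch, even though a later epoch has a higher test score. A large gap between the two numbers at the selected epoch is consistent with the overfitting discussed in this thread.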