Closed peihaowang closed 3 years ago
Could you provide the detailed logs and training scripts used in all your experiments? I also noticed several places in your description that may lead to different reproduction results. I list them as bullets here and recommend double-checking each one in your reproduction:
Thanks to your reply, we managed to reproduce your results on the ZINC dataset, but unfortunately not on PCBA or MolHiv. I'm wondering if the checkpoint of the pretrained model could be provided, since the pretrained weights might be the key factor hindering our reproduction. (We strictly followed the hyperparameters and training recipe provided in your paper and reproduction process.) BTW, I would also like to hear the authors' view on whether Graphormer can be applied directly to datasets from other domains, such as social networks, or whether it only applies to molecular data.
We're willing to help if you strictly follow all instructions and still fail to reproduce the results. Please provide the detailed logs, training scripts, and Python environments used in your experiments for PCBA and MolHiv. Please make sure that you have resolved all the potential problems listed in my previous comment.
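For anyone gathering the Python environment details requested above, here is a minimal sketch using only the standard library. The helper name `collect_environment` and the package list (`torch`, `ogb`, `rdkit`) are assumptions for illustration, not something specified in this thread; adjust the list to whatever your setup actually installs.

```python
import platform
import sys
from importlib import metadata


def collect_environment(packages):
    """Return a dict with the interpreter version, OS, and package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing, so the output is complete.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    # Hypothetical package list; replace with the packages your run depends on.
    for key, value in collect_environment(["torch", "ogb", "rdkit"]).items():
        print(f"{key}: {value}")
```

Pasting the printed output alongside the training logs makes it easier to spot version mismatches between setups.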
Closing this issue due to a long period of inactivity. Feel free to reopen it if the problem still exists.
I'm trying to reproduce the reported results on the OGB and ZINC datasets, but I have failed to match the reported performance.
I first ran the provided script hiv.sh directly to train a Graphormer on the MolHiv dataset without pre-training. The final AUC was 73.10%. Then I followed the instructions and hyper-parameter settings in the paper to do pre-training. I pre-trained on PCQM4M for 20 epochs (until the loss converged) and fine-tuned the model on MolHiv for 8 epochs (as specified in the script). The best result turned out to be 76.25%. Despite some improvement, the final AUC is not as high as reported in the paper. I also tried to reproduce the result on ZINC via the example script, but the best MAE was 0.1576, which is worse (higher) than the 0.122 reported in the paper.
I'm wondering what I might be missing that leads to this poor performance. Could you share more reproduction details? My Python environment is listed below:
I'd really appreciate it if someone could share their reproduced results and give me some suggestions.