microsoft / Graphormer

Graphormer is a general-purpose deep learning backbone for molecular modeling.
MIT License

Cannot Reproduce Result of ZINC #35

Closed: b05901024 closed this issue 2 years ago

b05901024 commented 2 years ago

Hi, I trained models on PCQM4M-LSC, ogbg-molhiv, and ZINC following the settings in the paper, and the results for PCQM4M-LSC and ogbg-molhiv match the paper. I also ran the ZINC experiment several times, but the MAE is always above 0.14 (with or without adding --intput_dropout_rate 0), whereas it should be about 0.12 according to the paper. Here is my command:

python3 entry.py --dataset_name ZINC --hidden_dim 80 --ffn_dim 80 --num_heads 8 --tot_updates 400000 --batch_size 256 --warmup_updates 40000 --precision 16 --intput_dropout_rate 0 --gradient_clip_val 5 --num_workers 8 --gpus 1 --accelerator ddp --max_epochs 10000
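
For context, the metric here is the mean absolute error (MAE) between the predicted and true graph-level targets, averaged over the evaluation split; a minimal PyTorch sketch (the function name is illustrative):

```python
import torch

def mae(pred: torch.Tensor, target: torch.Tensor) -> float:
    # Mean absolute error: average of |prediction - target| over the split.
    return (pred - target).abs().mean().item()

# Equivalent to torch.nn.functional.l1_loss(pred, target).item()
```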

zhengsx commented 2 years ago

Could you kindly provide detailed Python environment info so that we can reproduce your problem? By the way, is the 0.14 MAE reported on the validation set or the test set?
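
For completeness, PyTorch ships a standard environment collector that captures most of the relevant details in one shot; a minimal sketch:

```python
# Print Python, PyTorch, CUDA, and OS details
# (same output as `python -m torch.utils.collect_env`).
from torch.utils.collect_env import get_pretty_env_info

print(get_pretty_env_info())
```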

b05901024 commented 2 years ago

Thanks for your reply. The 0.14 MAE is on the validation set. I'm using Python 3.8.8, and below are the packages in my environment.

```
absl-py 0.14.1
aiohttp 3.7.4.post0
async-timeout 3.0.1
attrs 21.2.0
cachetools 4.2.4
certifi 2021.5.30
chardet 4.0.0
charset-normalizer 2.0.6
Cython 0.29.24
einops 0.3.2
fsspec 2021.10.0
future 0.18.2
google-auth 1.35.0
google-auth-oauthlib 0.4.6
googledrivedownloader 0.4
grpcio 1.41.0
idna 3.2
isodate 0.6.0
Jinja2 3.0.2
joblib 1.0.1
littleutils 0.2.2
Markdown 3.3.4
MarkupSafe 2.0.1
multidict 5.2.0
networkx 2.6.3
numpy 1.21.2
oauthlib 3.1.1
ogb 1.3.2
outdated 0.2.1
packaging 21.0
pandas 1.3.3
Pillow 8.3.2
pip 21.0.1
protobuf 3.18.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyDeprecate 0.3.1
pyparsing 2.4.7
python-dateutil 2.8.2
pytorch-lightning 1.4.9
pytz 2021.3
PyYAML 5.4.1
rdflib 6.0.1
rdkit-pypi 2021.3.5.1
requests 2.26.0
requests-oauthlib 1.3.0
rsa 4.7.2
scikit-learn 1.0
scipy 1.7.1
setuptools 58.0.4
six 1.16.0
tensorboard 2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
threadpoolctl 3.0.0
torch 1.9.1+cu111
torch-cluster 1.5.9
torch-geometric 2.0.1
torch-scatter 2.0.8
torch-sparse 0.6.12
torch-spline-conv 1.2.1
torchaudio 0.9.1
torchmetrics 0.5.1
torchvision 0.10.1+cu111
tqdm 4.62.3
typing-extensions 3.10.0.2
urllib3 1.26.7
Werkzeug 2.0.1
wheel 0.37.0
yacs 0.1.8
yarl 1.7.0
```

zhengsx commented 2 years ago

Would you like to evaluate your trained model on the ZINC test set (using the checkpoint with the best validation MAE)? The 0.122 MAE reported in the paper is measured on the test set, where the validation MAE is about 0.14+ (we did not report the validation MAE in the paper).
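
Since the training entry point runs on PyTorch Lightning (pytorch-lightning 1.4.9 appears in the environment above), selecting the best-validation checkpoint and evaluating it on the test split can look roughly like the sketch below; GraphormerModel, the datamodule dm, and the monitored metric name "valid_mae" are hypothetical stand-ins, not the repo's actual identifiers:

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep only the checkpoint with the lowest validation MAE.
checkpoint_cb = ModelCheckpoint(monitor="valid_mae", mode="min", save_top_k=1)
trainer = pl.Trainer(gpus=1, callbacks=[checkpoint_cb])

trainer.fit(model, datamodule=dm)  # model = GraphormerModel(...), defined elsewhere
trainer.test(ckpt_path="best")     # reports test MAE of the best-validation checkpoint
```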

b05901024 commented 2 years ago

It seems there is no problem with the reproduction then, but I would still like to know why there is a difference between the validation and test results.

zhengsx commented 2 years ago

Glad to see your successful reproduction. The gap comes from the fact that the distributions of the ZINC validation and test sets are not strictly identical.
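
To see this concretely, you can compare the target statistics of the two splits directly; a minimal sketch using the torch_geometric ZINC loader (torch-geometric 2.0.1 appears in the environment above), assuming subset=True selects the 12k-graph ZINC subset used in the benchmark:

```python
import torch
from torch_geometric.datasets import ZINC

# Compare the target (constrained solubility) statistics of the two splits.
for split in ("val", "test"):
    ds = ZINC(root="data/ZINC", subset=True, split=split)
    y = torch.cat([data.y for data in ds])
    print(f"{split}: n={len(ds)}, mean={y.mean():.4f}, std={y.std():.4f}")
```

With only about 1,000 graphs per split, some gap between validation and test MAE is expected.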

zhengsx commented 2 years ago

Closing this issue due to inactivity. Feel free to open a new one or reopen this one if you have any further questions.