Reproducing benchmark results on the SIDER dataset

icycookies commented 1 year ago

Thanks for your inspiring work. However, we've got problems when fine-tuning MolCLR on the SIDER dataset. It's reported in the paper that MolCLR achieves 68.0 in terms of ROC_AUC under Scaffold split on SIDER, but we fail to reproduce the results. Here's what we've tried:

In our implementation, the average ROC_AUC of MolCLR over 27 tasks using default settings is approximately 62.3, which is far from the reported results. I observe that in the original paper of MoleculeNet, models are evaluated using Random split instead of Scaffold split on SIDER. It seems that the comparison between RF (68.4) and MolCLR (68.0) in the benchmark table is unfair.
We conduct experiments with Random split on SIDER. Here's the results.

We've also tried several hyper-parameter combinations by changing dropout, hidden size and activation functions of MLP, which yields similar results. We hope the authors could kindly offer the experiment settings and hyper-parameters on SIDER that fully reproduces the promising results of MolCLR.

yuyangw commented 1 year ago

Thanks for your interest in our work. A suggested setting for SIDER:

lr: 0.0005
base ratio: 0.4
dropout: 0.3
batch size: 32
pooling: max

Also, tuning the hyperparameters for each task individually can improve performance. Hope this helps.

Best, Yuyang

icycookies commented 1 year ago

Thanks for your response! We've successfully reproduced the reported results on SIDER.

yuyangw / MolCLR

Reproducing benchmark results on the SIDER dataset #15