chao1224 / MoleculeSDE

A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining, ICML'23
https://chao1224.github.io/MoleculeSDE
MIT License

Question regarding molecule property scores #2

Closed: orgw closed this issue 10 months ago

orgw commented 10 months ago

Hi, thanks for the nice paper and code. Looking at your paper, I think the comparison misses a lot of SOTA work on molecular property prediction. I can see that you're comparing pretraining methods on a GIN backbone. However, if you look at a recent paper such as https://github.com/HIM-AIM/BatmanNet, the scores are much higher than those in your paper.

For instance, on BBBP the AUC-ROC is 0.946 for BatmanNet and over 0.959 for pretrained SMILES-BERT, while the numbers in your paper top out around 0.8.

Maybe I misunderstood how you compared the models; can you help me understand why there is such a huge gap between the scores? It is clear that 2D-only models perform poorly on geometric tasks such as QM9. But it's hard for me to understand how a representation that incorporates both 2D and 3D information scores lower than a 2D-only model on molecular property prediction. Maybe it is due to the pretraining dataset size (PCQM4Mv2)?

chao1224 commented 10 months ago

Hi @orgw,

Thank you for raising this question.

It's because we are comparing pretraining algorithms under a fixed backbone (GIN), while the paper you referred to changes the backbone architecture itself, so the absolute downstream scores are not directly comparable.

Further, in our paper, we highlighted that our proposed pretraining model (MoleculeSDE) is agnostic to the backbone model.
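
To make the setup concrete, here is a minimal sketch of the comparison protocol (illustrative only, not our actual evaluation code; the `Backbone` class, checkpoint path, and metric are placeholders): every row of such a table fine-tunes the same backbone with the same recipe, and only the pretraining initialization changes.

```python
# Illustrative sketch only: the backbone, checkpoint path, and metric are
# placeholders, not the repository's actual fine-tuning code.
import os

import torch
import torch.nn as nn


class Backbone(nn.Module):
    """Stand-in for the shared 2D GIN encoder used by every pretraining method."""

    def __init__(self, in_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)


def finetune_and_evaluate(pretrained_ckpt=None) -> float:
    """Fine-tune the *same* backbone on a downstream task (e.g. BBBP) and
    return the test AUC-ROC; only the initial weights differ between runs."""
    backbone = Backbone()
    if pretrained_ckpt is not None and os.path.exists(pretrained_ckpt):
        # Initialize from weights produced by a pretraining method
        # (random init serves as the no-pretraining baseline).
        backbone.load_state_dict(torch.load(pretrained_ckpt), strict=False)
    head = nn.Linear(64, 1)  # task-specific head, always trained from scratch
    model = nn.Sequential(backbone, head)
    # ... standard fine-tuning loop and AUC-ROC computation would go here ...
    _ = model
    return float("nan")  # placeholder metric


if __name__ == "__main__":
    # Backbone and fine-tuning recipe are identical across rows; the only
    # variable is which pretraining produced the initial weights.
    for name, ckpt in [
        ("random init", None),
        ("MoleculeSDE", "checkpoints/moleculesde_gin.pt"),  # hypothetical path
    ]:
        print(f"{name}: AUC-ROC = {finetune_and_evaluate(ckpt)}")
```

Because the absolute score on a downstream task depends heavily on the backbone, pretraining methods are compared by holding the backbone fixed; methods that swap in a different or larger architecture are measuring both the architecture and the pretraining at once.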

Hope this answers your question.

orgw commented 10 months ago

Thank you for the clarification!