Closed: orgw closed this issue 10 months ago.
Hi @orgw,
Thank you for raising this question.
It's because we are comparing the pretraining methods (all on the same backbone), while the scores in the paper you referred to reflect both the pretraining and the backbone representation modeling. As for the pretraining used in the paper you referred to, it is node & edge masking (which is AttrMask in our result tables).
Further, in our paper, we highlighted that our proposed pretraining model (MoleculeSDE) is agnostic to the backbone model.
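To make the "backbone-agnostic" point concrete, here is a minimal illustrative sketch (hypothetical class names, not the actual code from our repo): an AttrMask-style masking objective only assumes the backbone maps node features to node embeddings, so the backbone (GIN, SchNet, etc.) can be swapped freely.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Stand-in for any graph encoder (e.g., GIN); hypothetical toy model."""
    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, hidden_dim))

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        return self.mlp(node_feats)           # [num_nodes, hidden_dim]

class AttrMaskPretrainer(nn.Module):
    """AttrMask-style objective: mask node attributes, predict them back.
    It only uses the backbone's output embeddings, so any backbone works."""
    def __init__(self, backbone: nn.Module, hidden_dim: int, num_atom_types: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(hidden_dim, num_atom_types)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, node_feats, atom_labels, mask):
        masked = node_feats.clone()
        masked[mask] = 0.0                     # zero out masked node attributes
        emb = self.backbone(masked)            # backbone is a black box here
        logits = self.head(emb[mask])          # predict the masked atom types
        return self.loss(logits, atom_labels[mask])

# Toy usage: 8 nodes, 16-dim features, 10 atom types.
backbone = TinyBackbone(in_dim=16, hidden_dim=32)
model = AttrMaskPretrainer(backbone, hidden_dim=32, num_atom_types=10)
x = torch.randn(8, 16)
y = torch.randint(0, 10, (8,))
mask = torch.zeros(8, dtype=torch.bool)
mask[:2] = True                                # mask the first 2 nodes
loss = model(x, y, mask)
loss.backward()
```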
Hope this answers your question.
Thank you for the clarification!
Hi, thanks for the nice paper + code. Judging from your paper, I think the comparison misses a lot of SOTA papers on molecular property prediction. I can see that you're comparing pretrained methods with GIN. However, if you look at a more recent paper, for instance https://github.com/HIM-AIM/BatmanNet, the scores are much higher than those in your paper.
For instance, BBBP reaches 0.946 for BatmanNet and pretrained SMILES-BERT goes over 0.959 AUC-ROC, while your results indicate roughly 0.8 at most.
Maybe I misunderstood how you compared the models; can you help me understand why there is such a huge gap between the scores? It is clear that for geometric tasks such as QM9, 2D has low performance. But it's hard for me to understand how a representation that incorporates both 2D + 3D scores lower than a 2D-only one on molecular property prediction. Maybe it is due to the dataset size (PCQM4Mv2)?