Closed j-adamczyk closed 1 year ago
We could certainly include it, but now we have a better one called PCQM4Mv2.
In my opinion, QM9 is not very realistic and is somewhat solved (the SoTA MAE is already below chemical accuracy). It may be a bit outdated to include by now.
A few large quantum chemistry datasets from MoleculeNet are not implemented in OGB, and neither is the PDBbind dataset. What is the reason for this? I understand that QM datasets typically use specific features, but e.g. this paper uses regular features (like Morgan / ECFP / circular fingerprints) and gets good results. I think they could be added to OGB with the regular set of features. They could also use a scaffold split, as in the paper I linked above, similar to other MoleculeNet datasets.
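To make the scaffold split suggestion concrete, here is a minimal sketch of the usual deterministic variant: molecules are grouped by scaffold, and the groups are assigned largest first to train, then valid, then test. It assumes the scaffold strings are already computed (for example with RDKit's `MurckoScaffold.MurckoScaffoldSmiles`); the function name `scaffold_split` and the 80/10/10 default fractions are my own choices for illustration, not OGB's API.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split molecule indices so that no scaffold is shared between splits.

    `scaffolds` is a list with one precomputed scaffold string per molecule
    (assumption: computed elsewhere, e.g. with RDKit).
    Returns (train_idx, valid_idx, test_idx).
    """
    # Group molecule indices by their scaffold string.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first; ties are broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cutoff = frac_train * n
    valid_cutoff = (frac_train + frac_valid) * n

    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cutoff:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cutoff:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test
```

Because whole scaffold groups are kept together, the resulting split sizes only approximate the requested fractions, and the rarest scaffolds tend to end up in valid and test.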