deepchem / moleculenet

Moleculenet.ai Datasets And Splits
MIT License

New MoleculeNet Datasets #2

Open lilleswing opened 4 years ago

lilleswing commented 4 years ago

Two years have passed since the publication of MoleculeNet. Since then, many strides have been made in supervised learning for molecules. Are all of the datasets from the original paper still relevant to the challenges that projects face today? Are there new datasets that should be added to the benchmark?

rbharath commented 4 years ago

We should add the Enamine REAL compound libraries as a dataset: https://enamine.net/library-synthesis/real-compounds/real-compound-libraries

rbharath commented 4 years ago

We should consider adding the Crystallography Open Database. Following up on the discussion from https://github.com/deepchem/deepchem/issues/425

rbharath commented 4 years ago

We should consider adding the Cambridge Structural Database (https://www.ccdc.cam.ac.uk/solutions/csd-system/components/csd/).

Following up on the discussion from https://github.com/deepchem/deepchem/issues/426.

rbharath commented 4 years ago

Following up on the discussion from https://github.com/deepchem/deepchem/issues/867.

We should try to add some more binding assay data.