deepchem / moleculenet

Moleculenet.ai Datasets And Splits
MIT License

What Can We Do Better? #1

Open lilleswing opened 4 years ago

lilleswing commented 4 years ago

The MoleculeNet publication has accomplished much by standardizing problems for supervised learning over chemical structures. However, over the past couple of years we have seen some barriers to entry in using the datasets. How can we make them easier to use?

This issue can be a brainstorming page for how to make the MoleculeNet datasets more accessible to machine learning practitioners.

rbharath commented 4 years ago

Here's a couple of my observations so far:

rbharath commented 4 years ago

We should make sure that there's a stable mechanism for splitting datasets that allows for easy benchmarking. This repo has some code that improves the stability (which was an issue in the original MoleculeNet):

https://github.com/shenwanxiang/ChemBench
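
To make the idea of a stable split concrete, here is a minimal sketch of one possible approach: assigning each molecule to train/valid/test from a hash of its SMILES string, so the assignment does not depend on dataset order, random seeds, or library versions. This is not the method used by ChemBench or by the original MoleculeNet; the function name `stable_split`, the `salt` parameter, and the 80/10/10 fractions are all hypothetical choices for illustration. It also assumes the SMILES strings are already canonicalized consistently, since two different spellings of the same molecule would hash differently.

```python
import hashlib


def stable_split(smiles_list, frac_train=0.8, frac_valid=0.1, salt="demo-salt"):
    """Assign molecules to splits using a hash of each SMILES string.

    Hypothetical helper for illustration only. The split a molecule lands in
    depends only on its SMILES string and the salt, never on list order.
    """
    splits = {"train": [], "valid": [], "test": []}
    for smiles in smiles_list:
        digest = hashlib.sha256((salt + ":" + smiles).encode("utf-8")).hexdigest()
        # Map the first 8 hex digits to a number in [0, 1).
        bucket = int(digest[:8], 16) / 16**8
        if bucket < frac_train:
            splits["train"].append(smiles)
        elif bucket < frac_train + frac_valid:
            splits["valid"].append(smiles)
        else:
            splits["test"].append(smiles)
    return splits


if __name__ == "__main__":
    molecules = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "CCCC", "O=C=O"]
    result = stable_split(molecules)
    for name, members in result.items():
        print(name, members)
```

Because the assignment is per-molecule, the realized split sizes only approximate the requested fractions, especially on small datasets. A hash-based split like this is also a random-style split, not a scaffold split, so it would not address generalization across chemical series; it only illustrates reproducibility.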