deepchem / moleculenet

Moleculenet.ai Datasets And Splits
MIT License
Include Preprocessing Information #8

Open rbharath opened 4 years ago

rbharath commented 4 years ago

A number of the datasets in MoleculeNet are processed in some way from existing sources. We need to provide documentation and scripts demonstrating how the MoleculeNet versions were generated from the source data. The original version of MoleculeNet didn't include these details, leading to a lot of hard questions on data sourcing such as:

For this new version of MoleculeNet, we should store source dataset information and processing information for all datasets we add.
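
As a rough illustration of what storing this information could look like, here is a minimal sketch of a per-dataset provenance record. Everything in it is an assumption made for illustration: the `DatasetProvenance` class, its field names, the `sha256_hex` helper, the script path, and the example values are hypothetical and not part of MoleculeNet or DeepChem.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DatasetProvenance:
    """Hypothetical record of a dataset's source and processing history."""

    name: str
    source_description: str  # where the raw data came from
    source_sha256: str       # checksum of the raw file that was processed
    processing_script: str   # path to the script that produced the released version
    processing_steps: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize with sorted keys so the record diffs cleanly in version control.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw source bytes."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for a downloaded source file.
raw = b"smiles,label\nCCO,1\n"

record = DatasetProvenance(
    name="example_dataset",                    # hypothetical dataset name
    source_description="placeholder source",   # hypothetical, not a real citation
    source_sha256=sha256_hex(raw),
    processing_script="scripts/process_example.py",  # hypothetical path
    processing_steps=[
        "drop rows with missing labels",
        "canonicalize SMILES strings",
    ],
)

print(record.to_json())
```

A JSON file like this could sit next to each dataset, with the checksum tying the record to the exact source file that the processing script consumed.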