Open MicPie opened 1 year ago
I have many of the tabular datasets here: https://www.dropbox.com/sh/oqisd84vyt97z1i/AADxgPu_ESJBKlYpqLmtKwjya?dl=0
There is also the "solubility test set" proposed by Pat Walters in his blog post: http://practicalcheminformatics.blogspot.com/2018/09/predicting-aqueous-solubility-its.html
For a large solubility dataset, I'd look into AqSolDB.
I found ready-to-use data here: https://www.kaggle.com/code/mmelahi/physical-chemistry-lipophilicity/data. There is also data here: https://ecbd.eu/
Is anyone working on this? If not, I'd be happy to!
The “small” training data set is available in the supporting information: https://pubs.acs.org/doi/10.1021/ci034243x
I created an issue in the ESOL repo to ask for the full data: https://github.com/hossainlab/ESOL/issues/1
Worst case, we can use only the small subset.