deepchem / moleculenet

Moleculenet.ai Datasets And Splits
MIT License

Two Submissions on Clearance #20

Open mufeili opened 3 years ago

mufeili commented 3 years ago

@rbharath @miaecle This PR is for two submissions (random forest + ECFP & GCN + GC) on Clearance.

Also, it seems that the dataset is small and the labels span very different scales, e.g. 0.xx to 22. As a result, the RMSE values are pretty large. See if this is expected. @peastman

peastman commented 3 years ago

Also, it seems that the dataset is small and the labels span very different scales, e.g. 0.xx to 22. As a result, the RMSE values are pretty large.

I'm not too familiar with this dataset. That does make sense. Perhaps a different metric would be more appropriate?

mufeili commented 3 years ago

Also, it seems that the dataset is small and the labels span very different scales, e.g. 0.xx to 22. As a result, the RMSE values are pretty large.

I'm not too familiar with this dataset. That does make sense. Perhaps a different metric would be more appropriate?

What's the source of the dataset? Has anyone used it before? An alternative metric could be R2.

rbharath commented 3 years ago

Sorry for the slow response! Lost track of this PR in my inbox. It looks like we added the clearance dataset in https://github.com/deepchem/deepchem/pull/484, but for some reason the dataset isn't listed among the original 17 datasets in MoleculeNet v1. @miaecle would you happen to remember why we didn't add clearance to the moleculenet v1 datasets?

As a couple of thoughts, perhaps we should log-transform the output? We do this for some regression tasks in which the outputs span a large range; in that case, the RMSE on the logarithmic scale might be meaningful. Another option is switching to R^2. I'm pretty open to either, given that we didn't include Clearance in v1, so this won't break any existing benchmark standard.
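To make the trade-off concrete, here's a minimal sketch (the label values are made up for illustration, roughly mimicking the 0.xx-to-22 spread mentioned above) comparing plain RMSE, RMSE on the log scale, and R^2 with scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical clearance-style labels spanning a wide range (0.xx to ~22)
y_true = np.array([0.3, 0.8, 1.5, 4.0, 9.0, 22.0])
y_pred = np.array([0.5, 1.0, 1.2, 5.0, 7.5, 18.0])

# Plain RMSE: dominated by the absolute errors on the largest labels
rmse = np.sqrt(mean_squared_error(y_true, y_pred))

# RMSE after log-transforming both labels and predictions:
# penalizes relative error, so small and large labels contribute comparably
log_rmse = np.sqrt(mean_squared_error(np.log(y_true), np.log(y_pred)))

# R^2: scale-free, but sensitive to how well the extreme labels are fit
r2 = r2_score(y_true, y_pred)
```

On numbers like these, the log-scale RMSE comes out much smaller than the raw RMSE because the large-label errors no longer dominate.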

miaecle commented 3 years ago

@mufeili @rbharath Sorry, I don't quite remember why/whether it was included in the initial version of the benchmark. As for metrics, I agree with Bharath on log-transforming. Depending on what the label distribution looks like, R2 could also suffer from outliers (assuming those with label ~22 are quite rare).
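The outlier concern is easy to demonstrate with synthetic data: a model that predicts the typical small labels poorly but happens to track one rare large label can still score a high R^2, because that single point dominates the total variance. A minimal sketch (all values are made up):

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# 50 typical small labels, with predictions that are quite noisy
y_small = rng.uniform(0.1, 2.0, size=50)
pred_small = y_small + rng.normal(0.0, 0.8, size=50)

# R^2 on the typical labels alone: the noise swamps the label variance
r2_typical = r2_score(y_small, pred_small)

# Append a single rare large label that the model happens to fit well
y_all = np.append(y_small, 22.0)
pred_all = np.append(pred_small, 21.0)

# R^2 is now dominated by that one point and looks much better
r2_with_outlier = r2_score(y_all, pred_all)
```

So if the ~22 labels are rare, a high R^2 on the full dataset can mask poor predictions on the bulk of the compounds; stratified error reporting or the log-scale RMSE above would be more robust.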