Open chevrm opened 5 years ago
It is probably worth mentioning dropout (discussed here) and weight decay, both of which are extremely effective at limiting overfitting.
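As a minimal, framework-agnostic illustration of the two regularizers, inverted dropout and L2 weight decay can each be sketched in a few lines of NumPy. The function names and hyperparameter values here are illustrative only, not from any of the papers discussed:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: zero each unit with probability p during training,
    scale survivors by 1/(1-p) so the expected activation is unchanged."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def sgd_step(w, grad, lr=0.1, wd=1e-2):
    """Plain SGD update with L2 weight decay: each step also shrinks
    the weights toward zero by lr * wd * w."""
    return w - lr * (grad + wd * w)

h = np.ones((4, 8))               # a batch of hidden activations
h_train = dropout(h, p=0.5)       # entries are 0.0 or 2.0
h_eval = dropout(h, train=False)  # identity at inference time
```

Both shrink the model's effective capacity, which is why they help most when training sets are small relative to parameter count.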
Just throwing in some references ... a general one on how easily models pick up spurious correlations (or rather, systematic noise) is
(There were some other ones similar to this that I can't recall off the top of my head.)
Regarding the correlation between training set size and performance in a bio application:
although they found that this was equally true for traditional ML. So maybe we need to add a general DL paper highlighting the need for caution when deciding between traditional ML and DL when training sets are small.
A bio-related one, in which they found that naive Bayes performed better on noisy datasets:
There is a lot to like in that Mayr et al. 2018 paper. They were very thorough in exploring different models, hyperparameters, and chemical featurizations. However, they only considered ROC as an evaluation metric, and most of the ChEMBL targets are highly class-imbalanced.
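To make the class-imbalance concern concrete: under a heavy imbalance, ROC AUC can look respectable while precision at any usable threshold is poor, because ROC is insensitive to the base rate. A small NumPy sketch with synthetic scores (the numbers and threshold are illustrative, not from the paper):

```python
import numpy as np

def roc_auc(y_true, scores):
    """ROC AUC as the Mann-Whitney statistic: the probability that a
    randomly chosen positive outscores a randomly chosen negative."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def precision_at(y_true, scores, t):
    """Fraction of predictions scoring at least t that are true positives."""
    pred = scores >= t
    return y_true[pred].mean() if pred.any() else 0.0

# Synthetic, heavily imbalanced screen: 990 inactives, 10 actives.
y = np.array([0] * 990 + [1] * 10)
s = np.concatenate([np.linspace(0.0, 1.0, 990),  # inactive scores
                    np.linspace(0.5, 1.0, 10)])  # actives score higher on average

auc = roc_auc(y, s)             # decent-looking ranking (~0.75)
prec = precision_at(y, s, 0.5)  # yet ~98% of flagged compounds are inactive
```

Precision-recall curves (or average precision) would surface this failure mode directly, which is why reporting them alongside ROC matters for these targets.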
Have you checked the list of proposed rules to see if the rule has already been proposed?
Feel free to elaborate, rant, and/or ramble.
Any citations for the rule? (peer-reviewed literature preferred but not required)