Following up on #76, since early stopping didn't really seem to work as a form of regularization, we wanted to explore some different ways to regularize neural networks that we could use to study generalization.
We tried varying the hidden layer size, the amount of weight decay, and the amount of dropout. With a large amount of regularization, the curves mostly look similar to the LASSO results, but with a small amount of regularization the neural networks don't overfit in the way the LASSO models did. Here are the curves for KRAS (we also looked at EGFR and the results are similar):
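For reference, here's a minimal sketch (not our actual training code) of the three regularization knobs we varied, written against PyTorch: hidden layer size, dropout probability in the network, and weight decay passed to the optimizer. The layer sizes, sweep ranges, and feature count below are placeholders, not the values used in the experiments.

```python
import torch
import torch.nn as nn

def make_mlp(n_features, hidden_size=512, dropout=0.5):
    """Two-layer MLP for binary mutation status prediction (illustrative)."""
    return nn.Sequential(
        nn.Linear(n_features, hidden_size),
        nn.ReLU(),
        nn.Dropout(p=dropout),  # dropout strength is one regularization axis
        nn.Linear(hidden_size, 1),
    )

# Example sweep over the three regularization axes; ranges are placeholders.
for hidden_size in (8, 64, 512):
    for weight_decay in (0.0, 1e-4, 1e-2):
        for dropout in (0.0, 0.25, 0.5):
            model = make_mlp(n_features=5000,
                             hidden_size=hidden_size,
                             dropout=dropout)
            # weight decay (an L2 penalty) is applied through the optimizer
            optimizer = torch.optim.Adam(
                model.parameters(), lr=1e-3, weight_decay=weight_decay
            )
            # ... train and record train/test performance for each setting ...
```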