Add elastic net to the toolset

Banana1530 commented 6 years ago

The elastic net [1] eases the problems caused by high correlation in the feature space. However, its penalty is not positively homogeneous of order 1 (PH1) and thus violates the PH1 requirement.

[1] H. Zou and T. Hastie (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301-320. https://web.stanford.edu/~hastie/Papers/B67.2%20(2005)%20301-320%20Zou%20&%20Hastie.pdf

Banana1530 commented 5 years ago

$L(\lambda_1, \lambda_2, \boldsymbol{\beta}) = \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda_2 \|\boldsymbol{\beta}\|_2^2 + \lambda_1 \|\boldsymbol{\beta}\|_1$

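The penalty is not homogeneous of order 1: the $\lambda_1$ term scales linearly in $\boldsymbol{\beta}$, but the $\lambda_2$ term scales quadratically.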
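A quick numerical check of this, as a sketch only: the helper name `en_penalty` and the test values are made up for illustration and are not part of MoMA. Scaling $\boldsymbol{\beta}$ by $c > 0$ scales the $\ell_1$ term by $c$ but the squared $\ell_2$ term by $c^2$, so the two sides below should differ whenever $\lambda_2 > 0$.

```python
import numpy as np

def en_penalty(beta, lam1, lam2):
    """Elastic net penalty: lam2 * ||beta||_2^2 + lam1 * ||beta||_1 (hypothetical helper)."""
    beta = np.asarray(beta, dtype=float)
    return lam2 * np.sum(beta ** 2) + lam1 * np.sum(np.abs(beta))

beta = np.array([1.0, -2.0, 0.5])  # arbitrary illustrative vector
c = 3.0                            # arbitrary positive scaling factor

# PH1 would require P(c * beta) == c * P(beta) for every c > 0.
lhs = en_penalty(c * beta, lam1=1.0, lam2=0.5)
rhs = c * en_penalty(beta, lam1=1.0, lam2=0.5)
print(lhs, rhs)  # the two values differ, so PH1 fails

# With lam2 = 0 the penalty reduces to the lasso, which is PH1.
lasso_lhs = en_penalty(c * beta, lam1=1.0, lam2=0.0)
lasso_rhs = c * en_penalty(beta, lam1=1.0, lam2=0.0)
print(np.isclose(lasso_lhs, lasso_rhs))
```

Setting `lam2=0.0` recovers the lasso penalty, for which the two sides agree.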

michaelweylandt commented 5 years ago

I think we can actually get by without homogeneity of order 1: e.g., the SCAD and MCP penalties are not PH1, but they are still useful.

If you look at the proofs closely (there's a new version of the arXiv paper), PH1 is only needed to show that the prox + project step actually implements the prox of the sum of two functions, so that we can analyze convergence as a prox gradient method. I'm pretty sure we could consider a projected proximal gradient method (i.e., prox gradient for problems with constraints) instead, but a quick search doesn't pull anything up.
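For reference, the prox of the EN penalty itself has a closed form: soft-threshold, then shrink. The sketch below assumes the penalty is written as $\lambda_1 \|\boldsymbol{\beta}\|_1 + \lambda_2 \|\boldsymbol{\beta}\|_2^2$ (no factor of 1/2 on the ridge term); the function names and the `step` argument are hypothetical, not MoMA's API, and this covers only the prox, not the projection step.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the prox of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_elastic_net(v, step, lam1, lam2):
    """Prox of step * (lam1 * ||x||_1 + lam2 * ||x||_2^2), evaluated at v.

    Solves argmin_x 0.5 * ||x - v||^2 + step * lam1 * ||x||_1 + step * lam2 * ||x||^2.
    The stationarity condition gives soft-thresholding followed by a
    multiplicative shrinkage of 1 / (1 + 2 * step * lam2).
    """
    v = np.asarray(v, dtype=float)
    return soft_threshold(v, step * lam1) / (1.0 + 2.0 * step * lam2)

# Illustrative input: only the first entry survives the threshold.
out = prox_elastic_net(np.array([3.0, -0.5, 1.0]), step=1.0, lam1=1.0, lam2=0.5)
print(out)
```

With `lam2 = 0` this reduces to plain soft-thresholding, i.e. the lasso prox.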

If we do implement EN, any thoughts on how to deal with its two tuning parameters ($\lambda_1$ and $\lambda_2$)?

DataSlingers / MoMA

Add elastic net to the toolset #16