Closed Banana1530 closed 5 years ago
I think we can actually get by without positive homogeneity of order 1 (PH1): e.g., the SCAD and MCP penalties are not PH1, but they are still useful.
If you look at the proofs closely (there's a new version of the arXiv paper), PH1 is only needed to show that the prox + project step actually implements the prox of the sum of two functions, which is what lets us analyze convergence as a prox gradient method. I'm pretty sure we could instead consider a projected proximal gradient method (i.e., prox gradient for problems with constraints), but a quick search doesn't pull anything up.
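To make the prox + project step concrete, here is a minimal NumPy sketch. It assumes an L1 penalty (which is PH1) and the unit L2 ball as the constraint set; neither choice is spelled out in this thread, and all function names are made up for illustration. It soft-thresholds, projects, and then spot-checks that the result is no worse than a batch of random feasible points on the combined objective, which is what you'd expect if the composition really is the prox of the sum.

```python
import numpy as np

def soft_threshold(x, lam):
    """Prox of lam * ||.||_1: elementwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto {u : ||u||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def prox_then_project(x, lam):
    """Hypothetical helper: prox of the penalty, then projection."""
    return project_l2_ball(soft_threshold(x, lam))

def objective(u, x, lam):
    """0.5 * ||u - x||^2 + lam * ||u||_1; the ball constraint is handled separately."""
    return 0.5 * np.sum((u - x) ** 2) + lam * np.sum(np.abs(u))

rng = np.random.default_rng(0)
x = np.array([2.0, -0.5, 0.1])
lam = 0.3
u_hat = prox_then_project(x, lam)

# Random points, pulled back into the unit ball so they are all feasible.
candidates = rng.normal(size=(10_000, 3))
candidates /= np.maximum(1.0, np.linalg.norm(candidates, axis=1, keepdims=True))
best_random = min(objective(u, x, lam) for u in candidates)

print("prox + project result:", u_hat)
print("beats every random feasible point:", objective(u_hat, x, lam) <= best_random)
```

This is only a sanity check at one input, not a proof; swapping in a non-PH1 penalty is where the same check could be used to look for counterexamples.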
If we do implement the elastic net (EN), any thoughts on how to deal with its two tuning parameters?
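One possible way to handle the two parameters, purely as a sketch: borrow the glmnet-style `(lam, alpha)` parametrization, where `lam` sets the overall strength and `alpha` in [0, 1] mixes the L1 and squared-L2 parts. This convention is an assumption, not something decided in this thread, and `prox_elastic_net` is a hypothetical name. Under that convention the prox has a closed form: soft-threshold, then shrink.

```python
import numpy as np

def elastic_net_penalty(u, lam, alpha):
    """lam * (alpha * ||u||_1 + (1 - alpha) / 2 * ||u||_2^2), an assumed parametrization."""
    return lam * (alpha * np.sum(np.abs(u)) + 0.5 * (1.0 - alpha) * np.sum(u ** 2))

def prox_elastic_net(x, lam, alpha):
    """Prox of the penalty above: soft-threshold by lam * alpha, then rescale."""
    if lam < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("need lam >= 0 and 0 <= alpha <= 1")
    soft = np.sign(x) * np.maximum(np.abs(x) - lam * alpha, 0.0)
    return soft / (1.0 + lam * (1.0 - alpha))

x = np.array([2.0, -0.5, 0.1])
print("alpha = 1 (pure L1):   ", prox_elastic_net(x, lam=0.5, alpha=1.0))
print("alpha = 0 (pure ridge):", prox_elastic_net(x, lam=0.5, alpha=0.0))
print("alpha = 0.5 (mixed):   ", prox_elastic_net(x, lam=0.5, alpha=0.5))
```

With this split, `alpha = 1` recovers plain soft-thresholding, so a single-parameter interface could stay as it is and `alpha` could be an optional second argument.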
The elastic net [1] eases the problems caused by high correlation in the feature space. However, its penalty is not positively homogeneous of order 1 and thus violates the requirement.
[1] Zou & Hastie (2005), "Regularization and variable selection via the elastic net", J. R. Statist. Soc. B 67(2), 301-320. https://web.stanford.edu/~hastie/Papers/B67.2%20(2005)%20301-320%20Zou%20&%20Hastie.pdf
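A quick numerical illustration of the homogeneity point, as a sketch: a penalty P is positively homogeneous of order 1 if P(c·x) = c·P(x) for every c > 0. The L1 norm passes this check; the elastic net penalty, written here with an assumed `(lam, alpha)` mixing parametrization, fails it because of the squared term. The helper names are hypothetical.

```python
import numpy as np

def l1_penalty(x):
    return np.sum(np.abs(x))

def elastic_net_penalty(x, lam=1.0, alpha=0.5):
    """Assumed parametrization: lam * (alpha * ||x||_1 + (1 - alpha) / 2 * ||x||_2^2)."""
    return lam * (alpha * np.sum(np.abs(x)) + 0.5 * (1.0 - alpha) * np.sum(x ** 2))

def is_ph1(penalty, x, scales=(0.5, 2.0, 10.0)):
    """Check P(c * x) == c * P(x) at one point for a few c > 0.

    A failure disproves PH1; a pass at a single point does not prove it.
    """
    return all(np.isclose(penalty(c * x), c * penalty(x)) for c in scales)

x = np.array([1.0, -2.0, 0.5])
print("L1 passes the PH1 check:         ", is_ph1(l1_penalty, x))
print("Elastic net passes the PH1 check:", is_ph1(elastic_net_penalty, x))
```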