pmelchior / proxmin

Proximal optimization in pure python
MIT License

Modifying AdaProx for LASSO #23

Open · pythonometrist opened this issue 3 years ago

pythonometrist commented 3 years ago

This isn't an issue per se. I wanted to figure out whether I could use a similar approach for a simple LASSO regression in PyTorch. Working with proximal operators alongside SGD is straightforward (but SGD then has step-size issues). Adam keeps a memory of past gradients, but it isn't designed for non-differentiable convex problems (even though L1 regularization does improve results a fair bit). I wanted to see whether AdaProx improves results.
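A minimal sketch (not from this thread) of what "Adam plus an L1 prox step" could look like for LASSO in PyTorch. The synthetic data, the `lam` value, and the post-step soft-thresholding are illustrative assumptions; the actual AdaProx algorithm folds the adaptive metric into the prox itself, which this simple version does not.

```python
import torch

torch.manual_seed(0)
n, p, lam, lr = 200, 50, 0.1, 1e-2

# Synthetic sparse regression problem (assumed setup, for illustration only)
X = torch.randn(n, p)
w_true = torch.zeros(p)
w_true[:5] = torch.randn(5)
y = X @ w_true + 0.01 * torch.randn(n)

w = torch.zeros(p, requires_grad=True)
opt = torch.optim.Adam([w], lr=lr)

for _ in range(2000):
    opt.zero_grad()
    loss = 0.5 * torch.mean((X @ w - y) ** 2)  # smooth part only
    loss.backward()
    opt.step()
    with torch.no_grad():
        # Elementwise soft-thresholding: prox of lam * ||w||_1 with step lr
        w.copy_(torch.sign(w) * torch.clamp(w.abs() - lr * lam, min=0.0))
```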

pmelchior commented 3 years ago

Do you have an elementwise l1 penalty? If so, you can use operators.prox_soft. adaprox should work then.
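For reference, the elementwise L1 prox is just soft-thresholding, which is the operation `operators.prox_soft` performs; the standalone function below is a sketch of that math only, and its argument names are assumptions rather than proxmin's actual signature.

```python
import numpy as np

def soft_threshold(x, step, thresh):
    """Prox of thresh * ||x||_1 with step size `step`: shrink entries toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - step * thresh, 0.0)

x = np.array([-1.5, -0.2, 0.0, 0.3, 2.0])
print(soft_threshold(x, step=1.0, thresh=0.5))  # small entries are zeroed, large ones shrink by 0.5
```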

pythonometrist commented 3 years ago

Thanks, let me dig into it and get back to you. I am going to evaluate how this compares with a smooth Huber loss for linear regression.