yngvem / group-lasso

Group Lasso implementation following the scikit-learn API
MIT License

How to adjust the regularization parameters? #17

Closed normanius closed 4 years ago

normanius commented 4 years ago

Hi Yngve

This is the last issue for a while, I promise. I spent the weekend with your toolbox, but now I have to move on... :)

I was wondering whether there are any general rules of thumb for adjusting the regularization parameters.

For the following very simple problem (see the data in the left subplot), I found that LogisticGroupLasso struggles as l1_reg and group_reg are increased. The right-hand subplot shows the prediction accuracy for different values of l1_reg and group_reg, sampled on a square grid over the range [0, 0.02]. One can see that the regression already fails for very small values of the parameters.

image

Questions:

  1. Is this behavior to be expected for low-dimensional problems?
  2. Any advice on parameter tuning? Are there good strategies for setting the regularization parameters? With a grid search, one might find the stable region, but the stable region for this problem is relatively small.
  3. Since FISTA very frequently fails (just like lbfgs in sklearn's Logistic Regression, by the way!), it is difficult to differentiate the causes for poor prediction results: Is it because the optimization problem was too difficult for the choice of parameters, or is it because the training data was not providing enough information? Any idea how to distinguish the former from the latter?
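
On question 2, a grid sweep like the one behind the plot can be sketched in plain Python. This is only a minimal illustration, assuming a hypothetical `score_fn(l1_reg, group_reg)` callback that fits the model and returns a validation accuracy. The names `linspace`, `sweep`, `stable_region`, and `toy_score` are made up for this sketch and are not part of group-lasso, and `toy_score` is a placeholder rather than real results.

```python
from itertools import product


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    if n < 2:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def sweep(score_fn, l1_values, group_values):
    """Evaluate score_fn on every (l1_reg, group_reg) pair of the grid."""
    return {
        (l1, gr): score_fn(l1, gr)
        for l1, gr in product(l1_values, group_values)
    }


def stable_region(scores, threshold):
    """Return the grid points whose score reaches the threshold."""
    return sorted(p for p, s in scores.items() if s >= threshold)


# Hypothetical stand-in for "fit the estimator and return its accuracy".
# It only mimics accuracy dropping to chance once the total penalty grows.
def toy_score(l1_reg, group_reg):
    return 0.95 if l1_reg + group_reg < 0.012 else 0.5


grid = linspace(0.0, 0.02, 5)  # same range as the plot, coarser resolution
scores = sweep(toy_score, grid, grid)
good = stable_region(scores, threshold=0.9)
print(len(scores), "points evaluated,", len(good), "above threshold")
```

In practice `toy_score` would be replaced by a function that fits the estimator on a training split and scores it on held-out data. Once a stable region is found, the sweep can be repeated on a finer grid restricted to that region.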

Thanks!

normanius commented 4 years ago

(Sharing here also the code to reproduce the plots) failing_case.py.zip

yngvem commented 4 years ago

This is fixed in the latest version

normanius commented 4 years ago

Great! It works now. Thanks!

normanius commented 4 years ago

By the way, here is the same plot as above, but with l1_reg and group_reg sampled on a square grid over the range [0, 0.5] instead of [0, 0.02]. I guess it is normal that the problem destabilizes for large regularization parameters.

image
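
One plausible reason why accuracy degrades for large group penalties is that proximal methods such as FISTA shrink each coefficient group towards zero at every step. The sketch below shows the textbook group soft-thresholding operator in plain Python. It is an assumption that this matches what the library does internally, and the function name `group_soft_threshold` is illustrative only.

```python
import math


def group_soft_threshold(w, threshold):
    """Proximal operator of threshold * ||w||_2 for one coefficient group."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= threshold:
        # The penalty dominates: the whole group is set to zero.
        return [0.0] * len(w)
    scale = 1.0 - threshold / norm
    return [scale * x for x in w]


group = [0.3, -0.4]  # Euclidean norm of about 0.5
small = group_soft_threshold(group, 0.1)  # group is shrunk but survives
large = group_soft_threshold(group, 0.6)  # threshold exceeds the norm
print(small, large)
```

When the threshold exceeds the norm of every group, all coefficients end up at zero and the model can only predict a constant class, which would show up as chance-level accuracy in a plot like the one above.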