RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

L1 vs L2 objective minimization using SNOPT? #1157

Closed: aespielberg closed this issue 9 years ago

aespielberg commented 9 years ago

Hi all,

I have a simple Acrobot plant model and have been performing trajectory optimization on it, comparing the effect of an L1 vs. an L2 objective. My objective is a function of the model parameters, which I've made decision variables in my optimization. I want to penalize varying the parameters too much, and to that end I'm comparing L1 and L2 regularization.

Right now I am dealing with only one cost function, a final cost, which looks like this:

function [f, df] = parameter_regularizer(obj, T, x, params)
    % L1 penalty on the deviation of the parameter decision variables
    % from the robot's nominal parameter values.
    param_values = obj.robot.getParams();
    w = obj.regularizer_weight;
    f = w * norm(double(param_values) - params, 1);
    % Gradient w.r.t. [T, x', params']; only the params block is nonzero.
    df = [0, zeros(1, length(x)), w * sign(params - double(param_values))'];
end
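
For comparison, the squared-L2 version I'm testing against looks roughly like this (the function name here is just for illustration; same interface, just a smooth penalty):

function [f, df] = parameter_regularizer_l2(obj, T, x, params)
    % Squared-L2 counterpart: smooth everywhere, with nonzero curvature,
    % so SQP-type solvers like SNOPT have much more to work with.
    param_values = obj.robot.getParams();
    w = obj.regularizer_weight;
    d = params - double(param_values);
    f = w * (d' * d);
    % Gradient w.r.t. [T, x', params']; only the params block is nonzero.
    df = [0, zeros(1, length(x)), 2 * w * d'];
end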

What I've noticed is that L1 minimization takes MUCH longer than L2 minimization (~300 seconds vs. ~6 seconds), and it does not fully satisfy the constraints. Even though one of my coordinates was constrained to 0 <= constraint_val <= 0, I found that constraint_val came out to something like 0.008, where the standard slack is usually much, much smaller. I also watched the constrained value jump around wildly, from nearly satisfied to suddenly something like 42. I also noticed the following printout in my run:

Warning:   41 current point cannot be improved

> In NonlinearProgram/mapSolverInfo (line 1544)
  In NonlinearProgram/snopt (line 1420)
  In NonlinearProgram/solve (line 973)
  In DirectTrajectoryOptimization/solveTraj (line 188)
  In KinematicDirtranTest/solveTraj (line 90)
  In AcrobotPlantTest/swingUpTrajectoryKinematic (line 212)
  In runSwingUpKinematicTest (line 6) 

I know L1 minimization problems are harder than L2 minimization problems: the objective is nonsmooth, and its second derivative is zero almost everywhere, so curvature-based methods have little information to work with. Still, I honestly didn't expect that adding this objective, which touches only 6 dimensions, would balloon the compute time so much, or that it would differ so sharply from an L2 objective or no objective at all, both of which solve in nearly trivial time (~6s).

Long story short: do SNOPT and the Drake interface to it have ways of improving optimization time?

a) Are there tools tailored to L1 minimization problems in particular, or known settings that work well for them?

b) Where can I find documentation on the options I can control in SNOPT optimization calls?

c) Precisely what algorithm does SNOPT use? I've noticed some stochasticity between runs; where does this nondeterminism come from? And is there a way to select between different optimization algorithms (e.g. gradient descent, trust-region, etc.)?

d) Does this result make sense, i.e. that the trajectory optimization would have so much trouble with this objective?

-Andy S.

RussTedrake commented 9 years ago

The code in NonlinearProgram/snopt should be very penetrable. You can see the options there. And the online snopt documentation is easy to find and helpful.

If you see info>13 coming out of snopt, it probably means something is wrong with your problem formulation. For instance, you might be providing derivatives that don't match your function. I would look hard for a bug like that first.
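
A quick way to check that is a central-difference comparison. A minimal sketch in plain MATLAB, where cost_fun is a hypothetical handle wrapping your cost as a function of the stacked decision vector:

% Hypothetical sketch: cost_fun is assumed to return [f, df] for a
% stacked decision vector z, mirroring the Drake cost interface.
n = 10;                       % dimension of z (example value)
z0 = randn(n, 1);             % a random test point
[~, df] = cost_fun(z0);       % analytic gradient (1 x n)
h = 1e-6;                     % central-difference step
df_fd = zeros(1, n);
for i = 1:n
    e = zeros(n, 1); e(i) = h;
    df_fd(i) = (cost_fun(z0 + e) - cost_fun(z0 - e)) / (2 * h);
end
fprintf('max gradient mismatch: %g\n', max(abs(df - df_fd)));

A sign error or a misplaced block in df shows up immediately as a large mismatch.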


hongkai-dai commented 9 years ago

I would expect SNOPT to behave poorly for this L1 penalization problem. We found that SNOPT can be profoundly unhappy when the problem is not smooth. Unfortunately, when you use L1 penalization, the non-smoothness happens right at the optimal solution.

The stochasticity comes from state left over from the previous run. If you do a megaclear or clear classes after every SNOPT call, I believe the stochasticity should go away.

I found a paper, Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches, on using sequential quadratic programming for L1-penalized problems. You could try the sub-gradient method.

Also, there is a trick to reformulate the L1 penalization problem as a smooth nonlinear optimization problem, by examining the optimality conditions of the L1 penalization problem: it can be rewritten using complementarity constraints. Here https://dl.dropboxusercontent.com/u/21237593/formulation.pdf is a quick introduction to this approach that I wrote a few years back. @mposa, do you think this is a reasonable approach? It requires reformulating the problem and a fair amount of extra coding, but it gives you a problem with a smooth gradient in the end. I would recommend trying the sub-gradient method first.
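
For reference, the textbook slack-variable version of the same idea is even simpler than the complementarity one: introduce a vector t of extra decision variables and solve

    minimize    w * sum(t)
    subject to  -t <= double(param_values) - params <= t

At the optimum each t_i is pushed down to |param_values_i - params_i|, so the objective value is unchanged, and every function involved is linear and smooth, at the cost of one extra variable per penalized parameter.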

jrebula commented 9 years ago

I've run into this issue (albeit with a different application and solver), and the solution presented in section 2.2 of that paper worked for me: replace |x| with sqrt(x*x + epsilon) for some small epsilon. It just smooths out the kink in abs (the minimum stays at the same x), and it looks like it would be trivial to change your code to implement it.
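
Applied to your cost it would look something like this (the function name and the epsilon value are just starting points; tune epsilon for your problem):

function [f, df] = parameter_regularizer_smoothed(obj, T, x, params)
    % Smoothed L1: |d| is replaced by sqrt(d^2 + epsilon), which is
    % differentiable everywhere and approaches |d| as epsilon -> 0.
    epsilon = 1e-6;
    param_values = obj.robot.getParams();
    w = obj.regularizer_weight;
    d = params - double(param_values);
    s = sqrt(d.^2 + epsilon);
    f = w * sum(s);
    % d/dd_i of sqrt(d_i^2 + epsilon) is d_i / s_i, a smooth
    % stand-in for sign(d_i) in the original gradient.
    df = [0, zeros(1, length(x)), w * (d ./ s)'];
end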


RussTedrake commented 9 years ago

thanks all for the advice. will close this to take it off the stack. @aespielberg -- please let us know how it works

aespielberg commented 9 years ago

Thanks. I actually gave the sqrt(x*x' + eps) trick (which is an excellent trick I'd never heard of before) a try late last night (sorry, I forgot to post yesterday evening), and it converged lightning fast. If it stops working for other methods, I'll try the other techniques and let you all know how they work out.

-Andy S.

