jgeisler0303 / DDP-Generator

Generate tailored code for Differential Dynamic Programming (DDP), aka Iterative Linear Quadratic Gaussian (iLQG), solvers for finite-time Optimal Control Problems (OCP)
GNU General Public License v2.0

Initialization of nominal trajectories #3

Open miquelramirez opened 5 years ago

miquelramirez commented 5 years ago

Hello @jgeisler0303 ,

thanks very much for publishing this code, it has been very useful to get a handle on DDP. I have been working for some time on a C++ implementation of DDP which also relies on symbolic differentiation, and your code solved a couple questions I had, confirming some of the guesses I had made on the initialization of some parameters.

I am having some trouble reproducing the experimental results from Tassa's ICRA-14 paper, though. Is your code capable of generating the same trajectory reported in the paper? Looking through the initialization parameters of the example, I see that there is an initial nominal control trajectory there. Where did it come from? Was Tassa also initializing trajectories with something better than u=0?

Many thanks,

Miquel.

miquelramirez commented 5 years ago

Hello again,

on a related note, I am wondering about the arcsin(sin(x)y) term in the car's yaw differential constraint. Won't the Jacobian of the dynamics contain imaginary numbers when evaluated at plant states where y >= 1? How does Maxima handle that?
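To make the question concrete, here is a minimal sketch in plain Python (not taken from the repository, and not how Maxima evaluates anything) of the analytic partial derivative of arcsin(sin(x) y) with respect to x, which by the chain rule is y cos(x) / sqrt(1 - (sin(x) y)^2). The function name `dyaw_dx` is hypothetical, and `cmath.sqrt` is used on purpose so that a negative radicand shows up as a complex value instead of raising an error. Since |sin(x)| <= 1, the radicand can only turn negative when |y| > 1.

```python
import cmath
import math


def dyaw_dx(x: float, y: float) -> complex:
    """Partial derivative of asin(sin(x) * y) with respect to x.

    By the chain rule this is y*cos(x) / sqrt(1 - (sin(x)*y)**2).
    cmath.sqrt is used so a negative radicand yields a complex result.
    """
    radicand = 1.0 - (math.sin(x) * y) ** 2
    return y * math.cos(x) / cmath.sqrt(radicand)


# Inside the domain (|sin(x) * y| < 1) the derivative is real.
inside = dyaw_dx(0.5, 0.5)

# Outside the domain (|sin(x) * y| > 1) the derivative is imaginary.
outside = dyaw_dx(1.0, 2.0)

print(inside, outside)
```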

Cheers,

Miquel

miquelramirez commented 5 years ago

Okay, after studying your code I figured out that Tassa had uploaded his code to MathWorks (the link on his former home page at the University of Washington was broken). I spotted my issue quite quickly: the sign of the expected cost used during the line search to accept the trajectory was flipped with respect to what appears in his thesis and their IROS-12 paper.
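For anyone hitting the same issue, here is a minimal sketch of the acceptance test as I understand it; it is not a copy of Tassa's code. It assumes the backward pass accumulates dV1 = sum of k'Q_u and dV2 = sum of 0.5 k'Q_uu k, so that the expected cost reduction for a step size alpha is -alpha (dV1 + alpha dV2), which is positive for a descent step. The names `expected_reduction`, `accept_step`, `dV1`, `dV2` and `z_min` are my own labels, and the scalar toy problem is only there to exercise the sign.

```python
def expected_reduction(alpha: float, dV1: float, dV2: float) -> float:
    """Expected cost reduction for step size alpha (positive for descent)."""
    return -alpha * (dV1 + alpha * dV2)


def accept_step(cost_old, cost_new, alpha, dV1, dV2, z_min=0.0):
    """Accept the candidate trajectory if actual/expected reduction > z_min."""
    expected = expected_reduction(alpha, dV1, dV2)
    actual = cost_old - cost_new
    if expected <= 0.0:
        # Should not happen after a valid backward pass; fall back to the
        # sign of the actual reduction.
        return actual > 0.0
    return actual / expected > z_min


# Scalar toy problem: cost(du) = Qu*du + 0.5*Quu*du**2 with Qu=1, Quu=2.
Qu, Quu = 1.0, 2.0
k = -Qu / Quu                          # feedforward term
dV1 = k * Qu                           # -0.5
dV2 = 0.5 * k * Quu * k                # 0.25
cost_old = 0.0
cost_new = Qu * k + 0.5 * Quu * k * k  # -0.25, a genuine improvement

ok = accept_step(cost_old, cost_new, 1.0, dV1, dV2)

# With the sign of the expected reduction flipped, the ratio turns negative
# and the same improving step would fail the z > z_min test.
flipped_ratio = (cost_old - cost_new) / -expected_reduction(1.0, dV1, dV2)
```

On this quadratic toy problem the actual and expected reductions coincide, so the ratio is 1 with the correct sign and -1 with the flipped one.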

I also see that his Matlab code initializes the first nominal trajectory with random controls. That is interesting. Are the values for the initial nominal trajectory in your examples folder also random noise?

Thanks very much for publishing this code; the C++ version you have is very nice and clear. And you have already been immensely helpful.

jgeisler0303 commented 5 years ago

The initial trajectory in my example is also random. It is quite interesting to note that Tassa's problem is so highly nonlinear that very small changes to the algorithm, e.g. a different calculation of the derivatives, can totally change the shape of the solution, though all of these solutions are valid and their costs differ very little.
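For reference, a seeded random initialization of this kind might look like the sketch below. The horizon, the number of controls and the 0.1 scale are illustrative assumptions, not the values used in the examples folder, and `random_controls` is a hypothetical helper rather than a function from this repository.

```python
import random


def random_controls(horizon, n_controls, scale=0.1, seed=0):
    """Return horizon x n_controls Gaussian noise controls, reproducibly."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated exactly
    return [[scale * rng.gauss(0.0, 1.0) for _ in range(n_controls)]
            for _ in range(horizon)]


u_init = random_controls(horizon=500, n_controls=2, seed=42)
u_same = random_controls(horizon=500, n_controls=2, seed=42)
u_other = random_controls(horizon=500, n_controls=2, seed=43)
```

Fixing the seed helps separate sensitivity to the initial guess from differences caused by the algorithm itself.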

I actually don't remember why there is a constraint involving arcsin(sin(x)y).

I'm happy my code was of help to you. I actually have a local branch where I started to rewrite the code to be more C++-like and to add state constraints via the augmented Lagrangian method. But I currently don't have the time to continue development.

If you want to continue work with DDP in C I would suggest you have a look at https://github.com/casadi/casadi for the derivatives. I put a lot of work into my symbolic code but now I have a feeling that CasADi can perform much better.

miquelramirez commented 5 years ago

Hi Jens,

thanks very much for your answer, much appreciated.

> The initial trajectory in my example is also random.

Thanks for confirming that - I plotted the numbers and they looked random to me too.

> It is quite interesting to note that Tassa's problem is so highly nonlinear that very small changes to the algorithm, e.g. a different calculation of the derivatives, can totally change the shape of the solution, though all of these solutions are valid and their costs differ very little.

That is a good point, and I can totally see how the initial nominal trajectory can have a "founder effect" on what you get out of the optimization procedure. Being aware of that, I would still expect the trajectory to at least reach the vicinity of the target location when using the same horizon and discretization step. I was getting very wild behaviours... sometimes. I guess this illustrates that in numerical optimization of this kind, there's a very fine line between trajectories generated by buggy code, ill-conditioned matrices and the real deal.

As a more general comment, it has become too common in some fields to report only the "best runs" of an algorithm, giving a false impression of robustness and precision... That is not a good development in my opinion.

> I actually don't remember why there is a constraint involving arcsin(sin(x)y).

It is a way to get rid of the tan(x) term that usually appears in the differential equations of systems with nonholonomic constraints (such as cars and airplanes, whose turn radius is given by the tan() of the input multiplied by a function of velocity). arcsin(sin(x)) does a beautiful job of regulating the yaw angle theta without the discontinuities.

The problem with using arcsin(sin(x)) is that its derivatives can be problematic. I wrote to Tassa a couple of days ago asking him about that; I hope I didn't come across as an *sshole. We Spaniards can sound too direct and confrontational sometimes :)

You can make the problem go away by replacing arcsin(sin(x)) with sin(sin(x)). This considerably dampens the turning rate, though, so the car turns much less efficiently.
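A quick numerical check of both points, as a sketch in plain Python. Here u stands for the inner argument that is passed to arcsin, and the function names are hypothetical. The derivative of arcsin(u) is 1/sqrt(1 - u^2), which grows without bound as |u| approaches 1, while the derivative of sin(u) is cos(u), which never exceeds 1 in magnitude. The price is that sin(u) is smaller than arcsin(u) for 0 < u <= 1, so the yaw change per step shrinks.

```python
import math


def yaw_step_asin(u: float) -> float:
    """Yaw increment with the original arcsin, defined only for |u| <= 1."""
    return math.asin(u)


def yaw_step_sin(u: float) -> float:
    """Yaw increment with the smooth replacement, defined for all u."""
    return math.sin(u)


def slope_asin(u: float) -> float:
    """d/du asin(u) = 1/sqrt(1 - u^2); unbounded as |u| -> 1."""
    return 1.0 / math.sqrt(1.0 - u * u)


def slope_sin(u: float) -> float:
    """d/du sin(u) = cos(u); always within [-1, 1]."""
    return math.cos(u)


for u in (0.1, 0.5, 0.9, 0.999):
    print(f"u={u:5.3f}  asin={yaw_step_asin(u):.4f}  sin={yaw_step_sin(u):.4f}  "
          f"asin'={slope_asin(u):8.3f}  sin'={slope_sin(u):.3f}")
```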

> I'm happy my code was of help to you. I actually have a local branch where I started to rewrite the code to be more C++-like and to add state constraints via the augmented Lagrangian method. But I currently don't have the time to continue development.

Me too, actually. I incorporated your implementation of the QP algorithm and I am comparing it with the off-the-shelf solution I was using, qpOASES. At the moment I am observing some significant differences in the trajectories... but it would seem that your implementation of Tassa's code is more robust with respect to parameter settings.

> If you want to continue work with DDP in C I would suggest you have a look at https://github.com/casadi/casadi for the derivatives. I put a lot of work into my symbolic code but now I have a feeling that CasADi can perform much better.

I am myself using SymEngine, the C++ backend of SymPy, for my symbolic manipulation and, of course, differentiation. I quite like that interface; it is very efficient and full of useful functionality. If you're curious about my code - which isn't very original, to be honest, as I was trying not to depart too much from the literature so that I could "verify" my results - I will be happy to add you as a collaborator.

CasADi looks awesome, much more aligned with the current trends you see in the continuous Deep RL community (with their automatically differentiable computation graphs, etc.). It is a very interesting project, honestly, I will take a look through it.

DDP with state constraints would actually be a very valuable contribution, Jens. My own research is in planning and optimization over hybrid systems... being able to convey constraints in a flexible and systematic way to an algorithm like DDP, that can handle arbitrary dynamics (most of the time at least) would be pretty awesome.

Last, if you don't mind, I would like to give you credit in some way: I am working on a paper that relies on DDP for the planning/trajectory optimisation. Would you mind being included in the author list (there are just three of us)? You don't need to do anything other than give me your blessing :)