loco-3d / crocoddyl

Crocoddyl is an optimal control library for robot control under contact sequence. Its solver is based on various efficient Differential Dynamic Programming (DDP)-like algorithms
BSD 3-Clause "New" or "Revised" License
845 stars 173 forks source link

terminal cost computation #962

Closed andreadelprete closed 3 years ago

andreadelprete commented 3 years ago

Hi,

I just realized that crocoddyl handles the terminal cost in an unusual way, which is making my life difficult, and is also leading to useless computations. Typically the terminal cost in an optimal control problem depends only on the terminal state. In Crocoddyl instead, the terminal cost is computed calling the calc method of an action model with zero control inputs. This means that:

This thing gets even worse when using high-order integration schemes (which I am currently playing with) because the extra useless computation associated with the terminal state are even larger, and more importantly it's basically impossible to formulate two "equivalent" problems using different integration schemes.

If my understanding of the problem is correct, I suggest we fix this by introducing a new method of ActionModel to compute the terminal cost without integrating the dynamics.

proyan commented 3 years ago

Hi Andrea, Yes, you are right. We seem to be doing additional computations, which we don't have to do at the terminal state. Thanks for noticing this. I'll add a simple update(data, x) function, that updates the data based on the current state value, and fix the solver routine and calc accordingly. I think it should solve this issue. Do you see anything else that needs to be done?

nmansard commented 3 years ago

Discussing with Andrea: there is the variable enable_integration in the integration-euler action model. Set it to false, and you will solve the issue. Yet it is maybe not clear for the user that the flag should be set to false for the terminal node. We should maybe check it and warn the user in the solver?

proyan commented 3 years ago

enable_integration would not integrate the terminal model, as it is set to false by default with dt=0.

However, the issue that Andrea raised about unnecessary calculations still stands. In the euler integrator, the differential model's calc is called before the integration step. And we only have the DAM::calc to update the internal values to calculate the cost. So we are calling aba while what we need to call is simply forwardKinematics or so.

https://github.com/loco-3d/crocoddyl/blob/master/include/crocoddyl/core/integrator/euler.hxx#L61

For terminal model, we could simply replace https://github.com/loco-3d/crocoddyl/blob/master/src/core/solvers/ddp.cpp#L335 by an update function, and remove the enable_integration from IntegratedActionModel. This would let us have dt=0 in runningModels as well.

nmansard commented 3 years ago

Could we keep the default behaviour in the nominal action model, to avoid to explicitly implement the terminal behaviour in new action classes (in particular when prototyping)?

cmastalli commented 3 years ago

For terminal model, we could simply replace https://github.com/loco-3d/crocoddyl/blob/master/src/core/solvers/ddp.cpp#L335 by an update function, and remove the enable_integration from IntegratedActionModel. This would let us have dt=0 in runningModels as well.

I don't think we need to include an extra update function. We could use calc(data, x) for that, we just need to define it as a virtual function and to implement the extra bits in each diff/action model.

cmastalli commented 3 years ago

This won't require to create an extra "terminal action" (which I also don't like the idea) or to adapt user code as the default calc(data,x) implementation will still make the job.

andreadelprete commented 3 years ago

I agree with @cmastalli that we don't need an extra update function, but could simply use calc(data,x), making it virtual in ActionModelAbstract. I am also in favor of removing the enable_integration flag. I have started fixing this problem in my local branch, but it's on top of all the changes that are in my last PR, so I could take care of fixing this if you first review and accept that PR.

cmastalli commented 3 years ago

@andreadelprete sorry for not being explicit. I also agree with removing that flag ;)

proyan commented 3 years ago

Agreed, just calc(data, x) should be enough.

proyan commented 3 years ago

I could take care of fixing this if you first review and accept that PR.

I"ll provide with a review this evening.

cmastalli commented 3 years ago

I agree with @cmastalli that we don't need an extra update function, but could simply use calc(data,x), making it virtual in ActionModelAbstract. I am also in favor of removing the enable_integration flag. I have started fixing this problem in my local branch, but it's on top of all the changes that are in my last PR, so I could take care of fixing this if you first review and accept that PR.

Hey @andreadelprete, we (all) agree with the proposition written here. I just wonder if you're currently working on it. Please let me know if you need further support on this topic.

andreadelprete commented 3 years ago

@cmastalli Unfortunately not, I'm not working on this at the moment, so if you wanna do it, please go ahead.

cmastalli commented 3 years ago

@andreadelprete -- I am working in some improvements in the RK4 integrator. Please do not work on this topic yet

cmastalli commented 3 years ago

@andreadelprete -- I am starting to work in the discussed solution.

cmastalli commented 3 years ago

Could we keep the default behaviour in the nominal action model, to avoid to explicitly implement the terminal behaviour in new action classes (in particular when prototyping)?

@nmansard -- Now that I am working on this issue, I realize that this is a dangerous policy. Below, I describe this statement for action and differential action models.

For an action model, we expect that the next state is equals to the current state in the terminal node. However, if we apply u=zeros then the next state will be described by the autonomous evolution of the system dynamics. This is typically different to the current state as f(x) is typically different to the identity function.

Instead, for differential action models, we expect that the system acceleration is zero in the terminal node. However, when we apply u=zeros, we obtain that the system acceleration depends on the Coriolis effect and gravity field (i.e., it is not necessary zero). This doesn't produce any incorrect behavior for free dynamics as the acceleration is not a decision variable and we won't integrate the system acceleration; however, it does for contact dynamics due to it affects the contact forces and respectively any cost (or constraint) based on it.

Please also note that we are currently doing wrong computation in the terminal node.

In short, I propose to make calc(data, x) and calcDiff(data, x) pure virtual functions and deprecate the default behaviour.

@nmansard, @andreadelprete and @proyan -- could you let me know if you're in favor or not to my proposition? If not, then let me know what it is wrong in my analysis/thoughts.

andreadelprete commented 3 years ago

For an action model, we expect that the next state is equals to the current state in the terminal node.

Actually I would expect that the next state is not even computed in the terminal node.

However, if we apply u=zeros then the next state will be described by the autonomous evolution of the system dynamics. This is typically different to the current state as f(x) is typically different to the identity function.

In the default behavior that @nmansard suggested to maintain, the differential dynamics was evaluated with u=0, but then it was not integrated when enable_integration=false. What I think we need to do is to move this part of the code of the integratedActionModels from calc(data, x, u) to calc(data, x).

Instead, for differential action models, we expect that the system acceleration is zero in the terminal node.

Again, I would expect the acceleration not to be computed at all in the terminal node.

however, it does for contact dynamics due to it affects the contact forces and respectively any cost (or constraint) based on it.

The cost in the terminal node cannot depend on u, and neither can it depend on the contact forces.

Please also note that we are currently doing wrong computation in the terminal node.

Not clear to me which computations you are referring to.

In short, I propose to make calc(data, x) and calcDiff(data, x) pure virtual functions and deprecate the default behaviour.

This is a design choice. Providing a default implementation is a benefit for an expert user who knows what they're doing, because it can spare them from the effort of implementing calc(data, x) while prototyping an action model that is not meant to be used for terminal nodes. On the other hand, the default implementation could send the wrong message to a novice user, making them think that the default implementation will do the correct thing when the action model is used for terminal nodes, which is not the case. So in the end it boils down to choosing whether you wanna do the best thing for the expert or for the novice.

cmastalli commented 3 years ago

Actually I would expect that the next state is not even computed in the terminal node. Again, I would expect the acceleration not to be computed at all in the terminal node.

I agree with both statements and indeed I am not proposing to do so. I was just using this fact to elaborate my point.

In the default behavior that @nmansard suggested to maintain, the differential dynamics was evaluated with u=0, but then it was not integrated when enable_integration=false. What I think we need to do is to move this part of the code of the integratedActionModels from calc(data, x, u) to calc(data, x).

Yes, I considered this proposition in my analysis. However, this policy is dangerous if we consider contact-force costs (e.g., force regularization and friction-cone penalisation).

The cost in the terminal node cannot depend on u, and neither can it depend on the contact forces.

Yes, I fully agree with the first statement, but not convinced with the second one. Why do you say "neither can it depend on the contact forces".

The contact forces, for a rigid contact model, depends on both state and control. To clarify this fact, you might review Eq. 2 and 4 of Crocoddyl paper: https://cmastalli.github.io/publications/crocoddyl20icra.pdf. Furthermore, you can also see that the function lambda=g(x,u=0) produces a system acceleration different to zero when we apply u=0. Therefore, the contact force doesn't resemble the terminal node condition.

I have also checked this statement with our code. In the draft PR, you can also see that the test_action unit-test is failing if we use the default implementation and include contact-force cost: https://github.com/loco-3d/crocoddyl/pull/995/checks?check_run_id=3551140728#step:4:721 (note that you will need to run yourself for more details). Indeed, this was the main purpose of creating a draft PR, i.e., to show evidence of my statement.

Another solution is to disable any contact-force cost (and in future constraint) that depends on the contact forces (despite that I suspect that is useful to have force costs, e.g., force regulatisation terms in the terminal node). However, this still supports my statement that current default behavious (i.e., u=0) in the terminal node might leads to wrong computations.

Not clear to me which computations you are referring to.

For more clarity, I refer to computations of cost terms that depend on the contact forces:

This is a design choice. Providing a default implementation is a benefit for an expert user who knows what they're doing, because it can spare them from the effort of implementing calc(data, x) while prototyping an action model that is not meant to be used for terminal nodes. On the other hand, the default implementation could send the wrong message to a novice user, making them think that the default implementation will do the correct thing when the action model is used for terminal nodes, which is not the case. So in the end it boils down to choosing whether you wanna do the best thing for the expert or for the novice.

I agree, this is a design choice. However, providing a default implementation seems to be more convenient for expert developers rather than novice one. An expert developer will understand easily than the default implementation might lead to wrong computation, I doubt novice developers might be aware of it as they wiil use the default implementation.

I also believe that the suggested policy by @nmansard is error prone even for expert developers (e.g., we were doing this wrongly and until now we realize it). However, I am fine to make this compromise if everyone believe so. However, my personal advice/inclination is to do not do so.

andreadelprete commented 3 years ago

The cost in the terminal node cannot depend on u, and neither can it depend on the contact forces.

Yes, I fully agree with the first statement, but not convinced with the second one. Why do you say "neither can it depend on the contact forces".

I said that because the contact forces depend on u, and u is not defined for the terminal node, so the contact forces cannot be defined either. The same goes for the accelerations, which also depend on u. So, to sum up, I think the terminal cost should only depend on the robot configuration and velocity, whereas it cannot depend on joint torques, contact forces and accelerations.

The next question that pops into my mind is: can we design the code so that it's not possible for the user to have a terminal cost that depends on quantities it shouldn't depend on? I guess the answer is "yes, but it would require some work". Maybe we should introduce a new base class for representing terminal costs, which has a different API from that of running costs. Then also the running action models should have a different API from terminal action models. This should guarantee that users cannot mix things, e.g. use a running cost in a terminal action model.

Not clear to me which computations you are referring to.

For more clarity, I refer to computations of cost terms that depend on the contact forces:

Ok, got it now, thanks!

I also believe that the suggested policy by @nmansard is error prone even for expert developers (e.g., we were doing this wrongly and until now we realize it). However, I am fine to make this compromise if everyone believe so. However, my personal advice/inclination is to do not do so.

I agree that this is tricky even for expert developers, and in general I think it makes sense to favor the novice user over the expert.

cmastalli commented 3 years ago

I said that because the contact forces depend on u, and u is not defined for the terminal node, so the contact forces cannot be defined either

The contact forces depends on the configuration q and generalized velocities v as well (including the cases when we expect that the generalized acceleration is zero). And as I said before, passing u=0 to the contact dynamics will lead with system accelerations different to zero, thus we really need to implement properly the computation of the contact forces and their derivatives in calc(data, x) and calcDiff(data, x), respectively.

The next question that pops into my mind is: can we design the code so that it's not possible for the user to have a terminal cost that depends on quantities it shouldn't depend on?

I don't understand which quantities are you referring here. If you refer to contact forces, you might see it as a function of the state in the terminal node.

In general, I don't understand the motivation behind this question. Could you elaborate it more?

cmastalli commented 3 years ago

@andreadelprete -- I see two options.

  1. Assuming that contact-forces cost won't apply to terminal nodes
  2. Handling contact-forces in terminal nodes

The first option will be very cheat to compute as we don't need to compute the contact forces when the acceleration is zero. Same will be for the partial derivatives of the contact forces. Therefore, we need to disable any computation related with the contact forces. This means to define empty r=Rx=0 in the terminal node for each contact-force residual.

The second option it might produce numerical issues in the algorithm convergence as the system acceleration might be abruptly change, which as consequence produces big contact forces. However, I am not sure about this statement. In short, it might be not convenient to have such kind of formulations.

@nmansard, @jcarpent -- what are your thoughts regarding this topic?

andreadelprete commented 3 years ago

The contact forces depends on the configuration q and generalized velocities v as well (including the cases when we expect that the generalized acceleration is zero). And as I said before, passing u=0 to the contact dynamics will lead with system accelerations different to zero, thus we really need to implement properly the computation of the contact forces and their derivatives in calc(data, x) and calcDiff(data, x), respectively.

Sure, forces depend ALSO on q and v, but that doesn't make the dependence on u less important. Since forces depend on q, v and u, you need all three of these quantities to compute forces, but u is not defined in the terminal node, so how can you compute forces in the terminal node??? Reading between the lines of your comments, I think that you're suggesting to use a value of u such that the accelerations are zero? If that's the case, I don't think that's a good idea. In general we cannot even be sure that such a value exists.

The next question that pops into my mind is: can we design the code so that it's not possible for the user to have a terminal cost that depends on quantities it shouldn't depend on?

I don't understand which quantities are you referring here. If you refer to contact forces, you might see it as a function of the state in the terminal node.

In general, I don't understand the motivation behind this question. Could you elaborate it more?

Yes, I was referring to contact forces and control inputs. If you wanna compute contact forces as a function of state only you need some assumption because actually contact forces also depend on u. You might for instance decide that you compute the forces using u=0, or using u such that ddq=0. I don't like any of these options. I think we should stick to the standard OCP formulation, which requires the terminal cost to depend on x only. If users want to do crazy stuff computing forces based on some arbitrary assumptions they can still do it in their cost functions.

andreadelprete commented 3 years ago

@andreadelprete -- I see two options.

1. Assuming that contact-forces cost won't apply to terminal nodes

2. Handling contact-forces in terminal nodes

I would go for 1.

cmastalli commented 3 years ago

You might for instance decide that you compute the forces using u=0, or using u such that ddq=0. I don't like any of these options

Yes, we currently have the former option. But, to be honest, I have the feeling that the latter would be worse and it seems you agree with this by writing "In general we cannot even be sure that such a value exists".

I would go for 1.

I also agree with you!

If there is no objection in this topic, then I would implement it in the #995.

Thanks @andreadelprete for the valuable feedback ;)

cmastalli commented 3 years ago

Solved in #995