leggedrobotics / ocs2

Optimal Control for Switched Systems
https://leggedrobotics.github.io/ocs2
BSD 3-Clause "New" or "Revised" License
838 stars 220 forks source link

How to "update" the robot observation for the MPC solver by using the robot's measured state info? #27

Closed edward9503 closed 2 years ago

edward9503 commented 2 years ago

Dear legged robotics group,

thanks for providing the OCS2.

Recently, we are going to use OCS2 to generate an online walking trajectory for Anymal C and simulate it in the RaiSim. We have developed our WBC to track the reference trajectory. At the very beginning, we employed the MPC solver of OCS2 only as a pure trajectory generator (meaning there was no robot's state feedback obtained from RaiSim into the MPC). But then, we realized that there was a drift in the z-direction, as mentioned in #24. So we tried to find the solution.

During figuring out a solution, we found that #23 mentioned that we could add some feedback terms to the MPC. Moreover, paper "A Unified MPC Framework for Whole-Body Dynamic Locomotion and Manipulation" also mentioned that we could obtain the state information from the robot and insert it into the SLQ-MPC solver, as shown in figure 3. Therefore, we tried to obtain the state from RaiSim and insert it into the MPC solver.

We've found that there is a user-defined modification function interface in file "ocs2_ros_interface/src/mrt/MRT_ROS_DummyLoop.cpp". In my opinion, the "observation" term should be the real robot state that is the input for the MPC. We replaced the "observation" with the state and input vector that gathered from the anymal in RaiSim. The RaiSim and WBC were ruining at 1kHz, mrtDesiredFrequency was 1khz, and mpcDesiredFrequency was -1 Hz (which was 250Hz in my pc). We tried replacing the "observation" with measured data in 1kHz. It turned out that even though I set the target position, the robot was standing still on the ground without any motion at first, then it would drift, as shown below:
final_61b9f9f92733e70169c019ff_12003

I thought maybe we were updating the "observation_" too frequently, so I set the updating frequency to 1Hz. But in this way, the robot in rviz (on the left hand side) will “flash” from one place to the other, which will again make the robot in RaiSim unstable, as shown here: final_61b9fb2e64bdc400c8ab2751_866206

Of course, we have tried different mrtDesiredFrequency, mpcDesiredFrequency and "observation" updating frequency, but the robot was still unstable. We started to consider that maybe the way we used the measured data for the MPC solver was not correct, because by doing our way, the desired state and input vector seems not continuous, which will, in turn, cause the unstable tracking performance. And also, if we update the "observation" too frequently, then the generated reference trajectory is too close to the current robot state after each time's MPC update, which won't lead the robot to move. So I was wondering:

  1. How to use the robot's measured data for the MPC solver properly?
  2. How should we solve the drift problem in z-direction if we want to use the MPC as a pure trajectory planner (without state feedback)?

Many thanks

Best Shengzhi

mszuyx commented 2 years ago

@edward9503

Hi,

We are also trying to interface with OCS2 with simulation other than the dummy test: #21

So far we are able to verify the interface pipeline among the simulator - ROS - OCS2 works using a cart-pole example. Yet we still have trouble balancing our own robot since the dynamic of it is so complicated and naturally requires more work.

I hope our past conversation with the OCS2 people can help you in some way. And feel free to reach out to us if you like to discuss about it (yuc6@illinois.edu) while waiting for them to response.

In my humble opinion, you can also start with a robot with simple dynamic to verify the interface pipeline first, then work your way to much complicated systems.

edward9503 commented 2 years ago

@mszuyx

Hi,

Thanks for your reply. I've sent you an email for more detail. Thanks a lot!

rubengrandia commented 2 years ago

Feeding the measured state back to the MPC is correct as you describe it. The devil is probably in the details in this case.

The drift in the base that you show is in orientation. It might be worth it to double check the orientation convention on both sides and the conversion between them.

Also, you mention that you use a WBC to track the reference trajectory. In this WBC are you adding additional feedback on the base pose and twist errors? It might be worth it to set those to zero and simply track the desired base acceleration coming from the MPC. In some cases the additional feedback terms there can destabilize the closed loop system because the MPC itself is already providing feedback on the measured state.

This relates to your other comment: "And also, if we update the "observation_" too frequently, then the generated reference trajectory is too close to the current robot state after each time's MPC update"

The main component of the system inputs should come from the MPC (as a feedforward in your WBC). The error between the measured state and the MPC predicted state should not be the main driver of your motion. Think about it this way: If the model was perfect these errors should stay zero all the time.

This drift happens over a rather long execution time. If you are doing a 1-2 second motion, this should not be an issue. Over longer horizons you could consider correcting for the drift on the tracking side by estimating where the terrain is and shifting the reference motion accordingly.

edward9503 commented 2 years ago

@rubengrandia

Hi,

Thanks for your reply. We have some updates here!

Updates: Now our quadruped can walk! Indeed, there was a tiny calculation error from quaternion to Euler ZYX convention. After fixing it, it worked.

Relates to your comment: "Also, you mention that you use a WBC to track the reference trajectory. In this WBC are you adding additional feedback on the base pose and twist errors? It might be worth it to set those to zero and simply track the desired base acceleration coming from the MPC. In some cases the additional feedback terms there can destabilize the closed loop system because the MPC itself is already providing feedback on the measured state."

-Simply only tracking the desired acceleration is a good suggestion, because we also realized that the small segment at the beginning of the generated reference trajectory from MPC is approximately equal to the current observation (for example, the t = 0-0.01s part of an MPC generated reference trajectory with time horizon T = 1s is quite close to the current observation state), which is equivalent to your comment "because the MPC itself is already providing feedback on the measured state". This will lead to the divergence of the robot since the model inaccuracy always exists. So we will also give this method a try. Alternatively, we figured out an approach to handle the aforementioned instability caused by the additional feedback on the base pose and twist errors. Our current walkable WBC looks like this: we are adding the full state and input vector feedback (i.e., on 6D momenta, robot base pose, legs joint angles for state vector, and on contact forces and legs joint velocity for input vector) to MPC, but only tracking the base pose and leg joint angles for initial testing. To enable the robot to walk and alleviate this instability caused by feeding the desired base pose and twist from MPC to WBC forward, we get smoother MPC reference trajectories (including the desired base pose and twist) for tracking by adding a low pass filter to the real robot observation state feedback. This helped to stabilize the whole closed-loop system. As a result, the quadruped can walk nicely. Do you have any opinion on this approach?

Thanks again

Best

rubengrandia commented 2 years ago

Happy to hear it works.

By adding a low pass filter in between you are adding some artificial phase shift / delay, which might hinder your performance for more dynamic motions. Either way, it is always good to have something working and start increasing the complexity from there. Good luck!