utiasDSL / gym-pybullet-drones

PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
https://utiasDSL.github.io/gym-pybullet-drones/
MIT License

Difference between using thrust and speed in RL #97

Closed HimGautam closed 2 years ago

HimGautam commented 2 years ago

Hi @JacopoPan, my question might be dumb, but is there a difference between using motor thrust and motor speed as my actions in RL? I am currently using RPM in my project, but some people on Reddit suggested I should output motor thrust and use a low-level PD controller to get the speed. What do you think is correct, and how can I use the motor thrust method in your simulator?

JacopoPan commented 2 years ago

Hi @HimGautam

I have already mentioned it in this repo, but this recent ICRA paper by Elia Kaufmann tries to answer exactly that question: should we use RPMs, thrust and turn rates, or velocity vectors when learning to control a quadrotor? And yes, it suggests that thrust and turn rates are a good compromise between the two other extremes.

The default in this repo is the lowest level of control (RPMs, roughly equivalent to SRT in the paper). From what I understood, you are using the velocity input, which is the opposite extreme: it is equivalent to the higher-level LV control mentioned in that paper and leverages the internal PID controller.

gym-pybullet-drones aviaries do not have an already implemented standalone PD controller for thrust and turn rates (CTBR in the paper), but you should be able to implement one by reusing the method responsible for the lower-level control loop in DSLPIDControl: https://github.com/utiasDSL/gym-pybullet-drones/blob/36da0bfe41c34b4f2efb0f1acfdf04886d970d6b/gym_pybullet_drones/control/DSLPIDControl.py#L202
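To make the idea of a thrust and turn rate (CTBR) layer concrete, here is a minimal sketch of what such a controller could look like. It is not the code in DSLPIDControl. The class `BodyRatePD`, the function `mix_to_rpm`, the gains, the motor layout, and all constants are assumptions chosen for illustration (the constants are only Crazyflie-like), and the sketch assumes a quadratic rotor model where thrust equals `KF * rpm**2`.

```python
import math

# Illustrative, Crazyflie-like constants (assumptions; read the real ones from your drone model).
KF = 3.16e-10      # thrust coefficient: thrust [N] = KF * rpm**2
KM = 7.94e-12      # torque coefficient: drag torque [N m] = KM * rpm**2
ARM = 0.0397       # arm length [m]
MAX_RPM = 21700.0  # motor saturation limit

# Hypothetical X layout: per-motor sign of the x position, y position, and yaw torque.
X_SIGN = (1, -1, -1, 1)
Y_SIGN = (-1, -1, 1, 1)
SPIN = (1, -1, 1, -1)


class BodyRatePD:
    """PD loop on body rates; the gains are untuned placeholders."""

    def __init__(self, kp=(2e-4, 2e-4, 1e-4), kd=(1e-6, 1e-6, 0.0)):
        self.kp, self.kd = kp, kd
        self.prev_err = (0.0, 0.0, 0.0)

    def torques(self, rate_des, rate_meas, dt):
        """Return desired body torques (x, y, z) from the rate error and its derivative."""
        err = tuple(d - m for d, m in zip(rate_des, rate_meas))
        derr = tuple((e - p) / dt for e, p in zip(err, self.prev_err))
        self.prev_err = err
        return tuple(kp * e + kd * de
                     for kp, e, kd, de in zip(self.kp, err, self.kd, derr))


def mix_to_rpm(thrust, torques):
    """Split collective thrust [N] and body torques [N m] into four motor RPMs."""
    d = ARM / math.sqrt(2.0)  # motor offset along each body axis in an X layout
    c = KM / KF               # yaw torque produced per newton of rotor thrust
    tx, ty, tz = torques
    rpms = []
    for xs, ys, s in zip(X_SIGN, Y_SIGN, SPIN):
        f = thrust / 4 + ys * tx / (4 * d) - xs * ty / (4 * d) + s * tz / (4 * c)
        f = max(f, 0.0)  # a rotor cannot produce negative thrust
        rpms.append(min(math.sqrt(f / KF), MAX_RPM))
    return rpms
```

With something like this in place, the RL action would be a collective thrust plus three desired body rates, and the resulting RPMs would be what gets passed to the environment. The motor ordering and spin directions must be matched to the actual drone model before use.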

HimGautam commented 2 years ago

Hi @JacopoPan, the paper uses thrust instead of RPM in SRT (single rotor thrust). I am using RPM as my actions (SRT in the paper), with a reward based on a simple position and orientation error, and I got the result shown in the video below. The agent can't track the position accurately and also shows some jittering. Can this problem be solved by using a low-level PD controller?

https://user-images.githubusercontent.com/70597091/174144019-af319a45-cd52-43f3-aab3-39ee6d2d6e14.mp4

JacopoPan commented 2 years ago

The relations between PWM, RPM, and single motor thrust are almost linear (see https://www.bitcraze.io/documentation/repository/crazyflie-firmware/master/functional-areas/pwm-to-thrust/) and certainly linear in the range of interest as well as in the model implemented in this repo.
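If you still want the agent to act in thrust space, one option is to convert the action to RPMs just before stepping the environment. The sketch below assumes the common quadratic rotor model, thrust equal to `KF * rpm**2`. The values of `KF` and `MAX_RPM` and the helper names are illustrative assumptions, so check the coefficient and the exact relation against the drone model you are using.

```python
import math

KF = 3.16e-10      # assumed thrust coefficient [N / rpm^2]
MAX_RPM = 21700.0  # assumed motor speed limit


def rpm_to_thrust(rpm):
    """Single-rotor thrust [N] under the assumed quadratic model."""
    return KF * rpm ** 2


def thrust_to_rpm(thrust):
    """Inverse of rpm_to_thrust; negative thrust requests are clamped to zero."""
    return math.sqrt(max(thrust, 0.0) / KF)


MAX_THRUST = rpm_to_thrust(MAX_RPM)


def action_to_rpm(action):
    """Map per-motor actions in [-1, 1] to RPMs, scaling linearly in thrust."""
    rpms = []
    for a in action:
        a = min(max(a, -1.0), 1.0)                # clip to the action range
        thrust = 0.5 * (a + 1.0) * MAX_THRUST     # -1 -> 0 N, +1 -> MAX_THRUST
        rpms.append(thrust_to_rpm(thrust))
    return rpms
```

Under this assumed model an action of 0 corresponds to half of the maximum thrust, not half of the maximum RPM, which is the practical difference between the two parameterizations.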

I saw your video and commented in #96. The jittering might be a problem with the RL agent, but the over/undershooting (w.r.t. the target cube) of the position where the drone stabilizes makes me think the problem might be with your observation/reward specifications.

HimGautam commented 2 years ago

With the drone.getStateVector(0) command we get X Y Z Q1 Q2 Q3 Q4 R P Y VX VY VZ WX WY WZ P0 P1 P2 P3. Which of these measurements are in the body frame, and which are in the inertial frame? Also, how can I access the rotation matrix of the drone?

JacopoPan commented 2 years ago

@HimGautam, you should have a look at the PyBullet quick start guide (https://docs.google.com/document/d/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA/edit#heading=h.2ye70wns7io3). It will help you understand what information is queried from Bullet and returned by the environment.

HimGautam commented 2 years ago

Thanks for the quick reply, @JacopoPan. I got my answer: all measurements are in the world frame. But can you please tell me the command for computing the rotation matrix?

HimGautam commented 2 years ago

I got the answer in the pybullet section. Thank you.
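For anyone landing here with the same question: PyBullet provides `getMatrixFromQuaternion`, which takes a quaternion in `[x, y, z, w]` order and returns the nine matrix entries as a flat, row-major list. The sketch below does the same computation in plain Python and uses it to express a world-frame velocity in the body frame. It assumes that Q1..Q4 in the state vector follow PyBullet's `[x, y, z, w]` order and that the state is laid out as listed earlier in this thread. The helper names are illustrative only.

```python
import math


def quat_to_rotmat(q):
    """Body-to-world rotation matrix from a unit quaternion in [x, y, z, w] order."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def world_to_body(R, v):
    """Express a world-frame vector in the body frame: v_body = R^T v_world."""
    return [sum(R[i][j] * v[i] for i in range(3)) for j in range(3)]


# Assumed layout: X Y Z | Q1 Q2 Q3 Q4 | R P Y | VX VY VZ | WX WY WZ | P0 P1 P2 P3
state = [0.0] * 20
state[3:7] = [0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4)]  # 90 degree yaw
state[10:13] = [1.0, 0.0, 0.0]  # moving along world +x

R = quat_to_rotmat(state[3:7])
v_body = world_to_body(R, state[10:13])
```

In this example the drone faces world +y while moving along world +x, so the body-frame velocity points along the negative body y axis.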