utiasDSL / gym-pybullet-drones

PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
https://utiasDSL.github.io/gym-pybullet-drones/
MIT License

How to modify singleagent.py with a pid controller to move in xyz space? #11

Closed. Cakhavan closed this issue 3 years ago

Cakhavan commented 3 years ago

I tried running singleagent.py with the 'pid' argument for the action space and 'hover' for the environment, but I am getting this AttributeError: 'HoverAviary' object has no attribute 'DRONE_MODEL'. I've been trying to modify the code, but I'm fairly stuck on adapting it for a PID action instead of a single input. Any suggestions? Thanks!

JacopoPan commented 3 years ago

Hi @Cakhavan, thanks for pointing this out/reminding me. The error you encountered is relatively minor (I used the attribute before setting it in the superclass constructor, instead of using the parameter), but there is a more structural problem with using PID control inside the RL classes which I haven't fixed yet: I originally wrote the controllers as classes that take their environment as a constructor parameter; embedding one into the other now creates all sorts of circular dependencies (and that's bad).

I'm aware of it but didn't have time to re-structure the code just yet, hopefully by the end of the upcoming week 👌
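To make the problem concrete, here is a stripped-down sketch (illustrative names and placeholder values, not the actual classes): the controllers read their constants from an environment instance, so an RL environment that wants to own a controller ends up spawning a second, throwaway environment just to construct it.

```python
# Illustrative sketch of the circular/wasteful dependency
# (placeholder names and values, not the real gym-pybullet-drones classes).

class ThrowawayEnv:
    """Stands in for CtrlAviary: only exists so the controller can read constants."""
    KF = 3.16e-10   # placeholder motor thrust coefficient
    KM = 7.94e-12   # placeholder motor torque coefficient

class Controller:
    """Stands in for DSLPIDControl: configured by reading attributes of an environment."""
    def __init__(self, env):
        self.kf = env.KF
        self.km = env.KM

class RLEnv:
    """Stands in for HoverAviary: wants to own a controller, so it has to
    build an extra environment just to pass to the controller's constructor."""
    def __init__(self):
        self.ctrl = Controller(env=ThrowawayEnv())  # a second environment, only for configuration
```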

Cakhavan commented 3 years ago

Awesome, yeah, that seems to be the case. I modified the code slightly to get some quick results:

In BaseSingleAgentAviary.py, I commented out lines 86-91 and replaced them with:

self.ctrl = [DSLPIDControl(CtrlAviary(drone_model=DroneModel.CF2X, num_drones=1, initial_xyzs=None, initial_rpys=None, physics=Physics.PYB, neighbourhood_radius=10, freq=240, aggregate_phy_steps=int(240/48), gui=False, record=False, obstacles=False, user_debug_gui=False)) for i in range(1)]

Not sure if this helps anything, but at least the code doesn't break, and it seems to be doing some learning as well.

Hope this helps in some way!

JacopoPan commented 3 years ago

Yes, just giving any instance of an aviary (using the same drone model as the desired one) would work. But I think we can agree it's ugly and wasteful 😅. It's a usage I hadn't initially planned for, and it now requires some re-thinking of the control superclass, mostly for the sake of readability.

yuchen-x commented 3 years ago

Hi,

I noticed that you have also integrated controllers in BaseMultiagentAviary.py at lines 80-83.

If I understand correctly, for example, line 81 will load two (num_drones=2) drones connected to the PyBullet engine,

80: if drone_model in [DroneModel.CF2X, DroneModel.CF2P]:
81:     self.ctrl = [DSLPIDControl(CtrlAviary(drone_model=DroneModel.CF2X)) for i in range(num_drones)]

Then, after the super().__init__() call at line 84, another two drones will be loaded. Thus, three PyBullet environments are created and four drones are loaded in the end, correct?

I noticed that the BaseControl class requires only 4 values from the env:

DRONE_MODEL, GRAVITY, KF, KM

Then, I guess I can modify the DSLPIDControl and BaseControl classes to receive the above four arguments rather than an env, and then create the controllers inside the BaseAviary class according to the number of drones, right? I am not sure if this will lead to any potential issues, so I'd like to double-check with you.
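Concretely, something along these lines is what I'm thinking of (hypothetical signatures, just to illustrate the idea, not the current code):

```python
# Hypothetical sketch of the proposed change (not the current implementation).

class BaseControl:
    def __init__(self, drone_model, g, kf, km):
        # Receive the four values directly instead of reading them from an env.
        self.DRONE_MODEL = drone_model
        self.GRAVITY = g
        self.KF = kf
        self.KM = km

class DSLPIDControl(BaseControl):
    def __init__(self, drone_model, g, kf, km):
        super().__init__(drone_model, g, kf, km)
        # ...PID gains, mixer matrix, etc. as in the current implementation...

# Then BaseAviary could create one controller per drone without extra environments:
# self.ctrl = [DSLPIDControl(self.DRONE_MODEL, self.GRAVITY, self.KF, self.KM)
#              for _ in range(num_drones)]
```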

Thanks!

JacopoPan commented 3 years ago

@yuchen-x yes, the whole point is that creating those extra environments is computationally wasteful. The solution is, as you say, to re-implement the __init__() methods to take the appropriate parameters (4 in the base class, a few more in the non-abstract subclasses) instead of reading them from an environment. This week has been a bit busy so far, but that's the same fix I intend to push by the end of it.

JacopoPan commented 3 years ago

I have modified how the control classes are initialized, so that only a drone model (and not an environment) is required: https://github.com/utiasDSL/gym-pybullet-drones/blob/bf173d0e87f26ed197fdf7e277730fd189d58f26/gym_pybullet_drones/control/BaseControl.py#L21
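With this change, controllers can be built from a drone model alone, with no throwaway CtrlAviary; roughly like this (a sketch assuming a recent version of the package, see the linked source for the actual signature and import paths):

```python
# Sketch: creating controllers without an auxiliary environment
# (import paths/signature per recent versions; see the linked BaseControl.py).
from gym_pybullet_drones.control.DSLPIDControl import DSLPIDControl
from gym_pybullet_drones.utils.enums import DroneModel

num_drones = 2
ctrl = [DSLPIDControl(drone_model=DroneModel.CF2X) for _ in range(num_drones)]
```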

shangjiayong commented 1 year ago

Excuse me, @JacopoPan, @Cakhavan. I'm a little confused. For now, does "execute singleagent.py with actiontype: pid, env: hoveraviary" mean "use pid to control a single drone for a hover task"? Thanks for your time.

JacopoPan commented 1 year ago

@shangjiayong yes, a PID controller tracking position references (relative to the drone's position in the world frame).

shangjiayong commented 1 year ago

@JacopoPan Thank you for your answer. However, when I use PID as the action type, the console output contains the calculated reward. It looks like the reinforcement learning method is still running, which confuses me a bit.

JacopoPan commented 1 year ago

@shangjiayong The PID action type means that the inputs to the aviary/environment/PyBullet simulation are position references, but if you use singleagent.py you are still using RL. If you want to explicitly command the drone, look at the fly.py and velocity.py scripts.
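For reference, a stripped-down version of what those scripts do, as a minimal sketch assuming a recent (Gymnasium-based) release; observation/action shapes and attribute names have changed a bit across versions, so treat fly.py as the authoritative example:

```python
# Minimal sketch: explicitly commanding one drone with the PID controller
# (assumes a recent Gymnasium-based release; see fly.py for the real example).
import numpy as np
from gym_pybullet_drones.envs.CtrlAviary import CtrlAviary
from gym_pybullet_drones.control.DSLPIDControl import DSLPIDControl
from gym_pybullet_drones.utils.enums import DroneModel

env = CtrlAviary(drone_model=DroneModel.CF2X, num_drones=1, gui=True)
ctrl = DSLPIDControl(drone_model=DroneModel.CF2X)

target = np.array([0.0, 0.0, 1.0])        # hover 1 m above the origin
action = np.zeros((1, 4))                 # motor RPMs for the single drone
obs, info = env.reset()
for _ in range(int(5 * env.CTRL_FREQ)):   # roughly 5 seconds of simulation
    obs, reward, terminated, truncated, info = env.step(action)
    # Turn the position reference into motor RPMs for the next step.
    action[0, :], _, _ = ctrl.computeControlFromState(control_timestep=env.CTRL_TIMESTEP,
                                                      state=obs[0],
                                                      target_pos=target)
env.close()
```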