utiasDSL / safe-control-gym

PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and RL
https://www.dynsyslab.org/safe-robot-learning/
MIT License

RL Fly To A Point #158

Open zcase opened 1 month ago

zcase commented 1 month ago

I am new to this repo, but have used the gym_pybullet_drones repo in the past. I am switching to this one because, from the discussions, it seems better set up for eventually going from sim to reality. My question: what is the best way to start training an RL agent to fly to a point using the existing PPO implementation in this repo?

It seems like a lot of people use different branches of this repo, so I am not sure whether the main branch is the best one or another branch is, which scripts are needed, or whether I need to create my own.

adamhall commented 3 weeks ago

Hey @zcase,

There is an example of this in the repo. In rl_experiments.sh, if you comment out TASK='track' and uncomment TASK='stab', the quadrotor will stabilize to the point (0,0,1,0,0,0) (hovering at x=0, y=0, z=1.0). You could also try stabilizing to other points by changing the stabilization goal and retraining the agent.
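
For reference, the toggle in rl_experiments.sh would look roughly like this (the surrounding script content may differ on your checkout):

```bash
# Switch the task from trajectory tracking to stabilization.
# TASK='track'
TASK='stab'
```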

I thought that the agent could stabilize to multiple points, so I tried changing task_config.task_info.stabilization_goal = [1.0, 1.0, 1.0] in quadrotor_3D_stab.yaml using the already trained agent and it still stabilizes to [0, 0, 1]. I'm not sure if this is due to how it was trained or a different issue in the repo. Any thoughts @Federico-PizarroBejarano?
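
For illustration, assuming the YAML nesting mirrors the dotted path above, the override in quadrotor_3D_stab.yaml would look something like:

```yaml
task_config:
  task_info:
    # Goal position (x, y, z); retraining may be needed for the agent
    # to actually stabilize to a goal other than the one it was trained on.
    stabilization_goal: [1.0, 1.0, 1.0]
```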

zcase commented 3 weeks ago

@adamhall On the main branch, when I do what you mentioned, I get a KeyError for stab and it doesn't work. I'm currently trying to chase down why.

It looks like it's not in the registry, or is missing from it. I tried the track task and that doesn't work either.

zcase commented 3 weeks ago

@adamhall It looks like task has to be set to quadrotor for it to run without throwing an error. However, it still ends abruptly.

adamhall commented 3 weeks ago

That's very strange because it works perfectly for me off the main branch. Can you send me:

  1. The system you are on.
  2. The Python version you are using, and a list of the libraries and versions you have installed (you can use pip list; see the commands below).
  3. The exact command you run (and the directory you are running it from), and the script you are running (if you are running it via a script).
  4. A printout of your error.

Maybe there is something funky going on.
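
For items 1 and 2, the standard commands are:

```bash
uname -a           # operating system and kernel
python --version   # Python interpreter version
pip list           # installed packages and their versions
```
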
zcase commented 3 weeks ago

@adamhall The error was user error: I was putting stab for the task rather than quadrotor. However, just like in my other issue, when trying to view the run with the GUI, the PyBullet env starts up and immediately closes as if the program has finished. It is too fast to even see whether the drone reached its goal.

Since these boil down to the same issue, and you pointed out that to train an RL agent to fly to a point we can use the stab config, I think we can close this one and figure out in the other issue why the GUI isn't staying open long enough to visually watch the drone track a trajectory or stabilize at a point. Thoughts?

Also thank you very much for your help!

zcase commented 3 weeks ago

@adamhall or @Federico-PizarroBejarano: Following up on what Adam said above:

> There is an example of this in the repo. In rl_experiments.sh, if you comment out TASK='track' and uncomment TASK='stab', the quadrotor will stabilize to the point (0,0,1,0,0,0) (hovering at x=0, y=0, z=1.0). You could also try stabilizing to other points by changing the stabilization goal and retraining the agent.
>
> I thought that the agent could stabilize to multiple points, so I tried changing task_config.task_info.stabilization_goal = [1.0, 1.0, 1.0] in quadrotor_3D_stab.yaml using the already trained agent and it still stabilizes to [0, 0, 1]. I'm not sure if this is due to how it was trained or a different issue in the repo. Any thoughts @Federico-PizarroBejarano?

Is there a way to add, via the config, an area of randomization for the stabilization goal? I.e., the initial stabilization_goal in the config could be the first point to reach, and once that has been achieved, additional config parameters could kick in to randomize a new goal position/orientation (subject to constraints) until it is reached, and then it would change again. This might help in training a Crazyflie model that can fly to any point, or any point within an area.

Thoughts?

adamhall commented 4 days ago

@zcase, sorry I was away for a bit! Some responses below:

> @adamhall The error was user error: I was putting stab for the task rather than quadrotor. However, just like in my other issue, when trying to view the run with the GUI, the PyBullet env starts up and immediately closes as if the program has finished. It is too fast to even see whether the drone reached its goal.

If you send me the items I listed above, I can better understand your issue.

> Is there a way to add, via the config, an area of randomization for the stabilization goal? I.e., the initial stabilization_goal in the config could be the first point to reach, and once that has been achieved, additional config parameters could kick in to randomize a new goal position/orientation (subject to constraints) until it is reached, and then it would change again. This might help in training a Crazyflie model that can fly to any point, or any point within an area.
>
> Thoughts?

We don't currently have the gym configured this way. However, if you know the points you wish to visit, you could generate a trajectory (using straight lines, fitting a spline, or something similar) and then use it in a trajectory-tracking setup. I am realizing now, though, that the current main branch doesn't support following custom trajectories; I'll look into adding this. You can play around with _generate_trajectory to try to get something like this working.
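
In the meantime, here is a minimal sketch of the spline idea using scipy; the waypoints, sampling rate, and how the resulting reference would get wired into the tracking task are all assumptions, since custom trajectories aren't supported on main yet:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical waypoints (x, y, z) the quadrotor should visit, in order.
waypoints = np.array([
    [0.0,  0.0, 1.0],
    [1.0,  1.0, 1.0],
    [1.0, -1.0, 1.5],
    [0.0,  0.0, 1.0],
])

# Fit one cubic spline per axis over a normalized time parameter.
t_way = np.linspace(0.0, 1.0, len(waypoints))
splines = [CubicSpline(t_way, waypoints[:, axis]) for axis in range(3)]

# Sample the reference at the control rate, e.g. 50 Hz over 10 s -> 500 points.
t_ref = np.linspace(0.0, 1.0, 500)
reference = np.stack([s(t_ref) for s in splines], axis=1)  # shape (500, 3)
```

A reference like this could then stand in for the output of _generate_trajectory in a tracking task.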