utiasDSL / safe-control-gym

PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and RL
https://www.dynsyslab.org/safe-robot-learning/
MIT License

Safety filter with CBF quadrotor example #141

Open ttavoula opened 9 months ago

ttavoula commented 9 months ago

Hello!

I am trying to run the CBF safety filter example for a 3D quadrotor, like the one presented in the recent ICRA 23 workshop (timestamps 1:06:23 and 1:10:00).

Is it possible to run this example? It seems that currently only the cartpole example is available (as mentioned back in #128).

Thanks!

adamhall commented 9 months ago

Hi @ttavoula!

Thanks for your interest in the repo and sorry for the delay. I was out hiking all last week. So, in that demo, it's actually the RL safety layer (see figure 7 from the associated paper). The CBF still only works for the cartpole system, but we are open to contributions to get it running for other systems if you want to take a stab at it. There are some loose plans to extend the CBF to the quadrotor, but we don't have a timeline for that yet.

I do believe the Model Predictive Safety Filter (MPSF) works with the 2D quadrotor, but not the 3D quadrotor yet. MPSF with the 3D quadrotor might come eventually, as some people in the lab are working on that.

Sorry for any confusion!

Federico-PizarroBejarano commented 9 months ago

I do have an MPSF for the 3D quad running on my personal fork (https://github.com/Federico-PizarroBejarano/safe-control-gym/blob/smooth_mpsc_paper/safe_control_gym/safety_filters/mpsc/nl_mpsc.py) if you are interested!

ttavoula commented 9 months ago

Thank you all for the response.

Glad to see that there is an MPSF example for the 3D quadrotor, thanks @Federico-PizarroBejarano for sharing this!

Federico-PizarroBejarano commented 9 months ago

No worries, let me know if you need any help with it. I plan on merging a lot of that code into this main repo eventually, but I haven't gotten around to it yet.

ttavoula commented 8 months ago

Hi @Federico-PizarroBejarano! I was able to run the 3D quadrotor MPSF experiment (mpsc_experiment.py) for tracking an 8-shaped trajectory with safety corrections using LQR.

How can I run this experiment using a different trajectory? Thanks!

Federico-PizarroBejarano commented 8 months ago

Hi @ttavoula, happy to see you have the code running! You can change the trajectory in the configuration file: https://github.com/Federico-PizarroBejarano/safe-control-gym/blob/f4d8d738ef45147cd8bcb11ea386bda91e3f03e7/experiments/mpsc/config_overrides/quadrotor_3D/quadrotor_3D_track.yaml#L78.

In the gym there are a few default trajectories, like figure8, circle, and square, but you can also set your own custom trajectory as seen here: https://github.com/Federico-PizarroBejarano/safe-control-gym/blob/f4d8d738ef45147cd8bcb11ea386bda91e3f03e7/examples/pid/pid_experiment.py#L50
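For readers following along, a custom reference can be thought of as a plain array of waypoints sampled at the control frequency. The sketch below is a minimal, library-independent illustration using NumPy. The function name `helix_reference`, its parameters, and the (N, 3) position layout are assumptions made for this example and are not part of safe-control-gym. How such an array is handed to the environment is shown in the linked pid_experiment.py, not here.

```python
import numpy as np


def helix_reference(duration=10.0, ctrl_freq=50, radius=1.0, height=1.0, climb=0.5):
    """Return (times, positions) for a hypothetical helical reference.

    positions has shape (N, 3) with columns x, y, z.
    All names and defaults here are illustrative assumptions.
    """
    n_steps = int(duration * ctrl_freq)
    times = np.arange(n_steps) / ctrl_freq
    omega = 2.0 * np.pi / duration  # one full revolution over the run
    x = radius * np.cos(omega * times)
    y = radius * np.sin(omega * times)
    z = height + climb * times / duration  # slow linear climb
    return times, np.column_stack((x, y, z))


times, positions = helix_reference()
print(positions.shape)
```

Sampling the reference at the control frequency keeps one waypoint per control step, which is usually the most convenient layout for a tracking controller.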

Hope that helps, let me know if you run into any problems!

ttavoula commented 8 months ago

Thanks @Federico-PizarroBejarano! Yes, I'm trying to use a custom trajectory. I will let you know!

Federico-PizarroBejarano commented 8 months ago

No worries, let me know!

ttavoula commented 8 months ago

Hello @Federico-PizarroBejarano! I have a couple of questions going back to the 3D quadrotor NL MPSF to fully understand the correction/update process in the script.

Given the idea that a learning controller, if not safe, can be updated using a certified action, where does the update/certification process of the robust NL MPC happen in nl_mpsc.py?

How can I see an example of a current_state with an uncertified_action that gets updated to a new_state with a certified_action in this case?

Thank you!

Federico-PizarroBejarano commented 8 months ago

Hello @ttavoula, nl_mpsc.py extends mpsc.py, which is where the certification step is (https://github.com/Federico-PizarroBejarano/safe-control-gym/blob/f4d8d738ef45147cd8bcb11ea386bda91e3f03e7/safe_control_gym/safety_filters/mpsc/mpsc.py#L197). The actual step-by-step execution of the whole loop is done in the experiment class (take a look at this function: https://github.com/Federico-PizarroBejarano/safe-control-gym/blob/f4d8d738ef45147cd8bcb11ea386bda91e3f03e7/safe_control_gym/experiments/base_experiment.py#L113). There, the uncertified controller proposes an action, the current state and proposed action are sent to the safety filter, and the filter returns a certified action, which is then fed into the system.
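The loop described above can be sketched roughly as follows. This is a toy, self-contained illustration and not the code in base_experiment.py or mpsc.py: the 1D double-integrator dynamics, the constants, and every function name are assumptions made for the example, and a simple "can I still brake in time?" rollout stands in for the MPC-based certification.

```python
import numpy as np

DT = 0.1         # hypothetical time step
A_MAX = 2.0      # hypothetical actuator limit
POS_LIMIT = 1.0  # hypothetical safety constraint: position <= 1


def step(state, action):
    """Toy double integrator with state = [position, velocity]."""
    pos, vel = state
    return np.array([pos + DT * vel, vel + DT * action])


def uncertified_controller(state):
    """Stand-in for a learned policy: always pushes toward the limit."""
    return 2.0


def brake_action(state):
    """Backup action: drive the velocity toward zero within actuator limits."""
    return float(np.clip(-state[1] / DT, -A_MAX, A_MAX))


def backup_is_safe(state, horizon=200):
    """Roll out the braking policy and check the position constraint."""
    for _ in range(horizon):
        if state[0] > POS_LIMIT:
            return False
        if state[1] <= 0.0:  # no longer moving toward the limit
            return True
        state = step(state, brake_action(state))
    return False  # could not verify within the horizon, so be conservative


def certify_action(state, proposed):
    """Keep the proposed action if a safe backup exists afterwards, else brake."""
    proposed = float(np.clip(proposed, -A_MAX, A_MAX))
    if backup_is_safe(step(state, proposed)):
        return proposed
    return brake_action(state)


state = np.array([0.0, 0.0])
max_pos = state[0]
n_corrections = 0
for _ in range(100):
    proposed = uncertified_controller(state)    # 1. controller proposes an action
    certified = certify_action(state, proposed)  # 2. filter certifies or replaces it
    if certified != proposed:
        n_corrections += 1
    state = step(state, certified)               # 3. certified action goes to the system
    max_pos = max(max_pos, state[0])

print(f"max position: {max_pos:.3f}, corrections: {n_corrections}")
```

Printing `proposed` and `certified` side by side at each step is one simple way to see exactly when the filter overrides the controller.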

I would be happy to meet to explain in more detail if you wish, I know parsing through someone else's code is a nightmare. Let me know!

ttavoula commented 8 months ago

Thank you so much @Federico-PizarroBejarano, that would be great. I just emailed you.