wil3 / gymfc

A universal flight control tuning framework
http://wfk.io/neuroflight/
MIT License

Is the gymfc environment suitable for RL training with PX4 firmware? #85

fabrizioschiano closed this issue 4 years ago

fabrizioschiano commented 4 years ago

Hi @wil3, I am reading your papers and I want to congratulate you on the great work. It is really nice to see your drone being trained in simulation and then flying stably in a real-world scenario.

I have a question that may be out of scope, and I would like your opinion on it. I am working with Gazebo and the PX4 firmware (I am sure you know it), and I was wondering whether the gymfc environment is suitable for implementing RL algorithms with PX4 in software-in-the-loop (SITL) simulations. As I understand it, you do not rely on any specific characteristic of the motors; during training, the NN looks at the angular velocity error e(t) at the current time step t and at the error difference delta_e(t) = e(t) - e(t-1). This suggests to me that it would be better to start from gymfc than from scratch when implementing a new PX4-based environment in which to train an RL controller. I don't see any major roadblocks to doing that. Of course, deploying the learned controller on a PX4-capable flight controller is then a separate story.
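To make the observation described above concrete, here is a minimal sketch of how e(t) and delta_e(t) could be assembled per axis. The class name `RateErrorState`, the three-axis layout, and the choice e(-1) = 0 for the first step are assumptions made for illustration; this is not gymfc's actual API.

```python
class RateErrorState:
    """Builds [e(t), delta_e(t)] for roll, pitch, and yaw rates (hypothetical helper)."""

    def __init__(self, axes=3):
        # Assumption: the error before the first step is zero, e(-1) = 0.
        self.prev_error = [0.0] * axes

    def update(self, desired, measured):
        # e(t): desired angular velocity minus measured angular velocity.
        error = [d - m for d, m in zip(desired, measured)]
        # delta_e(t) = e(t) - e(t-1)
        delta = [e - p for e, p in zip(error, self.prev_error)]
        self.prev_error = error
        # Observation: errors first, then error differences.
        return error + delta


state = RateErrorState()
print(state.update([10.0, 0.0, 0.0], [4.0, 0.0, 0.0]))  # [6.0, 0.0, 0.0, 6.0, 0.0, 0.0]
print(state.update([10.0, 0.0, 0.0], [7.0, 0.0, 0.0]))  # [3.0, 0.0, 0.0, -3.0, 0.0, 0.0]
```

Nothing in this observation depends on the motors or the firmware, which is the point made in the question.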

I hope my question is clear, otherwise let me know and I will try to clarify.

Thanks in advance.

wil3 commented 4 years ago

Hi @fabrizioschiano, thanks for the support!

Ah this is excellent, I'm working on a new project using the PX4 firmware and I've been thinking about this too :-).

So in the context of neuro-control, to use PX4 you'd need to replace the mc_rate_control module with a neuro-control module. You'll notice gymfc is not specific to any firmware: you are just synthesizing the neuro-controller, and you can drop it into any firmware in place of the PID controller for rate control. From my Neuroflight preprint and thesis you'll see that the NN synthesized with gymfc is then transferred over to Neuroflight (my Betaflight fork) to replace the PID controller. The exact same thing could be done with PX4.

Now, in the context of PID tuning, this is where a specific firmware would matter, because you'd need a bridge to connect, say, PX4 to gymfc, assuming you want to do a full-firmware SITL tuning approach. It would actually be quite interesting if you could boot a SITL instance of PX4 using just, say, the mc_rate_control module or the minimal number of modules needed for tuning. Basically, gymfc in this instance would let you act as a 'man-in-the-middle', intercepting state and control signals which you could then use to do auto-tuning. This gets exciting because you could use optimization algorithms like genetic algorithms to derive the gains for your PID controller. This is more along the lines of what I've been looking at lately.

Does that answer your question?
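As a rough illustration of the auto-tuning idea above, the sketch below uses a small genetic algorithm to search for PID gains. Everything in it is an assumption made for illustration: the plant is a toy first-order rate model rather than a PX4 SITL instance, and `simulate`, `evolve`, the gain bounds, and the cost (integrated absolute error) are hypothetical names and choices, not part of gymfc or PX4.

```python
import random

# Assumed search ranges for (kp, ki, kd); chosen only for this toy plant.
BOUNDS = [(0.0, 10.0), (0.0, 50.0), (0.0, 0.02)]


def simulate(gains, setpoint=1.0, dt=0.002, steps=250, tau=0.05):
    """Step response of a toy first-order plant under PID; returns integrated |error|."""
    kp, ki, kd = gains
    omega, integral, cost = 0.0, 0.0, 0.0
    prev_error = setpoint - omega  # avoids a derivative kick on the first step
    for _ in range(steps):
        error = setpoint - omega
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        u = max(-10.0, min(10.0, u))      # actuator saturation keeps the state bounded
        omega += dt * (u - omega) / tau   # first-order plant: tau * omega' = u - omega
        prev_error = error
        cost += abs(error) * dt
    return cost


def evolve(pop_size=20, generations=15, seed=0):
    """Elitist genetic algorithm: keep the best quarter, refill by crossover and mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=simulate)
        parents = population[: pop_size // 4]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [
                min(hi, max(lo, g + rng.gauss(0.0, 0.1 * (hi - lo))))  # clamped Gaussian mutation
                for g, (lo, hi) in zip(child, BOUNDS)
            ]
            children.append(child)
        population = parents + children
    return min(population, key=simulate)


best = evolve()
print("gains:", best, "cost:", simulate(best))
```

In a full-firmware setup, `simulate` would be replaced by an episode run through the bridge between the firmware and the simulator, with the cost computed from the intercepted state and control signals.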

fabrizioschiano commented 4 years ago

Hi @wil3, thanks for your prompt reply!

Your answer describes exactly the workflow I had in mind. I posted this question because I wanted to be sure that what I thought made sense.

For now, it is just an idea and I am not sure I will have time to go through it. However, if I do, I will certainly try to share my updates with you.

It would then be nice to compare methods such as genetic algorithms with simpler heuristic ones such as Ziegler-Nichols.
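For reference, the classic closed-loop Ziegler-Nichols PID rule mentioned above can be written in a few lines. This is a sketch under stated assumptions: the function name `ziegler_nichols_pid` is hypothetical, and the ultimate gain `ku` and oscillation period `tu` are assumed to have been measured beforehand by raising a P-only gain until the loop oscillates steadily.

```python
def ziegler_nichols_pid(ku, tu):
    """Classic Ziegler-Nichols PID rule from ultimate gain ku and period tu (seconds)."""
    kp = 0.6 * ku
    ti = tu / 2.0   # integral time
    td = tu / 8.0   # derivative time
    # Convert to parallel-form gains: ki = kp / ti, kd = kp * td.
    return kp, kp / ti, kp * td


kp, ki, kd = ziegler_nichols_pid(ku=10.0, tu=0.2)
print(kp, ki, kd)
```

Gains obtained this way could serve as a baseline, or as a starting point for the population of an optimization-based tuner.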

P.S. Your reply answers my question exactly.