Improvement by estimating PID gains

hex-plex commented 3 years ago

Modelling such a nonlinear system as a linear one and using PID is fair enough for small angles, but using it for motion control can make the controller unstable. Why not make the PID gains the action space instead? That should give the system much better stability, since the policy would estimate suitable gain values for the current system configuration.
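To make the suggestion concrete, here is a minimal sketch of what "gains as the action space" could look like. It is not code from this repository: the names `GainPID`, `action_to_gains`, `constant_policy`, and `simulate`, the gain limits, and the toy pendulum model (unit mass, unit length, direct torque input) are all assumptions for illustration. A trained policy would replace `constant_policy` and pick new gains from the observation at every step.

```python
import math


class GainPID:
    """PID controller whose gains are supplied from outside at every step."""

    def __init__(self, dt):
        self.dt = dt
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the very first step

    def step(self, error, gains):
        kp, ki, kd = gains
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


def action_to_gains(action, max_gains=(50.0, 5.0, 10.0)):
    """Map a policy action in [-1, 1]^3 to non-negative (kp, ki, kd).

    The upper limits in max_gains are hypothetical and would need tuning.
    """
    clipped = (min(1.0, max(-1.0, a)) for a in action)
    return tuple(0.5 * (a + 1.0) * g for a, g in zip(clipped, max_gains))


def constant_policy(observation):
    """Placeholder for a learned policy: always returns the same action."""
    return (0.6, -1.0, 0.2)  # maps to kp=40, ki=0, kd=6 with the default limits


def simulate(policy, theta0=0.3, dt=0.01, steps=500):
    """Toy inverted pendulum: theta'' = g * sin(theta) + u, upright at theta = 0."""
    pid = GainPID(dt)
    theta, omega = theta0, 0.0
    for _ in range(steps):
        gains = action_to_gains(policy((theta, omega)))
        u = pid.step(0.0 - theta, gains)  # setpoint is the upright position
        alpha = 9.81 * math.sin(theta) + u
        omega += alpha * dt  # semi-implicit Euler integration
        theta += omega * dt
    return theta


final_theta = simulate(constant_policy)
```

The point of this design is that the PID loop stays in charge of the actuator, so the policy only shapes how aggressively it reacts; clipping the action bounds the gains regardless of what the policy outputs.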

The idea is similar to LQR (Linear Quadratic Regulator), but here the reward function can depend nonlinearly on the inputs, whereas LQR is restricted to a quadratic cost function.
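As a rough illustration of that difference, the sketch below puts a quadratic LQR-style stage cost next to a reward that is nonlinear in the state. Both functions, their weights, and the fall threshold are hypothetical examples, not something taken from this repository.

```python
import math


def lqr_stage_cost(x, u, q=(10.0, 1.0), r=0.1):
    """Quadratic stage cost x^T Q x + r * u^2 with a diagonal Q (weights assumed)."""
    return sum(qi * xi * xi for qi, xi in zip(q, x)) + r * u * u


def shaped_reward(x, u, fall_angle=0.8, r=0.001):
    """Example of a reward that a quadratic cost cannot express.

    x = (theta, omega). The terms below are illustrative assumptions:
    a hard penalty once the pendulum has fallen, a cosine term that
    saturates near upright, and a bounded penalty on angular velocity.
    """
    theta, omega = x
    if abs(theta) > fall_angle:
        return -10.0
    return math.cos(theta) - 0.1 * math.tanh(abs(omega)) - r * u * u
```

LQR minimizes the first kind of function, while an RL policy maximizes a reward, so thresholds and saturating terms like the ones in `shaped_reward` are only available in the second setting.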

Terabyte17 commented 3 years ago

Yeah @hex-plex, that's actually a good idea. I'll start working on it after cleaning up the codebase a bit. Meanwhile, if you want, you could open a pull request to add it. 😀

hex-plex commented 3 years ago

Sure, once my end-semester exams are over. With this change I could trust deploying the controller on my bot or on any inverted pendulum setup.

Terabyte17 / Deep-RL-Based-Controller-for-TurtleBot