nrontsis / PILCO

Bayesian Reinforcement Learning in Tensorflow
MIT License

PILCO for time delay #29

Closed iljastas closed 4 years ago

iljastas commented 5 years ago

Hi, I'm thinking of using PILCO for a system with a significant time delay (dead time, or "Totzeit" in German). Do you know of any publications on Gaussian Processes for such systems? Do you think PILCO is able to control a highly nonlinear system with a dozen inputs and a non-negligible time delay? Thanks, iljastas

kyr-pol commented 5 years ago

Hi @iljastas, I can't recall anything of the sort off the top of my head. If I come across something, I'll link it.

I think it should be doable, or at least worth trying, as long as your system has reasonably smooth dynamics, i.e. no contact dynamics (robotic legs bumping on the ground, for example) or similar discontinuities. Beyond that, it really depends on how precise you need the controller to be, how unstable the system is, and so on.

The swimmer example we have here has a 9-dimensional state space and 2-dimensional controls, giving 11 dimensions in total, and PILCO gets a decent controller without too many data points. Make sure to check out the Jupyter notebook in the examples: it has a list of troubleshooting advice and various small workarounds we used on the examples.
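To make the dimension count concrete for the delayed case, here is a minimal sketch of one common workaround that is not proposed anywhere in this thread: append the actions that are still "in flight" to the observation, so the augmented state is Markov again. Everything here is an assumption for illustration: `DelayedActionWrapper`, `ToyIntegrator`, and `gp_input_dim` are hypothetical names, the `reset`/`step` interface is deliberately simplified, and none of it is part of the PILCO package.

```python
from collections import deque


def gp_input_dim(state_dim, control_dim, delay_steps=0):
    """Input dimension of the dynamics model when the state is augmented
    with the `delay_steps` actions that have not reached the plant yet."""
    return state_dim + delay_steps * control_dim + control_dim


class ToyIntegrator:
    """Hypothetical one-dimensional plant: x <- x + action[0]."""

    def reset(self):
        self.x = 0.0
        return [self.x]

    def step(self, action):
        self.x += action[0]
        return [self.x]


class DelayedActionWrapper:
    """Hypothetical wrapper: applies each action `delay` steps late and
    appends the pending actions to the observation."""

    def __init__(self, env, delay, action_dim):
        if delay < 1:
            raise ValueError("delay must be at least 1 step")
        self.env = env
        self.delay = delay
        self.action_dim = action_dim

    def reset(self):
        # Before the first real action arrives, the plant receives zeros.
        self.pending = deque(
            [[0.0] * self.action_dim for _ in range(self.delay)],
            maxlen=self.delay,
        )
        return self._augment(self.env.reset())

    def step(self, action):
        applied = self.pending[0]          # action issued `delay` steps ago
        self.pending.append(list(action))  # oldest entry is dropped
        return self._augment(self.env.step(applied))

    def _augment(self, obs):
        in_flight = [a for act in self.pending for a in act]
        return list(obs) + in_flight
```

With no delay, `gp_input_dim(9, 2)` reproduces the 11 dimensions of the swimmer example; a hypothetical delay of 3 steps would raise it to 17, so the amount of data the model needs is likely to grow with the delay.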

iljastas commented 5 years ago

Thanks! Maybe I will try it with a Smith predictor.
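For readers unfamiliar with the idea, below is a minimal sketch of a Smith predictor on a toy problem. It assumes a first-order discrete plant with a known dead time, a perfect internal model, and a plain PI controller; it does not use PILCO or a Gaussian Process, and the function name, plant parameters, and gains are hypothetical values chosen only for illustration.

```python
from collections import deque


def simulate_smith_predictor(steps=300, delay=5, a=0.9, b=0.1,
                             kp=2.0, ki=0.5, setpoint=1.0):
    """Simulate y[k+1] = a*y[k] + b*u[k-delay] under a PI controller
    wrapped in a Smith predictor. Requires delay >= 1."""
    y_plant = 0.0   # measured output of the real (delayed) plant
    y_fast = 0.0    # internal model without the dead time
    y_slow = 0.0    # internal model including the dead time
    plant_buf = deque([0.0] * delay, maxlen=delay)  # inputs in transit
    model_buf = deque([0.0] * delay, maxlen=delay)
    integral = 0.0
    outputs = []

    for _ in range(steps):
        # Feedback signal: undelayed prediction, corrected by the
        # mismatch between the plant and the delayed model.
        feedback = y_fast + (y_plant - y_slow)
        error = setpoint - feedback
        integral += error
        u = kp * error + ki * integral

        # Real plant sees the input issued `delay` steps ago.
        u_plant = plant_buf[0]
        plant_buf.append(u)
        y_plant = a * y_plant + b * u_plant

        # Delayed model mirrors the plant; fast model skips the delay.
        u_model = model_buf[0]
        model_buf.append(u)
        y_slow = a * y_slow + b * u_model
        y_fast = a * y_fast + b * u

        outputs.append(y_plant)

    return outputs
```

Because the model is exact in this sketch, the controller effectively regulates the undelayed model and the plant output follows `delay` steps later. With a learned or mismatched model, the correction term `y_plant - y_slow` becomes non-zero, and the achievable performance depends on how good the model and the delay estimate are.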