nrontsis / PILCO

Bayesian Reinforcement Learning in Tensorflow
MIT License

Third output of function 'predict_given_factorizations' in mgpr.py #24

jastfkjg closed 5 years ago

jastfkjg commented 5 years ago

Hi!

I'm trying to read the code, but I can't understand what the third output of predict_given_factorizations in mgpr.py is.

It's noted as inv(s) * input-output covariance, but could you explain that in more detail, and why we need it?

Thanks!

kyr-pol commented 5 years ago

Hi @jastfkjg,

For many of the core functions we closely follow the MATLAB implementation (found here), so the references to the original research apply directly. The most comprehensive is Efficient Reinforcement Learning using Gaussian Processes, the PhD thesis by Marc Deisenroth.

The input-output covariance, cov[x, f(x) | m, S], is the covariance between an input to the GP model, x, and the model's output f(x), where m and S are the mean and covariance of the input distribution. It's needed for long-term predictions, and it's used in our pilco.propagate function and in the corresponding MATLAB functions (pred.m and propagate.m). If you want to see the math in more detail, see section 2.3.3 of the thesis I linked above, with the final equation being 2.70.
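To make the role of that third output concrete, here is a minimal NumPy sketch, not the repository's code. It assumes the model predicts a state change delta = f(x), and it swaps the GP for a hypothetical linear map delta = A x so that every quantity is exact. The names `propagate_linear`, `A`, `V` and the example numbers are illustrative assumptions. The point is that the covariance of the next state x + delta needs the cross term cov[x, delta], which is recovered as S @ V when the model returns V = inv(S) @ cov[x, delta].

```python
import numpy as np


def propagate_linear(m, S, A):
    """One-step propagation of x ~ N(m, S) through x_next = x + A @ x.

    The linear map A is a stand-in for the GP (an assumption made for
    illustration), so the moments below are exact rather than approximated.
    """
    # Moments of the predicted change delta = A @ x.
    M_delta = A @ m
    S_delta = A @ S @ A.T

    # Input-output covariance cov[x, delta], and the same quantity
    # premultiplied by inv(S), which is the form asked about in this issue.
    cov_x_delta = S @ A.T
    V = np.linalg.solve(S, cov_x_delta)  # inv(S) @ cov[x, delta]

    # Next-state moments. Because x and delta are correlated, the covariance
    # of their sum needs the cross terms S @ V and its transpose.
    m_next = m + M_delta
    C = S @ V
    S_next = S + S_delta + C + C.T
    return m_next, S_next, V


# Hypothetical example values.
m = np.array([1.0, -0.5])
S = np.array([[0.5, 0.1],
              [0.1, 0.3]])
A = np.array([[0.2, -0.1],
              [0.0, 0.4]])

m_next, S_next, V = propagate_linear(m, S, A)

# For a linear map the exact answer is (I + A) S (I + A)^T, which the
# cross-term formula reproduces.
T = np.eye(2) + A
print(np.allclose(S_next, T @ S @ T.T))
```

Dropping the `C + C.T` terms would treat the input and the predicted change as independent, which misstates the uncertainty at every step of a long rollout. With a real GP the three outputs come from moment matching rather than from a closed-form linear map, but they are combined in the same way.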

jastfkjg commented 5 years ago

I will try to figure it out. Thanks a lot.