Improving PILCO with Bayesian Neural Network Dynamics Models

zuoxingdong commented 6 years ago

Work to Replicate

Gal, Y., McAllister, R. and Rasmussen, C.E., 2016, April. Improving PILCO with Bayesian neural network dynamics models. In Data-Efficient Machine Learning workshop, ICML.

Motivation

This paper extends PILCO, a very sample-efficient model-based policy search method, by replacing its Gaussian process dynamics model with a Bayesian neural network.

I have an initial attempt in this repo, but it fails to learn a good controller, even though I spent a few months searching for good hyperparameters.

Challenges

Anyone interested in reproducing this algorithm can first have a look at my initial implementation in PyTorch.

I failed to make it work, perhaps because of the following potential problems:

Is the method sensitive to specific hyperparameters?
The BNN in this paper uses Monte Carlo dropout; maybe other BNN approaches would work?
Must the dynamics model be trained sufficiently well at each iteration?
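
For readers unfamiliar with the Monte Carlo dropout idea mentioned above, here is a minimal sketch in plain Python (no PyTorch, so it runs anywhere). It is not the paper's model or my implementation: the tiny one-hidden-layer network, its weights, and the function name `mc_dropout_predict` are all made up for illustration. The only point is that dropout stays switched on at prediction time, and the spread of repeated stochastic forward passes is read as model uncertainty.

```python
import math
import random


def mc_dropout_predict(x, w1, b1, w2, b2, p_drop=0.1, n_samples=200, rng=None):
    """Return (mean, variance) over stochastic forward passes of a toy network.

    Hypothetical toy model: scalar input, tanh hidden layer, scalar output.
    """
    rng = rng or random.Random(0)
    keep = 1.0 - p_drop
    outputs = []
    for _ in range(n_samples):
        # Deterministic hidden activations for this input.
        hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
        # Fresh Bernoulli mask on every pass; surviving units are rescaled
        # by 1/keep (inverted dropout) so the expected activation is unchanged.
        masked = [h / keep if rng.random() < keep else 0.0 for h in hidden]
        outputs.append(sum(w * h for w, h in zip(w2, masked)) + b2)
    mean = sum(outputs) / n_samples
    var = sum((o - mean) ** 2 for o in outputs) / n_samples
    return mean, var


# Made-up weights, purely for demonstration.
W1, B1 = [1.0, -0.5, 0.3], [0.0, 0.1, -0.2]
W2, B2 = [0.7, -1.2, 0.5], 0.05

mean, var = mc_dropout_predict(0.8, W1, B1, W2, B2, p_drop=0.5)
```

With `p_drop=0.0` every pass is identical, so the variance collapses to zero and the mean is the ordinary deterministic prediction; any nonzero dropout rate gives a spread whose size depends on the rate, which is one reason the method can be sensitive to that hyperparameter.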

rougier commented 6 years ago

Do you have access to the original code (did you try to contact the authors)?

zuoxingdong commented 6 years ago

@rougier Yes, I contacted the author around August for help with some technical details. For now, the original code is not accessible.

rougier commented 6 years ago

Did you mean that they answered you but told you that the code is not available?

zuoxingdong commented 6 years ago

@rougier Yes, the author kindly answered my questions about how the gradients are backpropagated through the chain of calls to the policy and dynamics networks. But their original code is not available.
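
To make the "chain of calls" concrete, here is a toy sketch of how a rollout cost depends on the policy parameter through repeated policy and dynamics calls. Everything in it is an assumption for illustration: a scalar linear policy `u = theta * x`, scalar linear dynamics `x' = a * x + b * u`, a quadratic cost, and the function name `rollout_cost_and_grad`. It also carries derivatives forward by hand instead of using reverse-mode autodiff as PyTorch would, but the resulting gradient is the same quantity.

```python
def rollout_cost_and_grad(theta, x0=1.0, a=0.9, b=0.5, horizon=10):
    """Return the rollout cost and its derivative with respect to theta.

    Hypothetical toy system: the policy and dynamics are scalar and linear.
    """
    x, dx = x0, 0.0        # state and d(state)/d(theta)
    cost, dcost = 0.0, 0.0
    for _ in range(horizon):
        u = theta * x              # policy call
        du = x + theta * dx        # product rule: theta acts directly and via x
        x_next = a * x + b * u     # dynamics call
        dx_next = a * dx + b * du  # chain rule through the dynamics
        x, dx = x_next, dx_next
        cost += x * x              # quadratic state cost
        dcost += 2.0 * x * dx
    return cost, dcost


cost, grad = rollout_cost_and_grad(-0.3)

# Sanity check against a central finite difference.
eps = 1e-6
numeric = (rollout_cost_and_grad(-0.3 + eps)[0]
           - rollout_cost_and_grad(-0.3 - eps)[0]) / (2 * eps)
```

The derivative at each step depends on the derivative at the previous step, so an error in the learned dynamics is compounded over the horizon; comparing analytic and finite-difference gradients like this is a cheap check when debugging a longer chain.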

rougier commented 6 years ago

You know you can submit a failed replication if you want. If you choose to do so, during the review, we'll try to contact the author such that they can look at your code and see if you've made an error.

ReScience / call-for-replication