Code: https://github.com/zzyunzhi/asynch-mb
This is an incremental paper that introduces asynchronous data collection and policy improvement, the so-called "interleaving" technique.
Problem:
State-of-the-art algorithms are now able to match the asymptotic performance of model-free methods while being significantly more data-efficient. However, this success has come at a price: state-of-the-art model-based methods require significant computation interleaved with data collection, resulting in run times that take days.
Innovation/Contribution:
In this work, we propose an asynchronous framework for model-based reinforcement learning methods that brings the run time of these algorithms down to just the data collection time.
We characterize the key traits of asynchronous training that improve sample efficiency: policy regularization from interleaving policy learning with model learning, and better data collection from interleaving policy learning with data collection.
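The interleaving idea above can be sketched as concurrent workers sharing a buffer: one process keeps collecting data while another keeps updating the model/policy, instead of strictly alternating the two phases. The following is a minimal illustrative sketch (not the repository's actual implementation); the buffer, worker names, and timing constants are all assumptions for illustration.

```python
# Minimal sketch of asynchronous ("interleaved") training:
# a data-collection worker and a learner run concurrently,
# communicating through a lock-protected shared buffer.
import threading
import random
import time

class SharedBuffer:
    """Thread-safe replay buffer (illustrative stand-in)."""
    def __init__(self):
        self._data = []
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)

    def sample(self, k):
        with self._lock:
            if len(self._data) < k:
                return None
            return random.sample(self._data, k)

    def __len__(self):
        with self._lock:
            return len(self._data)

def collector(buffer, stop):
    # Stands in for environment rollouts with the current policy.
    while not stop.is_set():
        buffer.add((random.random(), random.random()))
        time.sleep(0.001)

def learner(buffer, stop, updates):
    # Stands in for interleaved model fitting + policy improvement.
    while not stop.is_set():
        batch = buffer.sample(4)
        if batch is not None:
            updates.append(sum(s for s, _ in batch) / len(batch))
        time.sleep(0.001)

stop = threading.Event()
buf = SharedBuffer()
updates = []
threads = [threading.Thread(target=collector, args=(buf, stop)),
           threading.Thread(target=learner, args=(buf, stop, updates))]
for t in threads:
    t.start()
time.sleep(0.2)   # let both workers run concurrently for a moment
stop.set()
for t in threads:
    t.join()
# Data kept arriving while policy updates happened, so total wall-clock
# time is bounded by data collection rather than collection + training.
print(len(buf) > 0 and len(updates) > 0)
```

The point of the sketch is only the structure: because collection never blocks on training, wall-clock time is dominated by data collection, which is the paper's claimed speedup.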
Link: Semantic Scholar