chainer / chainerrl

ChainerRL is a deep reinforcement learning library built on top of Chainer.
MIT License

Unified PCL #109

Open ethancaballero opened 7 years ago

ethancaballero commented 7 years ago

See Section 5.1 of the paper for a new, better-performing update to PCL (Unified PCL): https://arxiv.org/pdf/1702.08892.pdf

lyx-x commented 6 years ago

The unified version is already supported by the current implementation (at least in theory). The idea is to use a SharedModel that outputs the Q-values, plus two heads without trainable parameters that derive pi and V from those Q-values using the formulas in the paper.

Here is a quick implementation: https://github.com/lyx-x/chainerrl/blob/ab6cb4f9ff1dd419573d8fa3fc8c05840548d74d/examples/gym/train_pcl_gym.py#L155
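
To illustrate the two parameter-free heads described above, here is a minimal sketch in plain Python (not Chainer, and not the code behind the link). It assumes the relations V(s) = tau * log sum_a exp(Q(s, a) / tau) and pi(a|s) = exp((Q(s, a) - V(s)) / tau) from the paper. The function name `unified_pcl_heads` and the `tau` argument are hypothetical names chosen for this example.

```python
import math


def unified_pcl_heads(q_values, tau=1.0):
    """Derive (pi, V) for one state from its Q-values; no trainable parameters.

    Hypothetical helper for illustration only, not part of ChainerRL.
    """
    scaled = [q / tau for q in q_values]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    # V head: soft maximum of the Q-values at temperature tau.
    v = tau * log_sum_exp
    # pi head: softmax of Q / tau, written in terms of V.
    pi = [math.exp((q - v) / tau) for q in q_values]
    return pi, v


if __name__ == "__main__":
    pi, v = unified_pcl_heads([1.0, 2.0, 3.0], tau=0.5)
    print(pi, v)
```

Because both heads are deterministic functions of Q, only the shared Q model carries parameters, which is what makes a single model sufficient here.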

I believe we can close this issue.