lcswillems / torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
MIT License
190 stars, 64 forks

Support for Off-Policy Actor-Critic Algorithms like ACER #2

Open Riashat opened 5 years ago

Riashat commented 5 years ago

Hi,

Could support for ACER, the off-policy counterpart of A2C, be built on top of this repo?

This is a very useful repo and the one we mostly use, and its compatibility with gym-minigrid tasks is a nice bonus. However, there is no open-source implementation of ACER yet that works with gym-minigrid.

It would be nice to have ACER support in this repo.

I did find one useful implementation that works for ALE tasks, but it has no support for gym-minigrid: https://github.com/belepi93/pytorch-acer

maximecb commented 5 years ago

If it supports OpenAI Gym environments, then making it work with MiniGrid shouldn't be very difficult. You might just need to change the size of convolution layers and create a gym environment wrapper to have the input in the format you want.
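To illustrate the point about convolution sizes, here is a rough sketch of the output-size arithmetic. It assumes MiniGrid's default 7x7 partial view and an Atari-style stack of (kernel, stride) pairs; both layer lists are illustrative assumptions and are not read from pytorch-acer or torch-ac.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial size after one conv layer (no dilation): floor((n + 2p - k) / s) + 1."""
    out = (size + 2 * padding - kernel) // stride + 1
    if out < 1:
        raise ValueError(f"kernel {kernel} does not fit an input of size {size}")
    return out


def stack_out_size(size, layers):
    """Push a square input of side `size` through a list of (kernel, stride) layers."""
    for kernel, stride in layers:
        size = conv_out_size(size, kernel, stride)
    return size


# Atari-style stack for 84x84 frames (assumed for illustration).
ATARI_LAYERS = [(8, 4), (4, 2), (3, 1)]
# Hypothetical small-kernel stack that fits a 7x7 MiniGrid view.
MINIGRID_LAYERS = [(2, 1), (2, 1), (2, 1)]

print(stack_out_size(84, ATARI_LAYERS))    # 7
print(stack_out_size(7, MINIGRID_LAYERS))  # 4
```

Calling `stack_out_size(7, ATARI_LAYERS)` raises a `ValueError`, because the first 8x8 kernel is larger than the 7x7 input. That is why the layer sizes would need to change before such a network can take MiniGrid observations.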

lcswillems commented 5 years ago

Hi,

I don't know ACER. How does it differ from A2C?

It is unlikely I will have time to implement this algorithm. But maybe it would be easy for you. Would you like me to give you more details on the code?

As Maxime mentioned, it might also be easy to get MiniGrid working by making those changes directly in the repo you mention.