ChainerRL supports asynchronous parallel training such as A3C, but not yet synchronous parallel training, where multiple actors interact with their own environments in lockstep. Synchronous training is beneficial in that it is more stable than asynchronous training and can make efficient use of GPU computation by batching the actors' forward passes. It could also be used for multi-agent environments with simultaneous actions.
Since most RL algorithms can support it naturally, we should define a common interface for it.
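As a starting point for discussion, here is a minimal sketch of what such a common interface could look like. The names `VectorEnv`, `BatchAgent`, `batch_act`, and `batch_observe` are hypothetical placeholders, not existing ChainerRL API; resetting finished sub-environments is omitted for brevity.

```python
# A minimal sketch of a possible synchronous-training interface.
# All names here are hypothetical, not part of the current ChainerRL API.
import numpy as np


class VectorEnv:
    """Steps several sub-environments in lockstep (synchronously)."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        # Return a batch of initial observations, one per sub-environment.
        return np.stack([env.reset() for env in self.envs])

    def step(self, actions):
        # Apply one action per sub-environment and collect the results.
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obss, rewards, dones, infos = zip(*results)
        return np.stack(obss), np.asarray(rewards), np.asarray(dones), infos


class BatchAgent:
    """Methods an agent would implement to support synchronous training."""

    def batch_act(self, batch_obs):
        # One forward pass over the whole batch, so a GPU is used efficiently.
        raise NotImplementedError

    def batch_observe(self, batch_obs, batch_reward, batch_done):
        # Update internal state / replay buffers from all actors at once.
        raise NotImplementedError
```

With an interface along these lines, a training loop could call `batch_act` once per step for all actors, which is also where the GPU benefit comes from: one batched forward pass instead of one pass per actor.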