peterzcc / Arena


Refactoring for enabling efficient Async-RL and reducing module dependencies #11

Closed: peterzcc closed this issue 7 years ago

peterzcc commented 7 years ago

Dear all,

I'm refactoring the "multiprocessing" branch of the arena project. The following changes will be made:

  1. Enabling multiprocessing-based Async-RL. I have written a class arena.experiment.Experiment that allows asynchronous execution of parallel actor-learners. The file 'refector_test.py' shows the usage, and it appears to work for a simple test case.

For an experiment with n parallel learners, there will be n 'Actuator' child processes for environment interaction and n 'Agent' Python threads for updating the shared parameters, in addition to the main thread. The game states are communicated between them using multiprocessing.Pipe (a hypothetical wiring sketch follows the list below).

  2. Removing dependencies on specific environments. Previously, our implementation of DQN only considered the Atari environment, and the modules in the training files were tightly coupled. This is bad for maintainability and future development, so I have redesigned a class interface, 'Agent', that can handle general environments and is also compatible with the Async-RL model. For a new algorithm to be developed, the following interface needs to be overridden (a toy implementation is sketched after this list):

```python

def __init__(self, observation_space, action_space, shared_params,
             stats_rx, acts_tx, is_learning, global_t, pid=0, **kwargs)
def act(self, observation)
def receive_feedback(self, reward, done)

```
where is_learning and global_t are two process-shared multiprocessing.Value objects.
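
For concreteness, here is a minimal toy agent implementing this interface. It is only a sketch: the class name RandomAgent and its random-action behavior are mine for illustration, not part of the project.

```python
import random

class RandomAgent(object):
    """Toy Agent: picks uniformly random actions; illustrative only."""

    def __init__(self, observation_space, action_space, shared_params,
                 stats_rx, acts_tx, is_learning, global_t, pid=0, **kwargs):
        self.action_space = action_space
        self.stats_rx = stats_rx        # Pipe end: receives game states
        self.acts_tx = acts_tx          # Pipe end: sends chosen actions
        self.is_learning = is_learning  # process-shared Value flag
        self.global_t = global_t        # process-shared step counter
        self.pid = pid

    def act(self, observation):
        return random.choice(self.action_space)

    def receive_feedback(self, reward, done):
        with self.global_t.get_lock():
            self.global_t.value += 1    # count one interaction step
```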
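
And here is a hypothetical sketch of how the 'Actuator' child processes and 'Agent' threads from item 1 could be wired together with multiprocessing.Pipe, reusing the RandomAgent above. The names actuator_loop and agent_loop are made up for illustration; the actual arena.experiment.Experiment class may be organized differently.

```python
import multiprocessing as mp
import threading
import random

def actuator_loop(stats_tx, acts_rx, n_steps):
    """Child process: stands in for real environment interaction."""
    for _ in range(n_steps):
        stats_tx.send((random.random(), 0.0, False))  # (observation, reward, done)
        acts_rx.recv()                                # block until the action arrives

def agent_loop(agent, n_steps):
    """Thread in the main process: steps the agent on incoming states."""
    for _ in range(n_steps):
        observation, reward, done = agent.stats_rx.recv()
        agent.acts_tx.send(agent.act(observation))
        agent.receive_feedback(reward, done)

if __name__ == "__main__":
    n, n_steps = 2, 5
    is_learning = mp.Value('b', True)   # process-shared flags, as described above
    global_t = mp.Value('i', 0)
    workers = []
    for pid in range(n):
        stats_rx, stats_tx = mp.Pipe(duplex=False)  # Actuator -> Agent (states)
        acts_rx, acts_tx = mp.Pipe(duplex=False)    # Agent -> Actuator (actions)
        agent = RandomAgent(None, [0, 1], None, stats_rx, acts_tx,
                            is_learning, global_t, pid=pid)
        workers.append(mp.Process(target=actuator_loop,
                                  args=(stats_tx, acts_rx, n_steps)))
        workers.append(threading.Thread(target=agent_loop,
                                        args=(agent, n_steps)))
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("global_t =", global_t.value)  # 2 actor-learners * 5 steps = 10
```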

The remaining work will be: