In train.py, I see a central agent, an SL agent, and RL agents. They run on different CPU cores using the multiprocessing package, and the RL agents receive the policy and value network weights from the central agent through a Queue. train_a3c.py looks very similar to train.py. Are both files implementations of the A3C algorithm?
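
To make sure I am describing the structure correctly, here is a minimal sketch of the pattern as I understand it: a central process pushes weights to worker processes through a Queue, and the workers send gradients back. This is not code from the repository. Every name in it (`central_agent`, `rl_worker`, `fake_gradient`, `apply_gradient`, the constants) is hypothetical, the "gradient" is a toy stand-in for a rollout plus backprop, and the central process here waits for every worker in each round, which may not match what train.py or train_a3c.py actually does.

```python
import multiprocessing as mp

NUM_WORKERS = 2       # hypothetical number of RL agent processes
NUM_ROUNDS = 3        # hypothetical number of update rounds
LEARNING_RATE = 0.1   # hypothetical step size


def fake_gradient(weights, worker_id):
    """Toy stand-in for rollout + backprop.

    Gradient of 0.5 * (w - target)^2, with target = worker_id + 1.
    """
    target = worker_id + 1.0
    return [w - target for w in weights]


def apply_gradient(weights, grad, lr=LEARNING_RATE):
    """One plain gradient-descent step on a list of weights."""
    return [w - lr * g for w, g in zip(weights, grad)]


def rl_worker(worker_id, weight_queue, grad_queue):
    """RL agent: receive weights from the central agent, send a gradient back."""
    while True:
        weights = weight_queue.get()
        if weights is None:  # sentinel from the central agent: stop
            break
        grad_queue.put(fake_gradient(weights, worker_id))


def central_agent(num_workers=NUM_WORKERS, num_rounds=NUM_ROUNDS):
    """Central agent: owns the weights and distributes them via Queues."""
    weight_queues = [mp.Queue() for _ in range(num_workers)]
    grad_queue = mp.Queue()
    workers = [
        mp.Process(target=rl_worker, args=(i, weight_queues[i], grad_queue))
        for i in range(num_workers)
    ]
    for p in workers:
        p.start()

    weights = [0.0, 0.0]
    for _ in range(num_rounds):
        # Send the current weights to every worker.
        for q in weight_queues:
            q.put(weights)
        # Collect one gradient per worker and apply each as it arrives.
        for _ in range(num_workers):
            weights = apply_gradient(weights, grad_queue.get())

    # Tell the workers to exit, then wait for them.
    for q in weight_queues:
        q.put(None)
    for p in workers:
        p.join()
    return weights


if __name__ == "__main__":
    print(central_agent())
```

If both files follow roughly this shape, then my question is what distinguishes them, for example whether the central agent waits for all workers before updating (as in this sketch) or applies each worker's gradients asynchronously.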