Learning of different routes

Hello, thank you for making this repository public.

I have started training the PPO agent with 4 processes running in parallel. How do the models stored under each process learn from the paths explored by the other processes, and how is the experience gained on one path shared with the other processes? I would greatly appreciate your help. Thanks!

Sorry, I'm not quite clear on your question. The model under each sub-process is saved at lines 121-123 of train.py. In this work, the local model parameters are identical after each update, so you only need to save the model parameters from process 0. The local models under different sub-processes share their experience by uploading their parameter gradients to the chief process (line 102 in train.py). After receiving these gradients, the global model in chief.py sums them and updates the global parameters. Each sub-process's local model then copies the updated parameter values from the global model (line 110 in train.py).
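To make the gradient-sharing scheme described above concrete, here is a minimal sketch in plain NumPy. It is not the code from train.py or chief.py: the linear model, the function names `local_gradient` and `chief_update`, the learning rate, and the synthetic per-worker data are all assumptions chosen for illustration. It only shows the pattern: each worker computes gradients on its own data, the chief sums them and updates the global parameters, and every worker then copies those parameters back.

```python
import numpy as np

def local_gradient(params, x, y):
    """Gradient of the mean squared error of a linear model on one worker's data."""
    residual = x @ params - y
    return 2.0 * x.T @ residual / len(y)

def chief_update(global_params, worker_grads, lr=0.05):
    """Hypothetical chief step: sum all worker gradients, then take one descent step."""
    total_grad = np.sum(worker_grads, axis=0)
    return global_params - lr * total_grad

rng = np.random.default_rng(0)
num_workers = 4
true_params = np.array([1.0, -2.0, 0.5])

# Each worker only ever sees its own data (standing in for its own route).
worker_data = []
for _ in range(num_workers):
    x = rng.normal(size=(32, 3))
    worker_data.append((x, x @ true_params))

global_params = np.zeros(3)
local_params = [global_params.copy() for _ in range(num_workers)]

for step in range(200):
    # 1. Every worker computes gradients from its own experience.
    grads = [local_gradient(p, x, y) for p, (x, y) in zip(local_params, worker_data)]
    # 2. The chief sums the gradients and updates the global parameters.
    global_params = chief_update(global_params, grads)
    # 3. Every worker overwrites its local parameters with the global ones.
    local_params = [global_params.copy() for _ in range(num_workers)]
```

Because step 3 overwrites every local copy with the same global values, all workers hold identical parameters after each update. That is why saving the model from a single process is enough in a scheme like this, and why experience from one worker's data influences the others even though no raw trajectories are exchanged.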
