sanjay-subramanya opened 8 months ago
Hello, thank you for making this repository public.
I have started the training of the PPO agent with 4 processes running in parallel, and I wanted to know how the models stored under each process learn on paths from other processes. And how the experience learned on one path is shared with the other processes. I would greatly appreciate your help, thanks!
Sorry, I'm not quite clear on your question. The model under each sub-process is saved at lines 121-123 in train.py. In this work, the local model parameters are identical after every update, so you only need to save the model parameters under process 0. The local models in the different sub-processes share their experience by uploading their parameter gradients to the chief process (line 102 in train.py). After receiving these gradients, the global model in chief.py sums them up and updates the global parameters. Then the local model under each sub-process copies the updated parameter values from the global model (line 110 in train.py).
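To make the flow concrete, here is a minimal single-process sketch of the scheme described above: workers upload gradients, the chief sums them and updates the global parameters, and every worker then copies the new global parameters back, so all local models are identical after each update. All names (`chief_update`, `global_params`, the learning rate, and the stubbed gradients) are illustrative assumptions, not code from the repository.

```python
import numpy as np

def chief_update(global_params, worker_grads, lr=0.01):
    """Chief: sum the gradients uploaded by all workers and take one
    gradient-descent step on the global parameters."""
    total_grad = np.sum(worker_grads, axis=0)
    return global_params - lr * total_grad

# Four workers, each with a gradient computed on its own trajectory
# (stubbed here with fixed arrays instead of real PPO gradients).
global_params = np.zeros(3)
worker_grads = [np.array([1.0, 0.0, -1.0]) * (i + 1) for i in range(4)]

# One update step: workers upload gradients -> chief sums and updates the
# global parameters -> each worker copies the new parameters back.
global_params = chief_update(global_params, worker_grads)
local_params = [global_params.copy() for _ in range(4)]

print(global_params)
# All local models now hold identical parameters, which is why saving
# the model from process 0 alone is sufficient.
print(all(np.array_equal(p, global_params) for p in local_params))  # True
```

In the actual repository this exchange happens across separate processes via pipes between train.py and chief.py; the sketch collapses that into one process to show only the sum-gradients-then-sync logic.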