Open RyanRizzo96 opened 4 years ago
Hello,
It seems you are talking about the custom DDPG implementation that OpenAI created for HER. To be honest, that one is quite confusing and has a lot of tricks, which is also why we rewrote HER completely.
the number of parallel rollouts per DDPG agents implemented in stable baselines?
If you explain to me what it is, then I can maybe give you the equivalent. I don't really get what "number of parallel rollouts" means. Is it a number of episodes, or is it a number of parallel agents?
Note that the DDPG implementation in stable-baselines is the one from the original baselines (not the custom one made for HER).
Hi,
Yes, I am talking about the custom DDPG implementation. In Plappert et al. (2018), 38 trajectories were generated in parallel (19 MPI processes, each computing gradients from 2 trajectories and aggregating them).
Their code comment states:
I think that this refers to the set of trajectories simulated in parallel. Maybe the below image will help show what I mean.
In `ddpg.py`, the parameter `nb_rollout_steps` is an integer containing the number of rollout steps. I believe that this is the same as the parameter `T` in OpenAI baselines, which refers to "the time horizon for rollouts" as they put it.

My question is, where is the number of parallel rollouts per DDPG agents implemented in stable baselines? In OpenAI Baselines this value is passed when initializing DDPG as `rollout_batch_size`.

Any suggestions would be appreciated.
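To make concrete what I mean by parallel rollouts, here is a minimal sketch of my understanding. This is not the actual baselines code: `ToyEnv`, `generate_rollouts` and the policy are made-up names for illustration, and `T = 50` is just an assumed value. Each worker steps `rollout_batch_size` environments in lockstep for `T` steps, and the total number of parallel trajectories is the number of MPI workers times `rollout_batch_size`.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment (not a gym or baselines class)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.state = 0.0

    def reset(self):
        self.state = self.rng.uniform(-1.0, 1.0)
        return self.state

    def step(self, action):
        self.state += action
        reward = -abs(self.state)
        return self.state, reward


def generate_rollouts(envs, policy, T):
    """Step every env in lockstep for T steps; return one trajectory per env."""
    observations = [env.reset() for env in envs]
    trajectories = [[] for _ in envs]
    for _ in range(T):
        actions = [policy(obs) for obs in observations]
        for i, (env, action) in enumerate(zip(envs, actions)):
            next_obs, reward = env.step(action)
            trajectories[i].append((observations[i], action, reward))
            observations[i] = next_obs
    return trajectories


T = 50                  # time horizon for rollouts (assumed value)
rollout_batch_size = 2  # parallel rollouts per worker, i.e. per DDPG agent
num_workers = 19        # MPI processes, as in Plappert et al. (2018)

# One worker's share: rollout_batch_size environments stepped together.
envs = [ToyEnv(seed=i) for i in range(rollout_batch_size)]
trajectories = generate_rollouts(envs, policy=lambda obs: -0.5 * obs, T=T)

# Across all workers: 19 * 2 = 38 trajectories generated in parallel.
total_parallel = num_workers * rollout_batch_size
print(len(trajectories), len(trajectories[0]), total_parallel)
```

So in this picture `nb_rollout_steps` would correspond to `T`, and what I am looking for is the equivalent of the length of `envs`.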