Open naeioi opened 4 years ago
What if, in the vectorized environments (say we have 10 parallel environments), we assign each environment a different task?
All environments share a single policy? @ahtsan
Yes
@ahtsan That would be the ideal implementation of this MAML sampler.
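To make the idea in this exchange concrete, here is a minimal sketch of one environment per task stepped in lockstep, with a single shared policy called once per step on the whole batch. Everything in it (`ToyTaskEnv`, `shared_policy`, `rollout_vectorized`, the goal-position tasks) is a hypothetical stand-in for illustration and is not garage's or ProMP's API.

```python
class ToyTaskEnv:
    """Hypothetical 1-D environment; the task is a goal position."""

    def __init__(self, goal):
        self.goal = goal
        self.pos = 0.0

    def reset(self):
        self.pos = 0.0
        return self.pos

    def step(self, action):
        self.pos += action
        reward = -abs(self.goal - self.pos)
        return self.pos, reward


def shared_policy(observations):
    """Assumed toy policy shared by all environments: always move by 1.0."""
    return [1.0 for _ in observations]


def rollout_vectorized(goals, horizon):
    """Step one environment per task in lockstep; return per-task returns."""
    envs = [ToyTaskEnv(goal) for goal in goals]
    observations = [env.reset() for env in envs]
    returns = [0.0] * len(envs)
    for _ in range(horizon):
        # One batched policy call covers every task's environment.
        actions = shared_policy(observations)
        for i, (env, action) in enumerate(zip(envs, actions)):
            observations[i], reward = env.step(action)
            returns[i] += reward
    return returns


task_returns = rollout_vectorized(goals=[1.0, 2.0, 3.0], horizon=3)
```

The sketch only covers the case discussed here, where all environments share one policy. If each task needed its own adapted parameters, the batched policy call would have to take them per environment.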
I found that ProMP does sampling much faster than garage. This is because ProMP has a specialized sampler, called MAML Sampler, that parallelizes sampling at the task level. I think this is also important for garage.
A MAML Sampler is a sampler that samples all tasks in one run (i.e. one call to `sampler.obtain_samples()`). This is contrary to the current sampler design, which handles a single task at a time. A MAML sampler controls task-level scheduling, so it allows parallelism at the task level. Under a MAML sampler, the training loop would need only that one sampling call per iteration, while currently a MAML training loop has to switch tasks outside of the sampler. Although the sampler parallelizes sampling at the rollout level, this has higher overhead than the MAML sampler described above.
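A rough sketch of the two loop shapes being contrasted, under loudly stated assumptions: `PerTaskSampler`, `TaskLevelSampler`, `set_task`, `rollout`, and `toy_policy` are hypothetical names invented for this example, and the thread pool merely stands in for whatever worker mechanism a real sampler would use. It is not the ProMP or garage implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def rollout(policy, task, n_steps=3):
    """Hypothetical rollout: return the rewards collected on one task."""
    return [policy(task, t) for t in range(n_steps)]


class PerTaskSampler:
    """Stand-in for the current design: one task per obtain_samples() call."""

    def __init__(self, policy):
        self.policy = policy
        self.task = None

    def set_task(self, task):
        self.task = task

    def obtain_samples(self):
        return rollout(self.policy, self.task)


class TaskLevelSampler:
    """Stand-in for a MAML-style sampler: all tasks in one call."""

    def __init__(self, policy, tasks, n_workers=4):
        self.policy = policy
        self.tasks = tasks
        self.n_workers = n_workers

    def obtain_samples(self):
        # The sampler owns task scheduling, so tasks can run concurrently.
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            results = pool.map(lambda task: rollout(self.policy, task), self.tasks)
            return dict(zip(self.tasks, results))


def toy_policy(task, t):
    """Assumed toy policy: the reward is just task * t."""
    return task * t


tasks = [1, 2, 3]

# Current style: the training loop switches tasks outside the sampler.
per_task = PerTaskSampler(toy_policy)
samples_by_loop = {}
for task in tasks:
    per_task.set_task(task)
    samples_by_loop[task] = per_task.obtain_samples()

# MAML-sampler style: a single call covers every task.
samples_by_call = TaskLevelSampler(toy_policy, tasks).obtain_samples()
```

Both paths collect the same samples; the difference is where the task loop lives. Moving it inside the sampler is what gives the sampler room to schedule tasks across workers.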