PG642 / multi-sample-factory

High throughput reinforcement learning on clusters
MIT License

Using multiple nodes #4

Closed. KonstantinRamthun closed this issue 3 years ago

KonstantinRamthun commented 3 years ago

Implement the idea from the Sample Factory paper:

We also want to point out that maximizing training efficiency on a single machine is equally important for distributed systems. In fact, Sample Factory can be used as a single node in a distributed setup, where each machine has a sampler and a learner. The learner computes gradients based on locally collected experience only, and learners on multiple nodes can then synchronize their parameter updates after every training iteration, akin to DD-PPO (Wijmans et al., 2020).
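
The synchronization step described in the quote can be illustrated with a small, framework-free sketch. This is not the Sample Factory or DD-PPO implementation. It only simulates, in a single process, the pattern of "each learner computes gradients from local experience, then all learners average them and apply the same update". The linear model, the function names (`local_gradient`, `all_reduce_mean`, `train`), and the toy data are assumptions made for illustration. In a real multi-node setup the averaging would be done by a collective operation such as an all-reduce over the network.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]  # (x, y) pairs collected locally by one node


def local_gradient(params: List[float], batch: Batch) -> List[float]:
    """Gradient of the mean squared error of y = w * x + b on one node's local batch."""
    w, b = params
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return [grad_w, grad_b]


def all_reduce_mean(grads_per_node: List[List[float]]) -> List[float]:
    """Stand-in for a network all-reduce: average each gradient entry over all nodes."""
    num_nodes = len(grads_per_node)
    num_params = len(grads_per_node[0])
    return [sum(g[i] for g in grads_per_node) / num_nodes for i in range(num_params)]


def train(node_batches: List[Batch], steps: int = 2000, lr: float = 0.05) -> List[List[float]]:
    """Run synchronous training; returns the final parameters held by each node."""
    # Every node starts from the same parameters (hypothetical initial values).
    params_per_node = [[0.0, 0.0] for _ in node_batches]
    for _ in range(steps):
        # 1. Each learner computes gradients from its own local experience only.
        grads = [local_gradient(p, b) for p, b in zip(params_per_node, node_batches)]
        # 2. Learners synchronize by averaging their gradients.
        avg_grad = all_reduce_mean(grads)
        # 3. Every node applies the identical update, so the replicas stay in sync.
        for params in params_per_node:
            for i in range(len(params)):
                params[i] -= lr * avg_grad[i]
    return params_per_node


if __name__ == "__main__":
    # Two simulated nodes, each with different local samples of y = 2x + 1.
    batches = [
        [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
        [(3.0, 7.0), (4.0, 9.0)],
    ]
    for node_id, params in enumerate(train(batches)):
        print(node_id, params)
```

Because every node starts from the same parameters and applies the same averaged gradient, the replicas never diverge and no separate parameter broadcast is needed after each iteration. The cost is that every iteration waits for the slowest node.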