didclab / RL-Optimizer

The RL optimization work by Jamil, Elvis, and Jacob in DIDCLAB

Parallel Training #12

Open elrodrigues opened 11 months ago

elrodrigues commented 11 months ago

Extend trainer/runner/environment from #11 for parallel training.

elrodrigues commented 11 months ago

On second thought, parallel training may be achieved without #11 by wrapping our current environment in a 'pool' wrapper.

This pool would have a manager, or a cron-style job triggered by time or episode count, that periodically soft-syncs the worker models into a 'master' model to rapidly accumulate experience, assuming all jobs are normalized identically. The master model would then be redistributed to the env-threads in the pool for further (distributed) training. A rough sketch of this idea follows below.
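
The snippet below is a minimal sketch of one way the pool wrapper could look, assuming a PyTorch policy model. The names `EnvPool`, `soft_sync`, and the `tau`/`sync_every` parameters are illustrative placeholders, not part of the existing RL-Optimizer code.

```python
# Illustrative sketch of an environment pool with periodic soft-sync to a
# 'master' model. Assumes each env-thread trains its own worker copy of the
# policy between sync points.
import copy
import threading

import torch
import torch.nn as nn


def soft_sync(master: nn.Module, workers: list, tau: float = 0.5) -> None:
    """Blend the averaged worker weights into the master model (soft sync)."""
    with torch.no_grad():
        for name, param in master.state_dict().items():
            avg = torch.stack(
                [w.state_dict()[name].float() for w in workers]
            ).mean(dim=0)
            param.copy_((1.0 - tau) * param + tau * avg)


class EnvPool:
    """Wraps N worker copies of a policy, one per env-thread, and periodically
    soft-syncs them into a shared master model, which is then redistributed."""

    def __init__(self, master: nn.Module, num_workers: int, sync_every: int = 10):
        self.master = master
        self.workers = [copy.deepcopy(master) for _ in range(num_workers)]
        self.sync_every = sync_every  # sync trigger based on episode count
        self.episodes = 0
        self.lock = threading.Lock()

    def episode_finished(self) -> None:
        """Called by an env-thread at the end of each episode."""
        with self.lock:
            self.episodes += 1
            if self.episodes % self.sync_every == 0:
                soft_sync(self.master, self.workers)
                # Redistribute the updated master weights to every worker.
                for w in self.workers:
                    w.load_state_dict(self.master.state_dict())
```

A time-based trigger (e.g. a background timer thread calling `soft_sync`) could replace the episode counter; the episode-count version is shown only because it avoids extra scheduling machinery.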