kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License

Fix XVFB slowdown; enable concurrent Ray runs; add search specs #371

Closed kengz closed 5 years ago

kengz commented 5 years ago

Fix XVFB slowdown

When running many trials and sessions, some of them slow down and show low GPU/CPU usage. This happens on both GPU and CPU, with and without Ray, with new and old versions of PyTorch, and even with careful garbage collection. Affected sessions drop to 5 FPS for CartPole and 50 FPS for Atari with venv.
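To spot which sessions are affected, it helps to track frames per second per session. The snippet below is a minimal sketch of such a check, not code from SLM-Lab: `FPSMeter`, `is_slow`, and the threshold value are hypothetical names and numbers chosen for illustration.

```python
import time


class FPSMeter:
    """Hypothetical helper: counts environment frames and reports average FPS."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the meter can be tested without sleeping.
        self._clock = clock
        self._start = clock()
        self.frames = 0

    def tick(self, n=1):
        """Record n environment frames (e.g. one per env step)."""
        self.frames += n

    def fps(self):
        """Average frames per second since the meter was created."""
        elapsed = self._clock() - self._start
        return self.frames / elapsed if elapsed > 0 else 0.0


def is_slow(fps, threshold):
    """Flag a session whose FPS fell below an assumed, user-chosen threshold."""
    return fps < threshold
```

Calling `tick()` once per environment step and logging `fps()` periodically would make a session stuck at single-digit FPS stand out from healthy ones.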

After debugging, the cause appears to be the XVFB wrapper on Linux. The suspicion is that the fake I/O gets overcrowded, which slows down some processes, and they stay slow.
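One way to act on this diagnosis is to wrap a process in `xvfb-run` only when it actually needs a display. The following is a sketch under that assumption and is not necessarily the change made in this PR; `build_command`, `needs_display`, and the example script name are hypothetical. The `-a` flag is `xvfb-run`'s option for picking a free display number automatically.

```python
import shutil


def build_command(cmd, needs_display, xvfb_path=None):
    """Return cmd, prefixed with xvfb-run only when a virtual display is needed.

    Hypothetical helper for illustration; not part of SLM-Lab.
    """
    if not needs_display:
        # Headless training: skip the XVFB wrapper entirely.
        return list(cmd)
    # Look up xvfb-run on PATH unless an explicit path was given.
    xvfb = xvfb_path or shutil.which("xvfb-run")
    if xvfb is None:
        raise RuntimeError("xvfb-run not found on PATH")
    # -a: let xvfb-run choose a free display number automatically.
    return [xvfb, "-a", *cmd]
```

The resulting list can be passed to `subprocess.run`. Keeping the wrapper off the default path means that headless trials never share the fake display.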

Enable concurrent Ray runs

Running multiple Ray runs concurrently is useful for running many searches/benchmarks.

More search specs