eito-fis closed this issue 5 years ago
Hi @eito-fis
Thanks for bringing this to our attention. We will take a look and try to get back to you. This seems to be a macOS-specific issue, and may be due to some kind of power management on that platform.
I've noticed this as well. I wonder if changing the run priority (e.g. https://superuser.com/questions/42817/is-there-any-way-to-set-the-priority-of-a-process-in-mac-os-x) will alleviate the problem. This seems to specifically be an optimization for when you can't see the window.
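For anyone who wants to try the priority idea from Python, here is a minimal sketch using the standard library's `os.setpriority` (Unix only). The helper name `set_niceness` is made up for illustration, and whether a higher priority actually avoids the background slowdown is untested here. Note that lowering the niceness value (i.e. raising priority) normally requires root.

```python
import os


def set_niceness(pid: int, niceness: int) -> bool:
    """Try to set the niceness of process `pid`.

    Hypothetical helper for illustration. Returns True on success and
    False if the OS refuses (e.g. raising priority without root) or the
    process no longer exists.
    """
    try:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)
        return True
    except (PermissionError, ProcessLookupError):
        return False


if __name__ == "__main__":
    # Demonstration on the current process: re-apply its existing
    # niceness, which is always permitted. For a Unity executable you
    # would pass that process's PID and a lower niceness value instead.
    current = os.getpriority(os.PRIO_PROCESS, 0)
    print("niceness unchanged:", set_niceness(os.getpid(), current))
```

The same effect is available from a shell with `renice`, which is what the linked superuser answer discusses.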
I see this when launching the env under VirtualGL as well: about a 50% drop in throughput across 48 environments.
v2.1 seems to solve this issue.
Thanks!
Hi, I have quite a weird problem and haven't had much success in debugging it.
I'm currently using a parallel env wrapper almost identical to that of stable-baselines. The first few synchronous rollouts on 4 environments run fine at ~40 steps per second, but then the rate seems to randomly drop to ~2 steps per second. During this time, CPU usage drops to very little while GPU usage stays the same, and debugging has shown that my parallel env is simply waiting for the obstacle tower env to return a new step.
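A drop like the one described above (from ~40 to ~2 steps per second) can be confirmed by timing each call to the env's step function. The following is only a sketch: `StepRateMeter` is a hypothetical helper, not part of stable-baselines or Obstacle Tower, and `env.step` in the usage note stands for whatever step callable your wrapper exposes.

```python
import time
from collections import deque


class StepRateMeter:
    """Tracks steps per second over a sliding window of recent steps."""

    def __init__(self, window: int = 100):
        # Only the most recent `window` step durations are kept.
        self.durations = deque(maxlen=window)

    def timed(self, step_fn, *args, **kwargs):
        """Call `step_fn`, record how long it took, and return its result."""
        start = time.perf_counter()
        result = step_fn(*args, **kwargs)
        self.durations.append(time.perf_counter() - start)
        return result

    def steps_per_second(self) -> float:
        """Average rate over the window; infinity if nothing measurable yet."""
        total = sum(self.durations)
        return len(self.durations) / total if total > 0 else float("inf")
```

Usage would look like `obs = meter.timed(env.step, action)` inside the rollout loop, logging `meter.steps_per_second()` every so often to see exactly when the rate falls off.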
The strange part is that this slowdown is immediately solved by alt-tabbing to or clicking on all the Unity executables. Afterward, a few more full rollouts run before it slows down again.
Also relevant is the fact that I don't have this issue running on a Debian GCP VM. Whether this is because it's Debian or because it's headless is unclear.
Before I have to sit down and write some auto alt-tabbing script to train, has anyone seen a similar problem, or does anyone have guidance on what to do? I'm on Python 3.6.3, macOS 10.13.6, and Obstacle Tower Env v1.3.