denisyarats / drq

DrQ: Data regularized Q
https://sites.google.com/view/data-regularized-q
MIT License
407 stars, 52 forks

How to limit number of threads spawned? #4

Closed by snailrowen1337 4 years ago

snailrowen1337 commented 4 years ago

I want to run DrQ on a larger node with multiple GPUs and 18 cores (36 with hyperthreading). When I try to run multiple DrQ jobs in parallel on the node, each job seems to spawn 41 threads, which seems to be more than the CPU can handle. Is there any way to limit the number of threads that DrQ launches? Thanks!!

ikostrikov commented 4 years ago

Try launching it this way: OMP_NUM_THREADS=1 python ...
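
For cases where setting the variable on the command line is inconvenient, roughly the same effect can be sketched from inside Python. This is only an illustration, not something from the DrQ code: the helper name `limit_threads` is hypothetical, and including `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` next to `OMP_NUM_THREADS` is an assumption about which numerical backends might be in use. The variables generally need to be set before the libraries that read them are imported.

```python
import os

# Thread-count variables read by common numerical backends.
# Only OMP_NUM_THREADS comes from the thread; the other two are assumptions.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_threads(n=1):
    """Hypothetical helper: cap backend thread pools at n threads.

    Call this at the very top of the entry script, before importing
    numpy, torch, or anything else that starts a thread pool.
    """
    for var in THREAD_ENV_VARS:
        os.environ[var] = str(n)
    try:
        # Optional: if PyTorch is installed, also cap its intra-op threads.
        import torch
        torch.set_num_threads(n)
    except ImportError:
        pass
    # Return what was set so the caller can log it.
    return {var: os.environ[var] for var in THREAD_ENV_VARS}


if __name__ == "__main__":
    print(limit_threads(1))
```

Setting the variable on the command line, as suggested above, remains the simpler option because it is guaranteed to take effect before any import runs.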

snailrowen1337 commented 4 years ago

Thanks, I tried this, and it did reduce the number of threads. However, throughput is still poor when running multiple jobs on a single node. I'm guessing the jobs are just CPU-bound and need more than 4 cores per GPU.
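
One way to test the cores-per-job guess is to give each job its own disjoint set of cores, so the jobs cannot compete for the same ones. The sketch below only builds the command prefixes. It assumes a Linux node with the `taskset` utility, one GPU per job with the GPU index equal to the job index, and logical CPU ids that can be split into contiguous ranges (how those ids map to physical cores and hyperthreads varies by machine). The function names are hypothetical, and the `python ...` part of the command is left elided as in the thread.

```python
def core_ranges(num_cores, num_jobs):
    """Split core ids 0..num_cores-1 into num_jobs contiguous, disjoint ranges."""
    if num_jobs < 1 or num_jobs > num_cores:
        raise ValueError("need between 1 and num_cores jobs")
    base, extra = divmod(num_cores, num_jobs)
    ranges, start = [], 0
    for job in range(num_jobs):
        # The first `extra` jobs get one additional core each.
        size = base + (1 if job < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def launch_prefix(job, first, last):
    """Build the environment and taskset prefix for one job (assumed layout)."""
    threads = last - first + 1
    cpu_list = str(first) if first == last else f"{first}-{last}"
    return (
        f"CUDA_VISIBLE_DEVICES={job} OMP_NUM_THREADS={threads} "
        f"taskset -c {cpu_list}"
    )


if __name__ == "__main__":
    # 18 physical cores shared by 4 jobs, as an example.
    for job, (first, last) in enumerate(core_ranges(18, 4)):
        print(launch_prefix(job, first, last), "python ...")
```

If throughput per job stays the same when the jobs are pinned this way, core contention between jobs is less likely to be the cause.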

denisyarats commented 4 years ago

I think this could be because of MuJoCo rendering, since our algorithm does pretty much everything on the GPU. It may be worth looking into.
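
A quick way to check this kind of hypothesis is to time the environment step (which includes rendering for pixel observations) separately from the agent update. The sketch below is a generic timing pattern, not DrQ code: `fake_env_step` and `fake_update` are hypothetical stand-ins that just sleep, and in a real run they would be replaced by the actual environment step and update calls. GPU work is asynchronous, so timing a real update accurately would also require synchronizing the device, which this sketch does not do.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named section.
totals = defaultdict(float)


@contextmanager
def timed(name):
    """Add the wall-clock time of the enclosed block to totals[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[name] += time.perf_counter() - start


def fake_env_step():
    # Hypothetical stand-in for env.step(...) including rendering.
    time.sleep(0.002)


def fake_update():
    # Hypothetical stand-in for the agent's update step.
    time.sleep(0.001)


def profile(num_steps=20):
    """Return the fraction of measured time spent in each section."""
    totals.clear()
    for _ in range(num_steps):
        with timed("env_step"):
            fake_env_step()
        with timed("update"):
            fake_update()
    whole = sum(totals.values())
    return {name: seconds / whole for name, seconds in totals.items()}


if __name__ == "__main__":
    for name, share in profile().items():
        print(f"{name}: {share:.0%} of measured time")
```

If the environment share grows noticeably when several jobs run on the same node, that would point toward simulation or rendering on the CPU rather than the GPU-side update.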

snailrowen1337 commented 4 years ago

Thanks! I tried running the TF2 implementation of Dreamer, and it did not slow down when running multiple jobs on the same node, which makes me suspect that it's not a MuJoCo issue. If the jobs aren't CPU-bound for you, it might just be something with the node I'm using.