Closed snailrowen1337 closed 4 years ago
Try launching it this way: OMP_NUM_THREADS=1 python ...
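If you start several jobs from a Python script rather than typing the command by hand, the same idea can be sketched like this. It is only a minimal sketch: `thread_limited_env` and `run_with_thread_limit` are hypothetical helper names, setting `MKL_NUM_THREADS` alongside `OMP_NUM_THREADS` is an assumption that may not matter for your build, and the probe command stands in for the real training command, which is elided above.

```python
import os
import subprocess
import sys


def thread_limited_env(num_threads=1, base_env=None):
    """Return a copy of the environment with thread-pool caps set.

    OMP_NUM_THREADS is the variable suggested in the thread.
    MKL_NUM_THREADS is an extra assumption for MKL-backed builds.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        env[name] = str(num_threads)
    return env


def run_with_thread_limit(cmd, num_threads=1):
    """Run cmd as a child process with the capped environment."""
    return subprocess.run(
        cmd,
        env=thread_limited_env(num_threads),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Placeholder command: it only echoes the variable the child process sees.
    probe = [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
    print(run_with_thread_limit(probe).stdout.strip())
```

The variables have to be in the environment before the numerical libraries are loaded, which is why the sketch sets them on the child process instead of changing them from inside an already running job.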
Thanks, I tried this, and it did reduce the number of threads. However, the throughput is still poor when running multiple jobs on a single node. I'm guessing the jobs are just CPU-bound and more than 4 cores per GPU are needed.
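One way to test the CPU-bound guess might be to give each job its own fixed set of cores and see whether throughput changes. The sketch below is an assumption-laden illustration, not something from the DRQ code: `partition_cores` and `pin_current_process` are hypothetical helpers, the 4 jobs in the example is a made-up number, and `os.sched_setaffinity` is only available on Linux. How core ids map to physical cores and hyperthreads depends on the machine.

```python
import os


def partition_cores(num_cores, num_jobs):
    """Split core ids 0..num_cores-1 into num_jobs contiguous, disjoint groups."""
    if num_jobs < 1 or num_jobs > num_cores:
        raise ValueError("need 1 <= num_jobs <= num_cores")
    base, extra = divmod(num_cores, num_jobs)
    groups, start = [], 0
    for job in range(num_jobs):
        # The first `extra` jobs get one additional core each.
        size = base + (1 if job < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def pin_current_process(cores):
    """Restrict the calling process to `cores`; return False where unsupported."""
    if not hasattr(os, "sched_setaffinity"):  # Linux-only API
        return False
    os.sched_setaffinity(0, cores)
    return True


if __name__ == "__main__":
    # 18 cores as in the thread; 4 parallel jobs is a hypothetical example.
    for job_id, cores in enumerate(partition_cores(18, 4)):
        print(job_id, cores)
```

Each job would call `pin_current_process` with its own group at startup. If throughput per job stays poor even with disjoint cores, contention between jobs is probably not the whole story.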
I think this could be because of MuJoCo rendering, as our algorithm does pretty much everything on the GPU. It may be worth looking into.
Thanks! I tried running the TF2 implementation of Dreamer, and it did not slow down when running multiple jobs on the same node, which makes me suspect that it's not a MuJoCo issue. If the jobs aren't CPU-bound for you, it might just be something with the node I'm using.
I want to run DRQ on a larger node with multiple GPUs and 18 cores (36 with hyperthreading). When I try to run multiple DRQ jobs in parallel on the node, each job seems to spawn 41 threads, which seems to be more than the CPU can handle. Is there any way to limit the number of threads that DRQ launches? Thanks!!
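For anyone who wants to check the thread count of a running job and compare it with the available hardware threads, here is a rough sketch. It assumes Linux, where `/proc/<pid>/status` has a `Threads:` field. `native_thread_count` and `oversubscription` are hypothetical helper names, and the 2 jobs in the example is a made-up number.

```python
import os


def native_thread_count(pid=None):
    """Return the OS-level thread count of a process, or None if unavailable.

    Reads the `Threads:` field of /proc/<pid>/status (Linux only).
    """
    pid = os.getpid() if pid is None else pid
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        return None
    return None


def oversubscription(threads_per_job, num_jobs, logical_cpus):
    """Ratio of total threads to logical CPUs; above 1.0 means more threads than CPUs."""
    return threads_per_job * num_jobs / logical_cpus


if __name__ == "__main__":
    print("threads in this process:", native_thread_count())
    # 41 threads per job and 36 logical CPUs are from the post; 2 jobs is hypothetical.
    print("ratio:", round(oversubscription(41, 2, 36), 2))
```

The ratio is only an upper bound on contention, since not every thread is runnable at the same time, but it gives a quick sense of how far over the core count a set of jobs could go.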