Closed: ZhuFengdaaa closed this issue 3 years ago
You could also try to distribute the simulators over different GPUs by changing the `GPU_DEVICE_ID` for each process here. With two 12GB GPUs (one for the simulator threads and another for torch), you should be able to train at least 12 workers (`NUM_PROCESSES=12`) in parallel.
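For anyone wondering what that looks like in practice, here is a minimal sketch of spreading simulator workers across GPUs. The `gpu_for_worker` helper is hypothetical; `GPU_DEVICE_ID`, `NUM_PROCESSES`, and `TORCH_GPU_ID` just mirror the config names discussed in this thread.

```python
# Hypothetical sketch: round-robin each simulator worker onto one of the
# GPUs reserved for simulation, keeping another GPU free for torch.
SIM_GPU_IDS = [0, 1]   # GPUs reserved for simulator threads (assumption)
NUM_PROCESSES = 12

def gpu_for_worker(worker_idx, sim_gpu_ids=SIM_GPU_IDS):
    """Assign a worker to a simulator GPU in round-robin order."""
    return sim_gpu_ids[worker_idx % len(sim_gpu_ids)]

# Each worker process would then build its simulator with this id,
# e.g. something like: config.GPU_DEVICE_ID = gpu_for_worker(rank)
assignments = [gpu_for_worker(i) for i in range(NUM_PROCESSES)]
```

With `SIM_GPU_IDS = [0, 1]`, the 12 workers alternate between the two cards, so each simulator GPU hosts 6 processes.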
This suggestion is helpful! I never thought the simulators could be distributed across different GPUs. Thank you!
I get a CUDA out-of-memory error when I run it. The error log looks like:

I reduced `NUM_PROCESSES` to 4, but it still does not work. My `nvidia-smi` output looks like:

How much CUDA memory is required? Thanks in advance for the help.
A Temporary Solution:
I tried setting `TORCH_GPU_ID=1` so that the network's forward pass runs on a different GPU device. It looks fine now ;)
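For anyone hitting the same OOM, a minimal sketch of that workaround, assuming the training script reads `TORCH_GPU_ID` to pick the torch device (the `model.to(...)` line is illustrative, not the repo's actual code):

```python
# Sketch: keep the simulator processes on GPU 0 and move the torch
# network to another GPU so they no longer compete for memory.
TORCH_GPU_ID = 1  # mirrors the config name above (assumption)

def torch_device(gpu_id=TORCH_GPU_ID):
    """Build the device string torch would use, e.g. 'cuda:1'."""
    return f"cuda:{gpu_id}"

# In the real script this would look something like:
#   model = model.to(torch_device())   # forward pass runs on GPU 1
#   sim_config.GPU_DEVICE_ID = 0       # simulators stay on GPU 0
```

This only relocates the network; if a single GPU still OOMs with all simulators on it, splitting `SIM_GPU_IDS` as in the earlier suggestion is the next step.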