I ran into an issue where only one thread (the reward function) was actually running on the GPU. After some investigation, I tracked down the cause.
set_freest_gpu() is a custom utility written for multi-GPU systems. I'm working on a remote GPU server over SSH, where the function automatically picked the "freest" GPU. That GPU was not actually allocated to me, which caused the following error:
RuntimeError: No CUDA GPUs are available
As a partial workaround, I specified the correct GPU manually with the following command:
export CUDA_VISIBLE_DEVICES=0 # Replace 0 with your actual GPU ID
This helped in some cases, but it is not a complete fix: the underlying problem is which GPUs are allocated and visible to my process in a shared environment.
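One way to avoid the mismatch is to restrict the "freest GPU" search to the GPUs you are allowed to use. The sketch below is not the actual set_freest_gpu() implementation (which is not shown here); parse_visible_devices and pick_freest_allowed_gpu are hypothetical helpers, the free-memory numbers are made up, and it assumes CUDA_VISIBLE_DEVICES holds plain integer IDs rather than GPU UUIDs.

```python
def parse_visible_devices(value):
    """Parse a CUDA_VISIBLE_DEVICES-style string such as "0,2".

    Returns a list of integer GPU IDs, or None if the variable is unset.
    Assumption: the string contains integer IDs only (no GPU UUIDs).
    """
    if value is None:
        return None
    return [int(tok) for tok in value.split(",") if tok.strip()]


def pick_freest_allowed_gpu(free_mib, allowed=None):
    """Return the ID of the allowed GPU with the most free memory.

    free_mib: dict mapping GPU ID -> free memory in MiB.
    allowed:  list of GPU IDs allocated to you, or None for no restriction.
    """
    candidates = {
        gpu: mem
        for gpu, mem in free_mib.items()
        if allowed is None or gpu in allowed
    }
    if not candidates:
        raise RuntimeError("None of the allowed GPUs were found")
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Made-up free-memory readings for a three-GPU server.
    free_mib = {0: 2000, 1: 15000, 2: 8000}
    # In practice this string would come from
    # os.environ.get("CUDA_VISIBLE_DEVICES"); it is hard-coded here.
    allowed = parse_visible_devices("0,2")
    print(pick_freest_allowed_gpu(free_mib, allowed))  # prints 2
```

GPU 1 has the most free memory in this example, but it is skipped because it is not in the allowed list. Note also that CUDA_VISIBLE_DEVICES must be set before the process initializes CUDA; changing it afterwards has no effect on that process.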