Single Thread (Reward Function) Running on GPU

I encountered an issue where only one thread (the reward function) was successfully running on the GPU. After some investigation, I was able to resolve the problem.

The function set_freest_gpu() is a custom-written utility designed specifically for multi-GPU systems. In my case, I’m using a remote SSH GPU server, and the function automatically identified the "freest" GPU. However, this GPU was not actually allocated to me, which caused the following error:

RuntimeError: No CUDA GPUs are available

To partially resolve this, I used the following command to manually specify the correct GPU:

export CUDA_VISIBLE_DEVICES=0 # Replace 0 with your actual GPU ID
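The same restriction can be applied from inside Python rather than from the shell. This is a minimal sketch, not part of Eureka: `pin_gpu` is a hypothetical helper name, and it only has an effect if it runs before any CUDA library initializes in the process.

```python
import os


def pin_gpu(gpu_id: int) -> str:
    """Restrict this process (and any child processes) to one GPU.

    Hypothetical helper for illustration. Call it before importing or
    initializing any CUDA-backed library; a CUDA runtime that is already
    initialized will not pick up the change.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


pinned = pin_gpu(0)  # replace 0 with your actual GPU ID
```

Child processes started afterwards inherit the variable, so this should also cover subprocesses launched from the same script, provided nothing later overwrites it.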

While this command helped in some cases, it's not a complete fix: the underlying problem is that on a shared server the GPU with the most free memory is not necessarily one that has been allocated to you or is visible to your process.
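One possible way to make the selection safer is to pick the freest GPU only among the IDs you are allowed to use. The sketch below is an assumption about how such a guard could look, not how set_freest_gpu() is implemented: `choose_gpu` and its `free_mem_mib` argument are hypothetical, the free-memory numbers are expected to come from elsewhere (for example, parsed from nvidia-smi), and only integer GPU IDs in CUDA_VISIBLE_DEVICES are handled.

```python
import os
from typing import Mapping, Optional


def choose_gpu(free_mem_mib: Mapping[int, int],
               env: Optional[Mapping[str, str]] = None) -> int:
    """Return the GPU ID with the most free memory among the allowed ones.

    free_mem_mib maps a GPU ID to its free memory in MiB (hypothetical
    input). If CUDA_VISIBLE_DEVICES is set, only the integer IDs listed
    there are considered; otherwise every GPU in free_mem_mib is a candidate.
    """
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES", "").strip()

    if visible:
        # Keep only integer IDs; UUID-style entries are ignored in this sketch.
        allowed = {int(x) for x in visible.split(",") if x.strip().isdigit()}
        candidates = {g: m for g, m in free_mem_mib.items() if g in allowed}
    else:
        candidates = dict(free_mem_mib)

    if not candidates:
        raise RuntimeError("No allowed GPU found; check CUDA_VISIBLE_DEVICES")

    # Pick the candidate with the largest amount of free memory.
    return max(candidates, key=candidates.get)


# GPU 1 has more free memory, but only GPU 0 is allocated to this user.
selected = choose_gpu({0: 1000, 1: 8000}, env={"CUDA_VISIBLE_DEVICES": "0"})
```

With this guard, an allocation expressed through CUDA_VISIBLE_DEVICES takes priority over raw free memory, and an empty candidate set fails with an explicit message.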

Source: eureka-research / Eureka, issue #50