DIDSR / MCGPU

GPU-accelerated Monte Carlo x-ray transport code to simulate medical x-ray imaging devices.

mpi run assigns multiple cpus but the same gpu #5

Closed zhiyang-fu closed 1 month ago

zhiyang-fu commented 1 year ago

Hi Andreu,

I would like to run MCGPU using multiple GPUs on a single machine with the following command line. However, the program ends up running as 4 CPU processes that all use the same GPU.

mpirun -n 4 ./MC-GPU_v1.3.x MC-GPU_v1.3.in

I checked the output file, it printed the following relevant lines. The number of processors is still 1. Do you have any thoughts on what caused the problem? Thanks.

```
      CUDA SIMULATION IN THE GPU

    -- INITIALIZATION phase:
          >> MPI run (myId=0, numprocs=1) on processor "**" (time: 23:18:40) <<
              -- Time spent initializing the MPI world (MPI_Barrier): 0.063 s
```
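The `numprocs=1` in that output is the telltale sign: every process believes it is rank 0 of a 1-process world. As a sketch (an assumption about the usual pattern in MPI+CUDA codes, not a quote of MC-GPU's actual logic), multi-GPU MPI programs typically map each rank to a device as `rank % gpu_count`, so four properly launched ranks on a 4-GPU box would spread across GPUs 0-3, while four ranks that each report `myId=0` all pick GPU 0:

```shell
# Hypothetical rank-to-GPU mapping (rank modulo device count);
# gpu_count=4 is an assumed example value.
gpu_count=4
for rank in 0 1 2 3; do
  echo "rank $rank -> GPU $((rank % gpu_count))"
done
```

With a broken MPI launch, the loop above would effectively run with `rank=0` four times, assigning GPU 0 every time.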
andreubs commented 10 months ago

Hi Zhiyang,

Did you figure out how to use multiple GPUs in the end? The mpirun command you used works well for me when the multiple GPUs are in the same computer. You need to provide a host file with the individual IPs if you want to access different computers. The output line `on processor "**"` does not look correct.
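A host file of the kind mentioned above might look like the following (hypothetical placeholder IPs, Open MPI `slots=` syntax):

```shell
# Create an example host file giving 2 MPI slots on each of two machines
# (the IP addresses are placeholders, not real hosts):
cat > hostfile.txt <<'EOF'
192.168.0.10 slots=2
192.168.0.11 slots=2
EOF
cat hostfile.txt
# It would then be passed as: mpirun -n 4 --hostfile hostfile.txt ...
```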

Best regards,

Andreu

zhiyang-fu commented 10 months ago

Hi Andreu,

Thank you so much for the reply. I was able to use multiple GPUs in the end, with one limitation. The mistake was that I was using Anaconda's mpirun (which somehow is the default on my system). After I switched to the system's mpirun, I can use multiple GPUs, except that I cannot use all the GPUs of the server: the program complains that one of them is connected to a display:

==> CUDA: GPU #0 is connected to a display and the CUDA driver would limit the kernel run time. Skipping this GPU!!

Is there a way to bypass the warning and tell the program not to skip this GPU? I do not think the server is actually connected to any display.
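The "limit the kernel run time" wording in the warning matches the CUDA device property `kernelExecTimeoutEnabled` (my assumption about what the program checks, based on the warning text, not on the MC-GPU source). One way to see what the driver itself reports per GPU:

```shell
# Query whether the driver considers a display active on each GPU;
# index and display_active are documented nvidia-smi query fields.
# The fallback keeps the command usable on machines without an NVIDIA driver.
nvidia-smi --query-gpu=index,display_active --format=csv 2>/dev/null \
  || echo "nvidia-smi not available"
```

If `display_active` is `Disabled` on the skipped GPU, the check in the program may be reacting to the watchdog-timeout property rather than to a physically attached display.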

Best regards, Zhiyang