dennisai closed this issue 7 years ago
You can create a cuda context for a specific device. For example you can get the device with the most Gflops like this:
var cudaContext = new CudaContext(CudaContext.GetMaxGflopsDeviceId());
Yes I understand that, but how does that allow me to choose which GPU to allocate a CudaDeviceVariable to? For example, in my code, I might have:
private CudaDeviceVariable<float> d = new CudaDeviceVariable<float>(1);
Nothing in that line, or in the source code, seems to let me refer to a CudaContext. Do I need to call CudaContext.SetCurrent() to specify the current device, so that all subsequent CudaDeviceVariable allocations are made on that device?
Yes. Even though I can't find it in the ManagedCuda source with a quick look, I know from the standard CUDA C++ library that all CUDA calls are issued to the current GPU. The only exception is a call like PeerCopyToDevice, where you have to specify both contexts yourself; this lets you copy from one GPU to another.
Be careful with threading in this case, async calls will also use the current GPU.
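To illustrate, a minimal sketch of a GPU-to-GPU copy with PeerCopyToDevice. The buffer sizes are arbitrary, and the exact PeerCopyToDevice overload varies between ManagedCuda versions, so treat the call shape as an assumption and check the API of your version:

```csharp
using ManagedCuda;

var gpu0 = new CudaContext(0);
var srcOnGpu0 = new CudaDeviceVariable<float>(1024);  // allocated on GPU 0

var gpu1 = new CudaContext(1);
var dstOnGpu1 = new CudaDeviceVariable<float>(1024);  // allocated on GPU 1

// Unlike ordinary copies, both contexts are named explicitly here,
// so the call works regardless of which context is currently bound.
srcOnGpu0.PeerCopyToDevice(gpu0, dstOnGpu1, gpu1);
```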
This would do it:
```csharp
var gpu0 = new CudaContext(0);
//gpu0 is now the current context bound to the calling host/CPU thread
CudaDeviceVariable<float> var1_onGpu0 = new CudaDeviceVariable<float>(123);

var gpu1 = new CudaContext(1);
//gpu1 is now current: gpu0 is now "floating" (unbound), hence you can't access var1_onGpu0 from host
CudaDeviceVariable<float> var2_onGpu1 = new CudaDeviceVariable<float>(123);

//set gpu0 current again:
gpu0.SetCurrent();
CudaDeviceVariable<float> var2_onGpu0 = new CudaDeviceVariable<float>(1234);
```
You can of course also create a dedicated CPU thread for each device, as each CudaContext is bound to one host thread. This avoids switching contexts in your code.
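A sketch of that per-thread pattern, assuming one worker thread per device (the GpuWorker method is hypothetical, not part of ManagedCuda):

```csharp
using System.Threading;
using ManagedCuda;

static void GpuWorker(int deviceId)
{
    // Constructing the context binds it to this thread, so it stays
    // current here without any SetCurrent() calls.
    var ctx = new CudaContext(deviceId);
    var data = new CudaDeviceVariable<float>(1024);
    // ... launch kernels, copy data, etc. ...
    data.Dispose();
    ctx.Dispose();
}

var t0 = new Thread(() => GpuWorker(0));
var t1 = new Thread(() => GpuWorker(1));
t0.Start(); t1.Start();
t0.Join(); t1.Join();
```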
@kunzmi It looks like the latest NuGet package no longer has GetMaxGflopsDeviceId() method. Is there an alternative way to achieve the same result?
Hi, not a direct one.
The problem is that this method was taken from the CUDA samples, and it doesn't work in a future-safe way: if you used an older ManagedCuda version with today's GPUs, you wouldn't get the right results.
The CUDA samples still have it; have a look at C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.2\common\inc\helper_cuda_drvapi.h. It basically just does some checks on the device properties, and you can easily implement a similar function using CudaDeviceProperties. But again, it will fail in a few years with new GPU generations...
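A rough sketch of such a replacement, assuming CudaDeviceProperties exposes the multiprocessor count and clock rate (property names may differ slightly in your ManagedCuda version). Note the simplification: a faithful port of helper_cuda_drvapi.h would also multiply by cores-per-SM, which is looked up per compute capability — that table is exactly the part that goes stale with new GPU generations:

```csharp
using ManagedCuda;

static int GetMaxGflopsDeviceId()
{
    int bestDevice = 0;
    long bestPerf = 0;
    for (int i = 0; i < CudaContext.GetDeviceCount(); i++)
    {
        var props = CudaContext.GetDeviceInfo(i);
        // Rough proxy for peak throughput: SM count * clock rate.
        long perf = (long)props.MultiProcessorCount * props.ClockRate;
        if (perf > bestPerf)
        {
            bestPerf = perf;
            bestDevice = i;
        }
    }
    return bestDevice;
}
```

You can then pass the result straight to the constructor, as in var ctx = new CudaContext(GetMaxGflopsDeviceId());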
If so, is there an example that I can look at? Specifically, I am wondering how one would choose a GPU to allocate a new CudaDeviceVariable to?