pmodels / oshmpi

OpenSHMEM Implementation on MPI
https://pmodels.github.io/oshmpi-www/

GPU: Intel GPU(Ze) memory kind in space_create #103

Closed · minsii closed this 3 years ago

minsii commented 3 years ago
raffenet commented 3 years ago

One difference in the Level Zero API is that there is an explicit device handle argument to the memory allocator. I think we can handle this initially by just getting a handle (and context) at init time and using the same one for all allocations. How does that sound @minsii? We'll need to add an init function like we have in MPL to gather device information.

https://spec.oneapi.com/level-zero/latest/core/api.html#zememallocdevice
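For reference, a minimal sketch of what that init-time setup plus `zeMemAllocDevice` could look like. This is an illustration, not OSHMPI code: error handling is elided, and the single-driver/single-device assumption and the `oshmpi_ze_*` names are mine.

```c
#include <level_zero/ze_api.h>

/* Sketch: grab the first driver/device once at init time and reuse the
 * same context and device handle for every space allocation. */
static ze_context_handle_t oshmpi_ze_context; /* hypothetical globals */
static ze_device_handle_t oshmpi_ze_device;

static void ze_init_device(void)
{
    zeInit(ZE_INIT_FLAG_GPU_ONLY);

    uint32_t ndrivers = 1;
    ze_driver_handle_t driver;
    zeDriverGet(&ndrivers, &driver);                    /* first driver */

    uint32_t ndevices = 1;
    zeDeviceGet(driver, &ndevices, &oshmpi_ze_device);  /* first device */

    ze_context_desc_t cdesc = { ZE_STRUCTURE_TYPE_CONTEXT_DESC, NULL, 0 };
    zeContextCreate(driver, &cdesc, &oshmpi_ze_context);
}

static void *ze_alloc_device(size_t size)
{
    /* Unlike cudaMalloc, the device handle is an explicit argument. */
    ze_device_mem_alloc_desc_t desc = {
        ZE_STRUCTURE_TYPE_DEVICE_MEM_ALLOC_DESC, NULL, 0, 0 /* ordinal */
    };
    void *ptr = NULL;
    zeMemAllocDevice(oshmpi_ze_context, &desc, size,
                     64 /* alignment */, oshmpi_ze_device, &ptr);
    return ptr;
}
```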

pavanbalaji commented 3 years ago

FYI, CUDA needs you to set the device before doing memory allocation too (even though the function itself doesn't take the device argument). So they are equivalent (except for one function call vs. two function calls).
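For comparison, the CUDA pattern being described (a minimal sketch): the device is process-global state set by one call, so the allocation call itself takes no device argument.

```c
#include <cuda_runtime.h>

/* CUDA: select the device first, then allocate on the "current" device. */
static void *cuda_alloc_on_device(int device, size_t size)
{
    void *ptr = NULL;
    cudaSetDevice(device);  /* call 1: select the device */
    cudaMalloc(&ptr, size); /* call 2: allocate on the current device */
    return ptr;
}
```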

raffenet commented 3 years ago

Oh, I see. So in the tests, the user sets the CUDA device before creating the space. I suppose we can extend shmemx_space_config_t to take additional device information for this purpose, like we do with MPL attributes. Still need to modify the internal allocation APIs, but that shouldn't be too bad.
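Something along these lines, perhaps. The field names below are hypothetical sketches of the proposed extension, not the actual OSHMPI definition:

```c
/* Hypothetical extension of shmemx_space_config_t: alongside the memory
 * kind, the user passes device information so OSHMPI does not have to
 * query device state itself. Existing field names are approximate. */
typedef struct {
    size_t sheap_size;     /* size of the space's symmetric heap */
    int num_contexts;      /* number of SHMEM contexts for the space */
    int memkind;           /* e.g., host, CUDA, or Level Zero memory */
    union {
        int cuda_device_id;    /* hypothetical: CUDA device ordinal */
        struct {
            void *device;      /* hypothetical: ze_device_handle_t */
            void *context;     /* hypothetical: ze_context_handle_t */
        } ze;
    } device_info;
} shmemx_space_config_t;
```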

minsii commented 3 years ago

@raffenet here is an example user program for OSHMPI spaces with the CUDA memory kind.

We assume the user sets the CUDA device first, so that shmemx_space_create internally calls cudaMalloc to allocate the buffer on that device.
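A rough reconstruction of such a user program, based on the description above (the exact shmemx_space_* calls and the SHMEMX_MEM_CUDA constant are approximations of the OSHMPI space extension):

```c
#include <shmem.h>
#include <shmemx.h>
#include <cuda_runtime.h>

int main(void)
{
    /* The user selects the CUDA device before creating the space, so
     * shmemx_space_create can simply cudaMalloc on the current device. */
    cudaSetDevice(0);

    shmem_init();

    shmemx_space_config_t config;
    config.sheap_size = 1 << 20;      /* field names approximate */
    config.num_contexts = 1;
    config.memkind = SHMEMX_MEM_CUDA; /* hypothetical constant */

    shmemx_space_t space;
    shmemx_space_create(config, &space);
    shmemx_space_attach(space);       /* make it remotely accessible */

    int *buf = (int *) shmemx_space_malloc(space, 100 * sizeof(int));

    /* ... RMA operations on buf ... */

    shmemx_space_detach(space);
    shmemx_space_destroy(space);
    shmem_finalize();
    return 0;
}
```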

minsii commented 3 years ago

Extending shmemx_space_config_t sounds good to me. We want to minimize the device-specific tasks handled by OSHMPI: the user sets the device, and OSHMPI only allocates the GPU buffer and passes it down to MPI.

minsii commented 3 years ago

Test added via #107