Open stonea opened 1 year ago
If I put a printf
in that if
block and run our tests, sure enough I can reach that code when CHPL_GPU_MEM_STRATEGY=array_on_device
. This is enough to repro it (I'll hit it twice actually):
use GPU;
on here.gpus[0] {
var A: [1..10, 1..10] int;
foreach a in A {
assertOnGpu();
}
}
So that code isn't dead.
My suspicion is that chpl_gpu_is_device_ptr
isn't as straightforward as the name implies and really means something like: "can I access this pointer from the device" (so page-locked memory on the host side would return true).
Ultimately it ends up calling cuPointerGetAttribute
with CU_POINTER_ATTRIBUTE_MAPPED
https://github.com/chapel-lang/chapel/blob/main/runtime/src/gpu/common/cuda-shared.h#LL68C22-L68C43
Which the cuda docs say will set the return value to "1 if this pointer is in a valid address range that is mapped to a backing allocation, 0 otherwise". I'm not sure what a "backing allocation" is, I'm guessing any valid uva pointer?
At least in the following CUDA program this seems to be the behavior:
#include <iostream>
#include <cuda.h>
#define CUDA_SAFE(op) do { \
int _retval = (op); \
if(_retval) { \
const char *_errorname; \
cuGetErrorName((CUresult)_retval, &_errorname); \
printf("[[CUDA ERROR]] line: %d " #op ": %s(%i)\n",__LINE__,_errorname,_retval); \
exit(1); \
} \
} while (0)
int main(int argc, char *argv[]) {
// Initialize CUDA
CUDA_SAFE( cuInit(0));
CUcontext ctx;
CUdevice dev;
CUDA_SAFE( cuDevicePrimaryCtxRetain(&ctx, 0));
CUDA_SAFE( cuCtxPushCurrent(ctx));
CUDA_SAFE( cuCtxGetDevice(&dev));
// Malloc some page-locked memory
void* mem = malloc(0xFF);
CUdeviceptr ptr = 0;
CUDA_SAFE( cuMemHostRegister(mem, 0xFF, CU_MEMHOSTREGISTER_PORTABLE));
unsigned int res;
// And report on it
CUDA_SAFE( cuPointerGetAttribute(&res, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, (CUdeviceptr)mem));
printf("VALUE IS %d\n", res);
// Malloc some non page-locked memory And report on it
void* notPageAlloced = malloc(0xFF);
CUDA_SAFE( cuPointerGetAttribute(&res, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, (CUdeviceptr)notPageAlloced));
printf("VALUE IS %d\n", res);
return 0;
}
// If run prints:
// VALUE IS 1
// [[CUDA ERROR]] line: 35 cuPointerGetAttribute(&res, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, (CUdeviceptr)notPageAlloced): CUDA_ERROR_INVALID_VALUE(1)
So anyway, if this is really the intent of chpl_gpu_is_device_ptr
I think we should name it something less likely to cause confusion like chpl_gpu_is_ptr_device_accessible
with comments explaining when and when something isn't "device accessible".
@e-kayrakli, does this sound right or are you aware of some deeper meaning to chpl_gpu_is_device_ptr
?
So anyway, if this is really the intent of chpl_gpu_is_device_ptr I think we should name it something less likely to cause confusion like chpl_gpu_is_ptr_device_accessible with comments explaining when and when something isn't "device accessible".
That makes sense. I am not super happy about is_device_ptr
/is_host_ptr
functions anyways. That assertion that partially led to this confusion can also be removed from there, I think.
Based off this review comment https://github.com/chapel-lang/chapel/pull/21644#discussion_r1116235661 I believe the body of this if statement is dead code:
https://github.com/chapel-lang/chapel/blob/e94f7dfe1e7efe33fd7f7cf4cf1ec620b959c2ae/runtime/src/gpu/cuda/gpu-cuda.c#L431
Note that we assert that it's a device pointer above and then check to see if its a host pointer?
We should probably just remove it.
Note this should be updated for AMD version as well.