There is an issue in the OMPToolsInterface project when running the TotalView debugger with OpenMP GPU targets. See the program tx_omp_target_generic.c.
See the screenshot Initialize.png, where openmp/libompd/src/omp-debug.cpp line 146 is failing.
This code used to look like this (and did not fail):

    ret = TValue(process_handle->context, "ompd_CudaContextArray")
              .cast("ompd_cuda_context_ptr_t", 1)
              .getArrayElement(i)
              .castBase(ompd_type_long_long)
              .getValue(cuda_ctx);

but now looks like this:

    ret = TValue(process_handle->context, "ompd_CudaDeviceDataArray")
              .cast("DeviceDataTy", 1)
              .getArrayElement(i)
              .access("Context")
              .castBase(ompd_type_long_long)
              .getValue(cuda_ctx);

and fails.
It is failing inside the getArrayElement() call, which calls dereference().
I attached some screenshots. I am debugging the program shown in the screenshot "Pragma omp target.png" at line 15. The system has a single CUDA device.
The screenshots are of the TotalView debugger as it processes this line.
Initialize.png
Here the TValue call chain for ompd_CudaDeviceDataArray is being executed. In the cascaded call sequence, getArrayElement(i) is failing. Note that there is one CUDA device on this system, so i=0 in the single iteration of the loop.
Line 153 detects the failure, so line 156 is never reached. Note that the failure actually surfaces in the getValue(cuda_ctx) call, although the root cause is the earlier undetected failure in getArrayElement().
GetArrayElement.png
Here dereference() is failing, but its return code is not being checked. Later, getSize on line 456 is called and returns success, so getArrayElement() ends up returning ompd_rc_ok. However, note that the ret structure has its symbolAddr fields set to zero, which is not correct.
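To make the masking described above concrete, here is a minimal sketch in C++. It is a toy model, not the libompd source: the names `ToyStatus`, `ToyValue`, `elementUnchecked`, and `elementChecked` are hypothetical, and the "memory read" is simulated. It only shows how a later successful step can overwrite an earlier error code, leaving an "ok" result with a zero address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are NOT the libompd types.
enum ToyStatus { kToyOk = 0, kToyUnsupported = 1 };

struct ToyValue {
  std::uint64_t address = 0;
  ToyStatus status = kToyOk;

  bool gotError() const { return status != kToyOk; }

  // Simulated pointer load: a zero address cannot be followed.
  ToyValue dereference() const {
    if (gotError())
      return *this;
    ToyValue ret = *this;
    if (address == 0) {
      ret.status = kToyUnsupported;
      return ret;
    }
    // Toy "memory": pretend the pointee lives 0x100 bytes past the pointer.
    ret.address = address + 0x100;
    return ret;
  }

  // Unchecked variant: the status from dereference() is ignored and then
  // overwritten, modelling a later call that reports success.
  ToyValue elementUnchecked(int i, std::uint64_t elemSize) const {
    ToyValue ret = dereference();
    ret.status = kToyOk; // the earlier error is lost here
    ret.address += static_cast<std::uint64_t>(i) * elemSize;
    return ret;
  }

  // Checked variant: stop at the first error so the caller sees it.
  ToyValue elementChecked(int i, std::uint64_t elemSize) const {
    ToyValue ret = dereference();
    if (ret.gotError())
      return ret;
    ret.address += static_cast<std::uint64_t>(i) * elemSize;
    return ret;
  }
};
```

In this model, calling `elementUnchecked(0, 8)` on a value whose address is zero returns a value that reports success but still has address zero, which mirrors the symptom seen in the screenshot. The checked variant returns the error immediately instead.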
Screen Shot Dereference.png
Here line 329 has set errorCode = ompd_rc_unsupported. This is because the call at line 322 returned ret.symbolAddr.address = 0, which in turn happened because tmpAddr.address was 0 after the call at line 316. See ScreenShot Der2.png for the value of tmpAddr at line 322.
ScreenShot Der2.png shows that this.symbolAddr.address was 0x…4280, which resulted in tmpAddr in the dereference() step above getting a zero. Apparently the call at line 316 was given a bad symbolAddr.
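The chain of events in dereference() (read the pointer stored at the symbol address, then treat a zero result as unsupported) can be sketched as follows. This is a toy model under stated assumptions, not the libompd implementation: `ToyMemory`, `DerefStatus`, `DerefResult`, and `dereferenceAt` are hypothetical names, and target memory is simulated with a `std::map`. The sketch separates "the read itself failed" from "the read succeeded but returned a null pointer", which is the distinction that matters when the symbol address passed in is wrong.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical names for illustration only; not part of libompd.
enum DerefStatus { kDerefOk, kDerefReadError, kDerefNullPointer };

// Toy target memory: maps an address to the pointer value stored there.
using ToyMemory = std::map<std::uint64_t, std::uint64_t>;

struct DerefResult {
  DerefStatus status;
  std::uint64_t target; // address the pointer refers to, 0 on failure
};

// Read the pointer stored at symbolAddr and classify the outcome.
DerefResult dereferenceAt(const ToyMemory &mem, std::uint64_t symbolAddr) {
  auto it = mem.find(symbolAddr);
  if (it == mem.end())
    return {kDerefReadError, 0}; // nothing readable at this address
  if (it->second == 0)
    return {kDerefNullPointer, 0}; // read worked, but the pointer is null
  return {kDerefOk, it->second};
}
```

With a map that holds a zero at one address and a non-zero pointer at another, `dereferenceAt` reports `kDerefNullPointer` for the first and `kDerefOk` for the second. An address that is not in the map at all reports `kDerefReadError`.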
On a side note: TValue Constructor.png
I found this constructor odd, since it did not record the typeName string in the object (in this case ompd_CudaDeviceDataArray). See the screenshot. The typeName ended up being 0x0 after construction, even though the symbolAddr.segment and symbolAddr.address structure fields were set to the values for ompd_CudaDeviceDataArray.
Larry Burley
The new code was pulled around Oct 11:

    git clone https://github.com/OpenMPToolsInterface/llvm-project.git
    cd llvm-project
    git checkout ompd-tests
The test program is:

    // Testing generic mode of nvptx devRtl
    #include <stdio.h>

    #pragma omp declare target
    void test_breakpoint() { asm(""); }
    #pragma omp end declare target

    void vec_mult(int N) {
      int i;
      float p[N], v1[N], v2[N];
      //init(v1, v2, N);
    #pragma omp target map(v1, v2, p)
      {
        test_breakpoint();
    #pragma omp parallel for
        for (i = 0; i < N; i++) {
          test_breakpoint();
          p[i] = v1[i] * v2[i];
        }
        test_breakpoint();
      }
      //output(p, N);
    }

    int main() {
      printf("calling vec_mul...\n");
      vec_mult(64);
      printf("done\n");
      return 0;
    }