ROCm / HIPIFY

HIPIFY: Convert CUDA to Portable C++ Code
https://rocm.docs.amd.com/projects/HIPIFY/en/latest/
MIT License

[HIPIFY] Errors when processing the NVIDIA sample "simpleSurfaceWrite" #140

Closed wuren2020 closed 2 years ago

wuren2020 commented 4 years ago

I ran hipify-clang /path/to/NVIDIA_CUDA-10.2_Samples/0_Simple/simpleSurfaceWrite/simpleSurfaceWrite.cu --skip-excluded-preprocessor-conditional-blocks -- -I/path/to/NVIDIA_CUDA-10.2_Samples/common/inc and got the following errors:

warning: Unknown CUDA version 10.2. Assuming the latest supported version 10.1 [-Wunknown-cuda-version]
/tmp/simpleSurfaceWrite.cu-8cd6ee.hip:226:21: warning: CUDA identifier is deprecated.
    checkCudaErrors(cudaMemcpyToArray(cuArray,
                    ^
/tmp/simpleSurfaceWrite.cu-8cd6ee.hip:68:5: error: use of undeclared identifier 'surf2Dwrite'
    surf2Dwrite(gIData[y * width + x],
    ^
In file included from <built-in>:1:
In file included from /usr/lib/llvm-10/lib/clang/10.0.1/include/__clang_cuda_runtime_wrapper.h:324:
/usr/local/cuda-10.2/include/texture_indirect_functions.h:145:4: error: use of undeclared identifier '__nv_tex_surf_handler'
   __nv_tex_surf_handler("__itex2D", ptr, obj, x, y);
   ^
/usr/local/cuda-10.2/include/texture_indirect_functions.h:154:3: note: in instantiation of function template specialization 'tex2D<float>' requested here
  tex2D(&ret, texObject, x, y);
  ^
/tmp/simpleSurfaceWrite.cu-8cd6ee.hip:96:29: note: in instantiation of function template specialization 'tex2D<float>' requested here
    gOData[y * width + x] = tex2D<float>(tex, tu, tv);
                            ^
1 warning and 2 errors generated when compiling for host.
Error while processing /tmp/simpleSurfaceWrite.cu-8cd6ee.hip.

emankov commented 3 years ago

The errors occur because the __nv_tex_surf_handler built-in is still not implemented in clang. Here is the latest activity on the subject: https://reviews.llvm.org/D76365. Marking this as a clang-specific bug.

emankov commented 3 years ago

Duplicate of #134

emankov commented 2 years ago

The original bug is https://bugs.llvm.org/show_bug.cgi?id=26400, which was not fixed in clang until September 2021, by https://reviews.llvm.org/D110089.

hipify-clang should therefore be built against the latest LLVM trunk (currently 14.0.0git).

The command line to test:

hipify-clang --print-stats --cuda-path="c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.5" "c:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.5/0_Simple/simpleSurfaceWrite/simpleSurfaceWrite.cu" --skip-excluded-preprocessor-conditional-blocks -I"c:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.5\common\inc" -- -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH

Output:

C:\Users\TT\AppData\Local\Temp\simpleSurfaceWrite.cu-ee3765.hip:228:21: warning: CUDA identifier is deprecated.
    checkCudaErrors(cudaMemcpyToArray(cuArray,
                    ^
1 warning generated when compiling for host.

[HIPIFY] info: file 'c:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.5/0_Simple/simpleSurfaceWrite/simpleSurfaceWrite.cu' statistics:
  CONVERTED refs count: 48
  UNCONVERTED refs count: 0
  CONVERSION %: 100.0
  REPLACED bytes: 981
  TOTAL bytes: 11425
  CHANGED lines of code: 41
  TOTAL lines of code: 318
  CODE CHANGED (in bytes) %: 8.6
  CODE CHANGED (in lines) %: 12.9
  TIME ELAPSED s: 12.11
[HIPIFY] info: CONVERTED refs by type:
  device: 3
  memory: 7
  texture: 3
  surface: 2
  device_function: 4
  include_cuda_main_header: 1
  type: 13
  numeric_literal: 11
  define: 1
  kernel_launch: 3
[HIPIFY] info: CONVERTED refs by API:
  CUDA RT API: 48
[HIPIFY] info: CONVERTED refs by names:
  cosf: 2
  cudaAddressModeWrap: 2
  cudaArray: 1
  cudaArraySurfaceLoadStore: 1
  cudaBoundaryModeTrap: 1
  cudaChannelFormatDesc: 1
  cudaChannelFormatKindFloat: 1
  cudaCreateChannelDesc: 1
  cudaCreateSurfaceObject: 1
  cudaCreateTextureObject: 1
  cudaDestroySurfaceObject: 1
  cudaDestroyTextureObject: 1
  cudaDeviceProp: 1
  cudaDeviceSynchronize: 2
  cudaFilterModeLinear: 1
  cudaFree: 1
  cudaFreeArray: 1
  cudaGetDeviceProperties: 1
  cudaLaunchKernel: 3
  cudaMalloc: 1
  cudaMallocArray: 1
  cudaMemcpy: 2
  cudaMemcpyDeviceToHost: 1
  cudaMemcpyHostToDevice: 2
  cudaMemcpyToArray: 1
  cudaReadModeElementType: 1
  cudaResourceDesc: 4
  cudaResourceTypeArray: 2
  cudaSurfaceObject_t: 2
  cudaTextureDesc: 2
  cudaTextureObject_t: 2
  cuda_runtime.h: 1
  sinf: 2
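
For readers who want to sanity-check the statistics above, here is a minimal sketch in Python that recomputes the reported percentages from the raw counts. It assumes each percentage is a plain ratio rounded to one decimal place; that is an assumption about how the figures relate, not a description of how HIPIFY computes them internally. The variable and function names are hypothetical.

```python
# Figures copied from the --print-stats output above.
replaced_bytes, total_bytes = 981, 11425
changed_lines, total_lines = 41, 318
converted, unconverted = 48, 0

# "CONVERTED refs by type" counts from the same output.
refs_by_type = {
    "device": 3, "memory": 7, "texture": 3, "surface": 2,
    "device_function": 4, "include_cuda_main_header": 1,
    "type": 13, "numeric_literal": 11, "define": 1, "kernel_launch": 3,
}

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place.

    Assumption: the report uses a plain ratio with this rounding.
    """
    return round(100.0 * part / whole, 1)

conversion_pct = percent(converted, converted + unconverted)
bytes_pct = percent(replaced_bytes, total_bytes)
lines_pct = percent(changed_lines, total_lines)
type_total = sum(refs_by_type.values())

print(conversion_pct, bytes_pct, lines_pct, type_total)
# prints: 100.0 8.6 12.9 48
```

Under that assumption the recomputed values match the reported CONVERSION %, CODE CHANGED (in bytes) %, and CODE CHANGED (in lines) %, and the per-type counts add up to the 48 converted references.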