ROCm / clr

[Issue]: missing hip/nvidia_detail/nvidia_hip_runtime.h #44

Open hpjeonGIT opened 5 months ago

hpjeonGIT commented 5 months ago

Problem Description

Hi, I am testing HIP on an Intel CPU workstation with an NVIDIA GPU. I was able to build hip/clr as shown in the AMD documentation (after running dos2unix on all sources), but testing square.cu fails with an error saying that "hip/nvidia_detail/nvidia_hip_runtime.h" is missing. It looks like there is only a hip/amd_detail folder. Where can I find the nvidia_detail folder?

Operating System

RHEL8.8

CPU

Intel Xeon

GPU

AMD Instinct MI300X, AMD Radeon VII

ROCm Version

ROCm 6.0.0

ROCm Component

clr

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

The GPU selection above is fake; there is no option for an NVIDIA GPU.

hpjeonGIT commented 5 months ago

The folder seems to be available in ROCm 5.7 (but not in 6.0). I am closing this issue.

ipatix commented 5 months ago

I ran into exactly the same problem. The nvidia_detail folder was apparently removed in rocm-6.0.0, which essentially broke NVIDIA builds for that release.

Did this get moved to somewhere else or was there an accident during development?

Edit: Okay, I checked the commit history and found commit e8a52205e6517c4103ebb811c509027f2ef824d4.

Apparently the NVIDIA code has been moved to https://github.com/ROCm/hipother. Yet this isn't documented anywhere, that repository has no README, and the move isn't mentioned in the build instructions. I would really appreciate some additional info on this. In the meantime, I guess I'll have to use rocm-5.7.1.

torrance commented 2 months ago

Thanks @ipatix for your comment. I've spent hours trying to debug the CMake build scripts for hip to understand why I was missing nvidia_detail.

AMD: Please fix your documentation!

pvelesko commented 2 months ago

@ipatix Thanks for the pointer. I can confirm that HIP on an NVIDIA GPU still works, but I think hipcc/cmake needs to be updated to require the include:

➜  hip-nvidia ./clr/build/install/bin/hipcc ./MatrixMul.cpp
In file included from ./MatrixMul.cpp:34:
/home/pvelesko/hip-nvidia/clr/build/install/include/hip/hip_runtime.h:64:10: fatal error: hip/nvidia_detail/nvidia_hip_runtime.h: No such file or directory
   64 | #include <hip/nvidia_detail/nvidia_hip_runtime.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
➜  hip-nvidia find ./ -name "nvidia_hip_runtime.h"
➜  hip-nvidia
➜  hip-nvidia git clone https://github.com/ROCm/hipother.git
Cloning into 'hipother'...
remote: Enumerating objects: 790, done.
remote: Counting objects: 100% (790/790), done.
remote: Compressing objects: 100% (136/136), done.
remote: Total 790 (delta 225), reused 778 (delta 213), pack-reused 0
Receiving objects: 100% (790/790), 156.81 KiB | 1.41 MiB/s, done.
Resolving deltas: 100% (225/225), done.
➜  hip-nvidia
➜  hip-nvidia find ./ -name "nvidia_hip_runtime.h"
./hipother/hipnv/include/hip/nvidia_detail/nvidia_hip_runtime.h
➜  hip-nvidia find ./ -name "nvidia_hip_runtime.h"
➜  hip-nvidia ./clr/build/install/bin/hipcc ./MatrixMul.cpp -I./hipother/hipnv/include
➜  hip-nvidia ./a.out
Device name NVIDIA GeForce RTX 3070
Running 1 iterations
hipLaunchKernel 0 time taken: 12.0893
hipLaunchKernel BEST TIME: 12.0893
GPU real time taken(ms): 13.4561
matrixMultiplyCPUReference time taken(ms): 3233.36
Verification PASSED!
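
The manual steps in the transcript above (clone hipother, locate the header, pass its include root to hipcc with -I) could be wrapped in a small shell helper. This is only a sketch: the function name `find_hipnv_include` is hypothetical and not part of hipcc or ROCm, and it assumes a hipother checkout already exists somewhere under the directory you pass in.

```shell
#!/bin/sh
# Hypothetical helper (not part of ROCm): find the include root that
# contains hip/nvidia_detail/nvidia_hip_runtime.h under a given directory.

# find_hipnv_include ROOT
# Prints the include root on success; returns non-zero if the header
# is not found anywhere under ROOT.
find_hipnv_include() (
    root="${1:-.}"
    # Take the first match only, in case several checkouts are present.
    header="$(find "$root" -type f \
        -path '*/hip/nvidia_detail/nvidia_hip_runtime.h' 2>/dev/null | head -n 1)"
    [ -n "$header" ] || exit 1
    # Strip the trailing header path to get the directory to pass to -I.
    printf '%s\n' "${header%/hip/nvidia_detail/nvidia_hip_runtime.h}"
)

# Usage, mirroring the hipcc command in the transcript above:
#   inc="$(find_hipnv_include ./hipother)" &&
#       ./clr/build/install/bin/hipcc ./MatrixMul.cpp -I"$inc"
```

With the layout shown in the transcript, the helper would print the path ending in hipother/hipnv/include, which is the same directory that was passed to -I by hand.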

mredenti commented 2 weeks ago

I really do not understand why this has been moved to a separate, lonely repo... AMD, please fix your documentation as well.

mangupta commented 2 weeks ago

I checked https://github.com/ROCm/HIP/blob/develop/docs/install/build.rst#building-the-hip-runtime. It mentions - Starting in ROCM 6.1, a new repository ``hipother`` is added to ROCm, which is branched out from HIP. ``hipother`` provides files required to support the HIP back-end implementation on some non-AMD platforms, like NVIDIA. I do see some formatting issues with the content which we will correct. Could you point to the link which you are referring to so that we can ensure that any bad copies can be removed?

mredenti commented 2 weeks ago

> I checked https://github.com/ROCm/HIP/blob/develop/docs/install/build.rst#building-the-hip-runtime. It mentions - Starting in ROCM 6.1, a new repository hipother is added to ROCm, which is branched out from HIP. hipother provides files required to support the HIP back-end implementation on some non-AMD platforms, like NVIDIA. I do see some formatting issues with the content which we will correct. Could you point to the link which you are referring to so that we can ensure that any bad copies can be removed?

https://rocm.docs.amd.com/projects/HIP/en/docs-6.0.0/developer_guide/build.html#:~:text=All%20tests%20passed-,Build%20HIP%20on%20NVIDIA%20platform,-%23

mangupta commented 2 weeks ago

@mredenti: That link is only valid for ROCm 6.0. Since you are using a newer source, you should refer to the latest build instructions (https://rocm.docs.amd.com/projects/HIP/en/latest/install/build.html) or the ROCm 6.1 build instructions (https://rocm.docs.amd.com/projects/HIP/en/docs-6.1.0/install/build.html).