GPUOpen-LibrariesAndSDKs / HIPRTSDK

hiprtBuildGeometry access violation #7

Closed nolmoonen closed 1 year ago

nolmoonen commented 2 years ago

Hi, I'm trying to run the tutorials on an RTX 2070 under Windows. I am using CUDA 11.1, so to get the 00_context_creation tutorial running I had to add "nvrtc64_111_0.dll" to nvrtc_paths in cuewNvrtcInit of cuew.cpp in the Orochi dependency (and add the directory containing that DLL to the PATH). That tutorial runs fine now.
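The workaround above amounts to extending a list of candidate library names that are tried in order until one loads. A minimal sketch of that pattern follows; it is not Orochi's actual code. The helper name `firstLoadable` is hypothetical, and the loader is passed in as a callback so the sketch does not depend on any platform API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Orochi or cuew): tries each candidate
// library name in order and returns the first one the loader accepts.
// Returns an empty string when none of the candidates can be loaded.
std::string firstLoadable(
    const std::vector<std::string>& candidates,
    const std::function<bool(const std::string&)>& tryLoad) {
    for (const std::string& name : candidates) {
        if (tryLoad(name)) {
            return name;
        }
    }
    return std::string();
}
```

In a real loader the callback would wrap the platform call that opens a shared library. With a fixed candidate list, a CUDA version whose DLL name is not listed is simply never found, which is why the name had to be added by hand here.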

However, I cannot run the other tutorials: an access violation seems to occur inside hiprtBuildGeometry. When I set the log level to something high (10), I get the following output:

executing on NVIDIA GeForce RTX 2070
FastBuild::getGeometryBuildTempBufferSize
FastBuild::createGeometry
FastBuild::buildGeometry

and the error in VS2019 is

'01_geom_intersection64D.exe' (Win32): Loaded 'E:\HIPRTSDK\tutorials\dist\bin\Debug\01_geom_intersection64D.exe'. Symbols loaded.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\version.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcrt.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'E:\HIPRTSDK\tutorials\build\hiprt64.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp140d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140_1d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbased.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp140.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140_1.dll'. 
The thread 0x6140 has exited with code 0 (0x0).
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\nvcuda.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\advapi32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\sechost.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\rpcrt4.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\gdi32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\win32u.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\gdi32full.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp_win.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\user32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\imm32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_47917a79b8c7fd22\nvcuda64.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudart64_110.dll'. Module was built without symbols.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvrtc64_111_0.dll'. Module was built without symbols.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\dbghelp.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msasn1.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cryptnet.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\crypt32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\drvstore.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\devobj.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cfgmgr32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\wldp.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cryptbase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\nvapi64.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\setupapi.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\shell32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\shlwapi.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\kernel.appcore.dll'. 
Exception thrown at 0x0000000000000000 in 01_geom_intersection64D.exe: 0xC0000005: Access violation executing location 0x0000000000000000.

Can you help me identify the issue? Or is it simply the case that my GPU and/or CUDA version are not supported?
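For context on the exception above: "Access violation executing location 0x0000000000000000" means the program tried to execute code at address 0, which typically happens when a call goes through a null function pointer. One possible source, which this log does not confirm, is an API entry point that is resolved at run time and was never found. The sketch below shows a guard for that case; `CompileFn` and `callChecked` are hypothetical names, not HIPRT or Orochi API.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical function-pointer type standing in for an entry point that is
// resolved at run time from a dynamically loaded library.
using CompileFn = int (*)(const char* source);

// Calls fn only if it was resolved. Calling a null pointer directly would
// jump to address 0 and raise the access violation shown in the log.
int callChecked(CompileFn fn, const char* source, const std::string& name) {
    if (fn == nullptr) {
        throw std::runtime_error("entry point not resolved: " + name);
    }
    return fn(source);
}
```

A guard like this turns a crash at address 0 into an error that names the missing symbol, which makes reports such as this one easier to diagnose.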

PixelClear commented 2 years ago

Are you trying out the latest release V1.2.0? You can download it from https://gpuopen.com/hiprt/?utm_source=twitter&utm_medium=social&utm_campaign=hiprt.

If you still face this issue, please tell us and we will look into it.

nolmoonen commented 2 years ago

I got confused trying to confirm this. What is the intended way to build and run the tutorials?

  1. Clone this repository, and download and copy the SDK binaries into hiprt.
    • If I do this, I run into the error above.
    • However, I also get an error running the premake script, since it cannot find Orochi/contrib/bin. This is because the Orochi submodule is pinned to c5d3934, which lacks the Orochi/contrib/bin directory and its contents (required by the premake script).
    • If I manually update the Orochi submodule in contrib/Orochi to the latest main, then I get linker errors of the form: Error LNK2019 unresolved external symbol "enum orortcResult __cdecl orortcGetBitcode(struct _orortcProgram *,char *)" (?orortcGetBitcode@@YA?AW4orortcResult@@PEAU_orortcProgram@@PEAD@Z) referenced in function "public: static void __cdecl OrochiUtils::getData(int,char const *,char const *,class std::vector<char const *,class std::allocator<char const *> > *,class std::vector<char,class std::allocator<char> > &)" (?getData@OrochiUtils@@SAXHPEBD0PEAV?$vector@PEBDV?$allocator@PEBD@std@@@std@@AEAV?$vector@DV?$allocator@D@std@@@3@@Z), reported for project 00_context_creation in file E:\HIPRTSDK\tutorials\build\OrochiUtils.obj, line 1. So perhaps the latest main of Orochi is not the correct one?
  2. Or, download the contents via the link you provided, which also contains the tutorials and build and run from there.
    • This also gives me the error above. But I do not think this is intended, as it does not include the latest HIPRTSDK commits.

takahiroharada commented 2 years ago

Indeed, the Orochi version in this repo was outdated. I fixed this. We'll check your issue on an NVIDIA GPU.

nolmoonen commented 2 years ago

Thank you for checking the issue out.

From what I can tell, the Orochi version in this repository has not changed. If you can tell me what the correct version should be, I could try it myself.

nolmoonen commented 1 year ago

I tried with the latest 2.0.0 release, and the issue is not solved. The error now comes from hiprtBuildTraceKernelsFromBitcode instead. RTX 2070 with Windows and CUDA 11.1.

I noticed that the build scripts look for amdhip64.dll, but from what I can tell, this file is not installed by the AMDGPU driver if no AMD GPU is present in the system. Perhaps that could be the cause of the issue?

PixelClear commented 1 year ago

HIPRT version 2.0.3 is out, and this issue should be fixed in it. Can you give it a try?

nolmoonen commented 1 year ago

I've just tried version 2.0.3 and the issue is not resolved. I get the same error:

'01_geom_intersection64D.exe' (Win32): Loaded 'E:\HIPRTSDK\tutorials\dist\bin\Debug\01_geom_intersection64D.exe'. Symbols loaded.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\version.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcrt.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'E:\HIPRTSDK\tutorials\build\hiprt0200064.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp140d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ole32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\rpcrt4.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\combase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\gdi32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\win32u.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\gdi32full.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp_win.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\user32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140_1d.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbased.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msvcp140.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140_1.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\imm32.dll'. 
The thread 0x2840 has exited with code 0 (0x0).
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\nvcuda.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\advapi32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\sechost.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_675be35f1ba2315e\nvcuda64.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvrtc64_111_0.dll'. Module was built without symbols.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\dbghelp.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'E:\HIPRTSDK\tutorials\build\hiprtc0503.dll'. Module was built without symbols.
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\msasn1.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cryptnet.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\crypt32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\drvstore.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\devobj.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cfgmgr32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\wldp.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\cryptbase.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\nvapi64.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\setupapi.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\shell32.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\shlwapi.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Windows\System32\kernel.appcore.dll'. 
'01_geom_intersection64D.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvrtc-builtins64_111.dll'. Module was built without symbols.
Exception thrown at 0x0000000000000000 in 01_geom_intersection64D.exe: 0xC0000005: Access violation executing location 0x0000000000000000.

The program '[15996] 01_geom_intersection64D.exe' has exited with code 0 (0x0).

Is there anything else I can try or information I can provide?

Edit: I have also tried with the Linux binaries, and I am getting a (non-descriptive) crash on hiprtBuildTraceKernelsFromBitcode as well. CUDA 11.8 on Ubuntu 22.04, RTX 2070.

PixelClear commented 1 year ago

Sorry for the delayed reply. We are investigating this and will get back to you.

nolmoonen commented 1 year ago

Some additional details I found when trying to run on Linux (as mentioned above, it builds fine but crashes on hiprtBuildTraceKernelsFromBitcode).

Adding the line hiprtSetLogLevel(hiprtLogLevelInfo); gives the following output for 01_geom_intersection64D:

hiprt ver.02000
Executing on 'NVIDIA GeForce RTX 2070'
FastBuild::getGeometryBuildTempBufferSize
FastBuild::createGeometry
FastBuild::buildGeometry
Orortc error: 'NVRTC_ERROR unknown' [ 209 ] on line 314 in '../hiprt/impl/Compiler.cpp'

Perhaps that error message is of use.

Inspired by the other open issue, I've had a look at strace and one part stood out to me:

openat(AT_FDCWD, "/opt/rocm/hip/lib/libamdhip64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/rocm/hip/lib/libhiprtc.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

These files do not exist in my HIP installation (on the NVIDIA platform).

nolmoonen commented 1 year ago

After updating to CUDA 12.2 (from 11.1) the issue is resolved, both on Windows and Ubuntu. Perhaps including version requirements will help future users.
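A version check along the lines suggested above could be sketched as follows. It assumes the integer encoding CUDA uses for version numbers (1000 * major + 10 * minor, so 12020 for CUDA 12.2). The names `CudaVersion`, `decodeCudaVersion` and `atLeast` are hypothetical, and the minimum version is left as a parameter because this thread only establishes that 11.1 and 11.8 failed and 12.2 worked.

```cpp
// Hypothetical helper type: a CUDA version split into its two components.
struct CudaVersion {
    int majorVer;
    int minorVer;
};

// Decodes the integer form used by CUDA (1000 * major + 10 * minor),
// for example 12020 for CUDA 12.2 and 11010 for CUDA 11.1.
constexpr CudaVersion decodeCudaVersion(int encoded) {
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// Returns true when v is the same as or newer than the required minimum.
constexpr bool atLeast(CudaVersion v, CudaVersion minimum) {
    return v.majorVer > minimum.majorVer ||
           (v.majorVer == minimum.majorVer && v.minorVer >= minimum.minorVer);
}
```

The encoded value would come from a query such as cuDriverGetVersion. Failing early with a clear message when the check does not pass would be more helpful than the crash reported in this thread.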