GPUOpen-LibrariesAndSDKs / Orochi

MIT License

Test64D failure on checking `hiprtcLinkCreate` API in `oroInitialize` #45

Closed: yasuo-ozu closed this issue 2 years ago

yasuo-ozu commented 2 years ago

Hello.

I tried to run the Test64D app on a Linux x64 system, but I got this error:

$ gdb dist/bin/Debug/Test64D
...
(gdb) run
...
Test64D: ../contrib/hipew/src/hipew.cpp:470: int hipewHipInit(): Assertion `hiprtcLinkCreate' failed.
Program received signal SIGABRT, Aborted.

(gdb) backtrace
#0  0x00007ffff7af234c in __pthread_kill_implementation () from /usr/lib/libc.so.6
#1  0x00007ffff7aa54b8 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff7a8f534 in abort () from /usr/lib/libc.so.6
#3  0x00007ffff7a8f45c in __assert_fail_base.cold () from /usr/lib/libc.so.6
#4  0x00007ffff7a9e116 in __assert_fail () from /usr/lib/libc.so.6
#5  0x0000555555562844 in hipewHipInit () at ../contrib/hipew/src/hipew.cpp:470
#6  0x00005555555629db in hipewInit (flags=1) at ../contrib/hipew/src/hipew.cpp:487
#7  0x000055555555655a in oroInitialize (api=ORO_API_HIP, flags=0) at ../Orochi/Orochi.cpp:105
#8  0x000055555555b138 in main (argc=1, argv=0x7fffffffe3e8) at ../Test/main.cpp:31

hipew.cpp:470 is this line: https://github.com/GPUOpen-LibrariesAndSDKs/Orochi/blob/ac7f0ab7f537d6724fd6e9228f0539391e730230/contrib/hipew/src/hipew.cpp#L470

It checks that the hiprtcLinkCreate API exists, so I looked for that symbol in my .so file:

$ readelf /opt/rocm/hip/lib/libamdhip64.so -s --wide | grep hiprtc
   413: 00000000002c7f40  1084 FUNC    GLOBAL DEFAULT   10 hiprtcAddNameExpression@@hip_4.2
   437: 00000000002c9250  4267 FUNC    GLOBAL DEFAULT   10 hiprtcCreateProgram@@hip_4.2
   520: 00000000002c89e0  1511 FUNC    GLOBAL DEFAULT   10 hiprtcCompileProgram@@hip_4.2
   547: 00000000002c86a0   832 FUNC    GLOBAL DEFAULT   10 hiprtcGetCode@@hip_4.2
   590: 00000000002c8380   792 FUNC    GLOBAL DEFAULT   10 hiprtcGetProgramLog@@hip_4.2
   599: 00000000002c6ae0   331 FUNC    GLOBAL DEFAULT   10 hiprtcGetErrorString@@hip_4.2
   601: 00000000002c76d0   610 FUNC    GLOBAL DEFAULT   10 hiprtcVersion@@hip_4.2
   603: 00000000002c8fd0   638 FUNC    GLOBAL DEFAULT   10 hiprtcGetLoweredName@@hip_4.2
   612: 00000000002c7940   758 FUNC    GLOBAL DEFAULT   10 hiprtcGetCodeSize@@hip_4.2
   616: 00000000002c7c40   758 FUNC    GLOBAL DEFAULT   10 hiprtcGetProgramLogSize@@hip_4.2
   690: 00000000002c6c30  2716 FUNC    GLOBAL DEFAULT   10 hiprtcDestroyProgram@@hip_4.2

No hiprtcLinkCreate symbol was found. Any ideas? Thanks in advance.
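
The same check can also be done at run time instead of with readelf. The following is a minimal sketch, assuming a POSIX system with dlopen/dlsym; the helper name probeSymbol and the ProbeResult enum are hypothetical and are not part of Orochi or hipew.

```cpp
#include <dlfcn.h>  // dlopen, dlsym, dlclose (POSIX)

// Hypothetical result type for this sketch.
enum class ProbeResult { Found, SymbolMissing, LibraryMissing };

// Hypothetical helper: reports whether `symbol` is exported by the shared
// library at `libPath`. Passing nullptr as libPath asks dlopen for the
// handle of the running program, which searches the libraries it has
// already loaded.
ProbeResult probeSymbol(const char* libPath, const char* symbol) {
    void* handle = dlopen(libPath, RTLD_LAZY | RTLD_LOCAL);
    if (handle == nullptr) {
        return ProbeResult::LibraryMissing;  // library could not be opened
    }
    // For function symbols, a null address means the lookup failed.
    const bool found = dlsym(handle, symbol) != nullptr;
    dlclose(handle);
    return found ? ProbeResult::Found : ProbeResult::SymbolMissing;
}
```

Going by the readelf listing above, a call such as probeSymbol("/opt/rocm/hip/lib/libamdhip64.so", "hiprtcLinkCreate") would be expected to report SymbolMissing on this installation, while hiprtcCreateProgram should be found. On older glibc versions the program may need to be linked with -ldl.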


jammm commented 2 years ago

Thanks for bringing this up. It looks like Orochi hits an assert in debug builds if it can't find a function. Sadly, we don't have the hiprtcLink* APIs available on Linux yet, but they should come eventually. I'll get back to you here if we find a way to bring those to Orochi.

In the meantime, you can try removing those asserts (remove the _CHECKED part), and that particular test should work fine.
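
To illustrate the difference between a checked and an unchecked lookup, here is a rough sketch in which a missing optional symbol is recorded instead of aborting. All names (Resolver, LoadReport, findSymbol) are hypothetical and are not taken from hipew.cpp; a real resolver would presumably wrap dlsym or GetProcAddress.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical resolver: maps a symbol name to its address, or nullptr if absent.
using Resolver = std::function<void*(const char*)>;

// Hypothetical summary of a loading pass.
struct LoadReport {
    bool ok = true;                    // becomes false if a required symbol is missing
    std::vector<std::string> missing;  // every symbol that could not be resolved
};

// Resolves one symbol. A missing symbol is always recorded, but only a
// missing *required* symbol marks the whole load as failed. Nothing asserts,
// so a debug build keeps running when an optional API is absent.
void* findSymbol(const Resolver& resolve, const char* name,
                 bool required, LoadReport& report) {
    void* address = resolve(name);
    if (address == nullptr) {
        report.missing.push_back(name);
        if (required) {
            report.ok = false;
        }
    }
    return address;
}
```

With this shape, the hiprtcLink* entries could be looked up with required set to false, leaving their function pointers null on platforms whose HIP runtime does not export them. Callers would then have to check the pointer before using it.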

yasuo-ozu commented 2 years ago

Thanks. I'm sending a PR to update the README so that Linux users know about this.

jammm commented 2 years ago

I think a better way could be to use an older commit of Orochi. Can you try this commit? https://github.com/GPUOpen-LibrariesAndSDKs/Orochi/tree/d78fb813e6ca19309313119052462ef8882c9908

takahiroharada commented 2 years ago

Yes, using an older version of Orochi is the right solution, as HIP on Linux is simply behind.