lamikr / rocm_sdk_builder

004_01_roct-thunk-interface_shared fails to build during DEB/RPM package build #16

Closed paulmenzel closed 3 weeks ago

paulmenzel commented 1 month ago

Without dpkg or rpmbuild installed, the build errors out with:

[…]
[100%] Linking C shared library libhsakmt.so
[100%] Built target hsakmt
Run CPack packaging tool...
CPack: Create package using DEB
CPack: Install projects
CPack: - Run preinstall target for: hsakmt
CPack: - Install project: hsakmt []
CPack: -   Install component: devel
CPack: Create package
-- CPackDeb: Can not find dpkg in your path, default to i386.
CPack: - package: /scratch/local2/pmenzel/rocm_sdk_builder/builddir/004_01_roct-thunk-interface_shared/hsakmt-roct-dev_6.1.1.60101-local_.deb generated.
CPack: Create package using RPM
CPack: Install projects
CPack: - Run preinstall target for: hsakmt
CPack: - Install project: hsakmt []
CPack: -   Install component: devel
CPack: Create package
CMake Error at /usr/share/cmake-3.25/Modules/Internal/CPack/CPackRPM.cmake:822 (message):
  RPM package requires rpmbuild executable
Call Stack (most recent call first):
  /usr/share/cmake-3.25/Modules/Internal/CPack/CPackRPM.cmake:1968 (cpack_rpm_generate_package)

CPack Error: Error while execution CPackRPM.cmake
CPack Error: Problem compressing the directory
CPack Error: Error when generating package: hsakmt
make: *** [Makefile:71: package] Error 1
build failed: ROCT-Thunk-Interface_shared
  error in build cmd: make package

build failed

make package seems to be explicitly called:

https://github.com/lamikr/rocm_sdk_builder/blob/50e36ce7fcd9f14f0accb3b6907c80757ca09505/binfo/004_01_roct-thunk-interface_shared.binfo#L13-L15

I was able to build it manually with just make, but changing the binfo file from make package to make resulted in the same error. (I don't know whether some build scripts need to be regenerated.)
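As a rough illustration of the idea, a build step could fall back to a plain build when the packaging tools are missing. This is only a sketch: `have_tool` and `pick_make_target` are hypothetical helpers, not part of rocm_sdk_builder, and it assumes the CMake-generated Makefile provides the usual `all` and `package` targets.

```shell
#!/bin/sh
# Hypothetical helpers (not from rocm_sdk_builder): pick the make target
# depending on whether the tools CPack relies on are installed.

have_tool() {
    # Succeeds if the named executable is found in PATH.
    command -v "$1" >/dev/null 2>&1
}

pick_make_target() {
    # In the log above, the DEB generator only warns when dpkg is missing,
    # while the RPM generator aborts without rpmbuild. This sketch is
    # conservative and requires both before attempting "make package".
    if have_tool dpkg && have_tool rpmbuild; then
        echo "package"
    else
        echo "all"
    fi
}

echo "would run: make $(pick_make_target)"
```

On a machine without rpmbuild this prints `would run: make all`, which corresponds to the manual build that succeeded.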

flip111 commented 1 month ago

Which OS version are you running?

aidanharris commented 1 month ago

I see this too on Gentoo. I'm installing app-arch/rpm now to see if that fixes it, but should building the DEB and RPM packages be disabled somehow? I'm not sure how CMake triggers it, but it seems wrong to me.

aidanharris commented 1 month ago

This worked, but then the build failed on roctracer. This seems to be another GCC 14 issue:

[ 94%] Linking HIP executable MatrixTranspose_hipaact_test
cd /home/ahrs/repos/rocm_sdk_builder/builddir/012_roctracer/test && /usr/bin/cmake -E cmake_link_script CMakeFiles/MatrixTranspose_hipaact_test.dir/link.txt --verbose=1
/opt/rocm_sdk_611/bin/hipcc_cmake_linker_helper /opt/rocm_sdk_611/bin  -no-pie -Wl,--build-id=md5 CMakeFiles/MatrixTranspose_hipaact_test.dir/app/MatrixTranspose_hipaact_test_generated_MatrixTranspose_test.cpp.o -o MatrixTranspose_hipaact_test  -Wl,-rpath,/home/ahrs/repos/rocm_sdk_builder/builddir/012_roctracer:/opt/rocm_sdk_611/lib64: ../libroctracer64.so.4.1.0 ../libroctx64.so.4.1.0 -Wl,-rpath-link,/opt/rocm_sdk_611/lib64
/usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/MatrixTranspose_mgpu.dir/app/MatrixTranspose_mgpu_generated_MatrixTranspose_test.cpp.o: in function `main':
MatrixTranspose_test.cpp:(.text+0x360): undefined reference to `hipGetDevicePropertiesR0600'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [test/CMakeFiles/MatrixTranspose_mgpu.dir/build.make:92: test/MatrixTranspose_mgpu] Error 1
make[2]: Leaving directory '/home/ahrs/repos/rocm_sdk_builder/builddir/012_roctracer'
make[1]: *** [CMakeFiles/Makefile2:631: test/CMakeFiles/MatrixTranspose_mgpu.dir/all] Error 2
HIP_PATH=/opt/rocm_sdk_611
HIP_PLATFORM=amd
HIP_COMPILER=clang
HIP_RUNTIME=rocclr
ROCM_PATH=/opt/rocm_sdk_611
HIP_ROCCLR_HOME=/opt/rocm_sdk_611
HIP_CLANG_PATH=/opt/rocm_sdk_611/bin
HIP_INCLUDE_PATH=/opt/rocm_sdk_611/include
HIP_LIB_PATH=/opt/rocm_sdk_611/lib
DEVICE_LIB_PATH=/opt/rocm_sdk_611/amdgcn/bitcode
HIP_CLANG_RT_LIB=/opt/rocm_sdk_611/lib/clang/17/lib/linux
hipcc-args: -no-pie -Wl,--build-id=md5 CMakeFiles/MatrixTranspose_hipaact_test.dir/app/MatrixTranspose_hipaact_test_generated_MatrixTranspose_test.cpp.o -o MatrixTranspose_hipaact_test -Wl,-rpath,/home/ahrs/repos/rocm_sdk_builder/builddir/012_roctracer:/opt/rocm_sdk_611/lib64: ../libroctracer64.so.4.1.0 ../libroctx64.so.4.1.0 -Wl,-rpath-link,/opt/rocm_sdk_611/lib64
hipcc-cmd: "/opt/rocm_sdk_611/bin/clang" --driver-mode=g++ -O3 --hip-path="/opt/rocm_sdk_611" --hip-link --rtlib=compiler-rt -unwindlib=libgcc  -no-pie -Wl,--build-id=md5 CMakeFiles/MatrixTranspose_hipaact_test.dir/app/MatrixTranspose_hipaact_test_generated_MatrixTranspose_test.cpp.o -o "MatrixTranspose_hipaact_test" -Wl,-rpath,/home/ahrs/repos/rocm_sdk_builder/builddir/012_roctracer\:/opt/rocm_sdk_611/lib64\: ../libroctracer64.so.4.1.0 ../libroctx64.so.4.1.0 -Wl,-rpath-link,/opt/rocm_sdk_611/lib64
/usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/MatrixTranspose_hipaact_test.dir/app/MatrixTranspose_hipaact_test_generated_MatrixTranspose_test.cpp.o: in function `main':
MatrixTranspose_test.cpp:(.text+0x322): undefined reference to `hipGetDevicePropertiesR0600'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
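
One way to narrow down an undefined reference like the one above is to check whether the HIP runtime library in the SDK prefix exports the symbol at all. The following is only a diagnostic sketch: `exports_symbol` is a hypothetical helper, and the library path is an assumption based on the HIP_LIB_PATH value in the log (the library may live elsewhere, for example under lib64). It does not by itself confirm or rule out a GCC 14 cause.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a shared library defines a dynamic symbol.
exports_symbol() {
    # nm -D lists dynamic symbols; the name is the last field and may carry
    # a "@VERSION" suffix, so accept both the bare and the versioned form.
    nm -D --defined-only "$1" 2>/dev/null |
        awk -v sym="$2" '
            $NF == sym || index($NF, sym "@") == 1 { found = 1 }
            END { exit !found }
        '
}

# Assumed location of the HIP runtime; adjust to the actual install.
lib="${HIP_LIB:-/opt/rocm_sdk_611/lib/libamdhip64.so}"
sym="hipGetDevicePropertiesR0600"

if [ -f "$lib" ]; then
    if exports_symbol "$lib" "$sym"; then
        echo "$sym is exported by $lib"
    else
        echo "$sym is NOT exported by $lib"
    fi
fi
```

If the symbol is missing from the library, the headers used for the test program and the installed runtime are probably out of sync; if it is present, the link line itself is the next thing to inspect.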

lamikr commented 1 month ago

I need to check this. To be honest, it has been so long since I last touched this package that I do not remember why it calls "make package". Did you try changing that to "make install"?

Maybe the Debian packaging is initiated by CMake when it detects some tools that I do not have in my environment.
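
To see which package generators a build is configured for, one can look at the CPACK_GENERATOR entry in the CPackConfig.cmake file that CMake writes into the build directory. The sketch below is a minimal example: `list_cpack_generators` is a hypothetical helper, and the default file path is an assumption, so pass the real path as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: print the generators recorded in a CPackConfig.cmake
# file, one per line.
list_cpack_generators() {
    # Matches lines such as: set(CPACK_GENERATOR "DEB;RPM")
    sed -n 's/^[[:space:]]*[Ss][Ee][Tt](CPACK_GENERATOR "\(.*\)")[[:space:]]*$/\1/p' "$1" |
        tr ';' '\n'
}

# Assumed default location; run from the package's build directory or pass
# the path explicitly.
config="${1:-CPackConfig.cmake}"
if [ -f "$config" ]; then
    list_cpack_generators "$config"
fi
```

If both DEB and RPM are listed, running `cpack -G DEB` in the build directory limits packaging to a single generator instead of everything `make package` would run.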

paulmenzel commented 1 month ago

Replacing make package with make actually worked. I had missed that there are two packages with similar names, and ran into the same error with …_static (instead of …_shared).

  1. binfo/004_01_roct-thunk-interface_shared.binfo
  2. binfo/004_02_roct-thunk-interface_static.binfo

Can you make the change, or do you prefer merge/pull requests?
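
Since the same call appears in more than one file, a quick search helps to avoid missing one. A minimal sketch, assuming it is run from the top of a rocm_sdk_builder checkout; `find_make_package` is a hypothetical helper name and BINFO_DIR is an illustrative variable.

```shell
#!/bin/sh
# Hypothetical helper: list every .binfo file in a directory that still
# contains the string "make package".
find_make_package() {
    grep -l 'make package' "$1"/*.binfo 2>/dev/null
}

# Assumed directory name, taken from the paths listed above.
find_make_package "${BINFO_DIR:-binfo}" || true
```

Each file it prints can then be edited by hand to use plain make.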

daniandtheweb commented 4 weeks ago

Great find. I was already preparing a PR that adds support for Arch, so I made the change there in order to submit the PR without those dependencies.

lamikr commented 4 weeks ago

Thanks for the good find. The "make package" call has been there for a really long time because of my own testing, and I never noticed any problems. I have now applied the fix from daniandtheweb's pull request for this.

I assume it's OK to close this now, if somebody can verify?

lamikr commented 3 weeks ago

Closing, as I have not seen any reports of problems after changing "make package" to "make".

paulmenzel commented 3 weeks ago

Yes, fixed by the commits below:

  1. https://github.com/lamikr/rocm_sdk_builder/pull/27/commits/3460660d2a0faa5c57e70545147be8f2f6264ff3
  2. https://github.com/lamikr/rocm_sdk_builder/pull/27/commits/bd8e3ff668d2a6a584835076fcc942f2de37d9c6