ROCm / rocm_smi_lib

ROCm SMI LIB
https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/

New rsmi_pkg_ver git tags missing from public repo prevent rocm-smi from finding librocm_smi64.so.7 #178

Closed lamikr closed 2 weeks ago

lamikr commented 3 weeks ago

Problem Description

It seems that rsmi_pkg_ver tags are not synced from AMD's internal git to https://github.com/ROCm/rocm_smi_lib

The public repo has some old tags, such as rsmi_pkg_ver-2.8.0, rsmi_pkg_ver-2.9.0 and rsmi_pkg_ver-3.0.0.

Because of this, the call get_package_version_number("7.0.0" ${PKG_VERSION_GIT_TAG_PREFIX} GIT) on line 41 of the rocm_smi_lib CMakeLists.txt ends up returning Package version: 2.8.0.37-local-build-0-b224d7b-dirty instead of 7.0.0. As a result, the libraries are built with the old 2.8 version numbers:

ls -la librocm_smi64.so*
lrwxrwxrwx 1 lamikr lamikr      18 Jun  5 01:50 librocm_smi64.so -> librocm_smi64.so.2*
lrwxrwxrwx 1 lamikr lamikr      20 Jun  5 01:50 librocm_smi64.so.2 -> librocm_smi64.so.2.8*
-rwxr-xr-x 1 lamikr lamikr 1236104 Jun  5 01:50 librocm_smi64.so.2.8

However, rsmiBindings.py searches for a hardcoded version number via path_librocm = os.path.dirname(os.path.realpath(__file__)) + '/../../lib/librocm_smi64.so.7', and therefore rocm-smi fails to load the library.
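To illustrate the lookup problem, here is a minimal Python sketch of a more tolerant search that does not depend on a single hardcoded soname. It is not the code in rsmiBindings.py; the function name find_librocm, its parameters, and the idea of also checking lib64 are assumptions made only for this example.

```python
import os


def find_librocm(base_dir, name="librocm_smi64.so", libdirs=("lib", "lib64")):
    """Return the first matching library path under base_dir, or None.

    Hypothetical helper: accepts the unversioned name or any versioned
    variant (for example librocm_smi64.so.2.8 or librocm_smi64.so.7).
    """
    for libdir in libdirs:
        directory = os.path.join(base_dir, libdir)
        if not os.path.isdir(directory):
            continue
        # Sorting puts the unversioned name first when it exists.
        candidates = sorted(
            entry for entry in os.listdir(directory)
            if entry == name or entry.startswith(name + ".")
        )
        if candidates:
            return os.path.join(directory, candidates[0])
    return None
```

The returned path could then be passed to ctypes.CDLL. A search like this would hide a wrong version rather than fix it, so it is no substitute for building the library with the correct tags.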

Operating System

Ubuntu 22.04

CPU

ryzen 7 5700

GPU

AMD Radeon RX 7900 XT

ROCm Version

ROCm 6.1.0

ROCm Component

rocm_smi_lib

Steps to Reproduce

git clone https://github.com/ROCm/rocm_smi_lib.git
cd rocm_smi_lib
mkdir build
cd build
cmake ..

Instead of the expected 7.0.0, it prints a 2.8.0-based version like the following:

Package version: 2.8.0.37-local-build-0-2fd36e3

If I create the tag manually in the cloned git repo, it returns the expected 7.0.0:

git tag rsmi_pkg_ver-7.0.0
cmake ..

Then it prints the correct version: Package version: 7.0.0.325-local-build-0-2fd36e3
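The effect of the missing tag can be sketched in Python. This assumes the version helper takes its number from a git describe --long --dirty style string for the nearest tag with the given prefix, and falls back to the default passed as the first argument when no such tag exists. That is an assumption based on the output above, not a verified reading of the CMake code, and the name version_from_describe is hypothetical.

```python
import re


def version_from_describe(describe, prefix, default):
    """Extract the version from '<prefix>-<version>-<n>-g<sha>[-dirty]'.

    Hypothetical helper: returns `default` when the string does not
    match a tag with the expected prefix.
    """
    pattern = re.compile(
        r"^" + re.escape(prefix) + r"-(\d+(?:\.\d+)*)-(\d+)-g([0-9a-f]+)(-dirty)?$"
    )
    match = pattern.match(describe.strip())
    return match.group(1) if match else default


# Only old tags are present, so the old version wins over the default.
old = version_from_describe(
    "rsmi_pkg_ver-2.8.0-37-gb224d7b-dirty", "rsmi_pkg_ver", "7.0.0"
)

# Once a 7.0.0 tag exists, the expected version is picked up.
new = version_from_describe(
    "rsmi_pkg_ver-7.0.0-0-g2fd36e3", "rsmi_pkg_ver", "7.0.0"
)
```

Under these assumptions, the default 7.0.0 only applies when no matching tag is found at all, which would explain why the stale 2.8.0 tag takes precedence.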

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

lamikr commented 3 weeks ago

Probably the following tags would also need to be synced to the public GitHub repo:

oam_so_ver rsmi_so_ver

They are used in oam/CMakeLists.txt and rocm_smi/CMakeLists.txt, and without them the library version defaults to 1.0.

dmitrii-galantsev commented 3 weeks ago

that's absurd. Fixing in a sec.

dmitrii-galantsev commented 3 weeks ago

@lamikr I just pushed out these:

Also added these to amdsmi:

Could you please check if this fixes it?

Props to you for digging through the version scripts! It's a headache.

dmitrii-galantsev commented 3 weeks ago

need the oam tags still? I don't have those myself.

lamikr commented 3 weeks ago

Thanks @dmitrii-galantsev, the tags fixed the problem. librocm_smi64.so is now installed with the correct soname as librocm_smi64.so.7.0 and librocm_smi64.so.7, and the rocm-smi Python code can now find it. (Later I would like to add another patch to also search the lib64 folder, but that's another story...)

As there are no tags for liboam, it gets installed as liboam.so.1.0, but I think that's ok, as at least I could not find any ROCm code that would try to look for liboam.so.7.

Just as a reference, this is the make install output after the tags are in.

-- Install configuration: "Release"
-- Installing: /opt/rocm_sdk_611/lib64/librocm_smi64.so.7.0
-- Installing: /opt/rocm_sdk_611/lib64/librocm_smi64.so.7
-- Installing: /opt/rocm_sdk_611/lib64/librocm_smi64.so
-- Up-to-date: /opt/rocm_sdk_611/lib64/librocm_smi64.so.7.0
-- Up-to-date: /opt/rocm_sdk_611/lib64/librocm_smi64.so.7
-- Up-to-date: /opt/rocm_sdk_611/lib64/librocm_smi64.so
-- Up-to-date: /opt/rocm_sdk_611/include/rocm_smi/rocm_smi.h
-- Up-to-date: /opt/rocm_sdk_611/include/rocm_smi/rocm_smi64Config.h
-- Up-to-date: /opt/rocm_sdk_611/include/rocm_smi/kfd_ioctl.h
-- Up-to-date: /opt/rocm_sdk_611/libexec/rocm_smi/rsmiBindingsInit.py
-- Up-to-date: /opt/rocm_sdk_611/libexec/rocm_smi/rsmiBindings.py
-- Up-to-date: /opt/rocm_sdk_611/libexec/rocm_smi/rocm_smi.py
-- Up-to-date: /opt/rocm_sdk_611/bin/rocm-smi
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so.1.0
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so.1
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so.1.0
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so.1
-- Up-to-date: /opt/rocm_sdk_611/lib64/liboam.so
-- Up-to-date: /opt/rocm_sdk_611/include/oam/oam_mapi.h
-- Up-to-date: /opt/rocm_sdk_611/include/oam/amd_oam.h
-- Up-to-date: /opt/rocm_sdk_611/lib64/cmake/rocm_smi/rocm_smi-config.cmake
-- Up-to-date: /opt/rocm_sdk_611/lib64/cmake/rocm_smi/rocm_smi-config-version.cmake
-- Up-to-date: /opt/rocm_sdk_611/lib64/cmake/rocm_smi/rocm_smiTargets.cmake
-- Up-to-date: /opt/rocm_sdk_611/lib64/cmake/rocm_smi/rocm_smiTargets-release.cmake
-- Up-to-date: /opt/rocm_sdk_611/share/doc/rocm_smi/LICENSE.txt

dmitrii-galantsev commented 2 weeks ago

also search the lib64 folder

Can it not find it in lib64? If you set CMAKE_INSTALL_LIBDIR - it should all work. https://github.com/ROCm/rocm_smi_lib/blob/2fd36e33adaa29307c35d24454f322605fe329e7/CMakeLists.txt#L18

It's later used here: https://github.com/ROCm/rocm_smi_lib/blob/2fd36e33adaa29307c35d24454f322605fe329e7/python_smi_tools/rsmiBindings.py.in#L29

Anyway - closing the issue. Thanks!