ROCm / rocm-cmake

CMake modules used within the ROCm libraries
https://rocm.docs.amd.com/projects/ROCmCMakeBuildTools/en/latest/
MIT License
62 stars 43 forks source link

Fix use of LIBDIR #95

Closed Mystro256 closed 2 years ago

Mystro256 commented 2 years ago

This issue was found with the help of @cgmb , it looks like a lot of components have added this to their cmake files:

# Force library install path to lib (CentOS 7 defaults to lib64)
set(CMAKE_INSTALL_LIBDIR "lib" CACHE INTERNAL "Installation directory for libraries" FORCE)

This is not a good idea, as non-rocm builders, such as distros, will want to override the value of CMAKE_INSTALL_LIBDIR. While the use of GNUInstallDirs is a good idea in ROCMInstallTargets, as it will set sane defaults to the CMAKEINSTALL* variables, it also has an unintended side effect of trying to guess what to use for libdir.

We saw this issue in other components, so @frepaul had a good suggestion to set libdir prior to including GNUInstallDirs. This allows for us to have "lib" as a predicable default, but still have the advantage of overriding directories, as intended by GNUInstallDirs.

I've already implemented this logic in ROCr and OpenCL, but they haven't been merged into github yet.

cgmb commented 2 years ago

I approve in principle, but there's only one development branch for rocm-cmake. A significant change like this should wait until all the fixes for ROCm 5.2.0 are sorted out.

The rocSOLVER repo has had this awful line literally since the project was first created:

# Force library install path to lib (CentOS 7 defaults to lib64)
set(CMAKE_INSTALL_LIBDIR "lib" CACHE INTERNAL "Installation directory for libraries" FORCE)

With this change, we should be able to delete the same code in:

Mystro256 commented 2 years ago

Out of curiosity, is there a reason lib64 is avoided? I assume this is just the ROCm install requirement, right? I.e. because that's where it's documented/expected to be.

Just making sure there's no technical limitation or it's one of those assumed hard coded paths somewhere. The distros wanting to package these components will want to customise the libdir path, although it's good practice to avoid any hardcoded install paths period.

On May 7, 2022 4:19:00 p.m. EDT, Cory Bloor @.***> wrote:

I approve in principle, but there's only one development branch for rocm-cmake. A significant change like this should wait until all the fixes for ROCm 5.2.0 are sorted out.

The rocSOLVER repo has had this awful line literally since the project was first created:

# Force library install path to lib (CentOS 7 defaults to lib64)
set(CMAKE_INSTALL_LIBDIR "lib" CACHE INTERNAL "Installation directory for libraries" FORCE)

With this change, we should be able to delete the same code in:

-- Reply to this email directly or view it on GitHub: https://github.com/RadeonOpenCompute/rocm-cmake/pull/95#issuecomment-1120282125 You are receiving this because you authored the thread.

Message ID: @.***>

cgmb commented 2 years ago

Out of curiosity, is there a reason lib64 is avoided? I assume this is just the ROCm install requirement, right?

I've checked https://github.com/ROCmSoftwarePlatform/rocBLAS/pull/249 and https://github.com/ROCmSoftwarePlatform/rocFFT/pull/124. Unfortunately, there was no reason given. I'd assume it was just to have the same /opt/rocm layout across all distros. @pfultz2 may know more, as he was around for that change.

Mystro256 commented 2 years ago

Unfortunately, there was no reason given. I'd assume it was just to have the same /opt/rocm layout across all distros

Yes this is what I assumed too. If so, how I implemented this is fine as-is, if not, we should be wary of users/distros/community people setting a custom LIBDIR.

E.g. Fedora/RHEL/SUSE use CMAKE_INSTALL_LIBDIR="lib64", Debian/Ubuntu use CMAKE_INSTALL_LIBDIR="lib/$(uname -m)-linux-gnu".

pfultz2 commented 2 years ago

@pfultz2 may know more, as he was around for that change.

Since we dont have support for non-cmake users(like bazel), I think they are using hard-coded paths(such as -L/opt/rocm/lib -l<some-lib>), which switching to lib64 would break them. Once we provide pkg-config we could probably drop this.

pfultz2 commented 2 years ago

Yes this is what I assumed too. If so, how I implemented this is fine as-is, if not, we should be wary of users/distros/community people setting a custom LIBDIR.

E.g. Fedora/RHEL/SUSE use CMAKE_INSTALL_LIBDIR="lib64", Debian/Ubuntu use CMAKE_INSTALL_LIBDIR="lib/$(uname -m)-linux-gnu".

This is another good reason why we should just have a variable to switch layouts rather than allowing CMAKE_INSTALL_LIBDIR to be overidden by users since it could lead to even stranger layouts.

Mystro256 commented 2 years ago

Unfortunately, not allowing arbitrary dir values isn't very distro friendly. There's no one size fit all solution as there are at least 3 or 4 interpretations of the FHS out there for LIBDIR specifically, hence why gnuinstalldirs is designed to allow overriding.

I notice most cmake based projects deal with this by not assuming any paths and often using configure_file if the path is needed in source or a script.

Also, we don't strictly need to use gnuinstalldirs, as some projects use {LIB,BIN,INCLUDE}_INSTALL_DIR (I think). I just noticed that gnuinstalldirs is already in used but the FORCE lines sort of defeat the purpose.

On May 9, 2022 7:14:29 p.m. EDT, Paul Fultz II @.***> wrote:

Yes this is what I assumed too. If so, how I implemented this is fine as-is, if not, we should be wary of users/distros/community people setting a custom LIBDIR.

E.g. Fedora/RHEL/SUSE use CMAKE_INSTALL_LIBDIR="lib64", Debian/Ubuntu use CMAKE_INSTALL_LIBDIR="lib/$(uname -m)-linux-gnu".

This is another good reason why we should just have a variable to switch layouts rather than allowing CMAKE_INSTALL_LIBDIR to be overidden by users since it could lead to even stranger layouts.

-- Reply to this email directly or view it on GitHub: https://github.com/RadeonOpenCompute/rocm-cmake/pull/95#issuecomment-1121674370 You are receiving this because you authored the thread.

Message ID: @.***>

Mystro256 commented 2 years ago

Sorry I think I misunderstood your comment, email doesn't show the inline comments, so I didn't know what you meant. Edit: Nevermind my email client is silly :)

saadrahim commented 2 years ago

Please resolve conflicts and the consensus of rocm-cmake maintainers is to merge as is.

saadrahim commented 2 years ago

Please update the version number and add to the changelog comments please.

cgmb commented 2 years ago

This may need a version increment.

That is true. However, the change log already has a few entries that should be moved to the next version. I'll open a separate PR to fix them all at once. That will be a good place to discuss how to properly manage bumping the version around ROCm releases.

cgmb commented 2 years ago

After further discussion, I've bumped the version and fixed the changelog entries. rocm-cmake 0.7.3 is going to be the one that gets released in ROCm 5.2, but the PRs that were merged today will not be part of that. I've moved the changelog entries that were added today to rocm-cmake 0.8.0.

Mystro256 commented 2 years ago

Thanks, I was a bit too busy to follow up, glad it could be merged.

There was some discussion, and I think there's better approaches, but this is a good short-term solution to avoid the "hacks".

On May 26, 2022 9:35:13 p.m. EDT, Cory Bloor @.***> wrote:

Merged #95 into master.

-- Reply to this email directly or view it on GitHub: https://github.com/RadeonOpenCompute/rocm-cmake/pull/95#event-6689484791 You are receiving this because you were mentioned.

Message ID: @.***>

pramenku commented 2 years ago

Are we going to have all libs for rpm based inside "lib64" now?

This change will have library files inside "lib" path for Ubuntu but for SLES,RHEL,CentOS, it will go inside "lib64"

Are we doing this change using any feature ticket targeted to any rocm release as it's major change related to library path?

Sample. Before the change.

ls -lrt /opt/rocm-5.3.0-10193/rocfft/lib/
total 0
lrwxrwxrwx. 1 root root 22 May 27 23:42 librocfft.so -> ../../lib/librocfft.so
lrwxrwxrwx. 1 root root 31 May 27 23:42 librocfft-device-3.so -> ../../lib/librocfft-device-3.so
lrwxrwxrwx. 1 root root 31 May 27 23:42 librocfft-device-2.so -> ../../lib/librocfft-device-2.so
lrwxrwxrwx. 1 root root 31 May 27 23:42 librocfft-device-1.so -> ../../lib/librocfft-device-1.so
lrwxrwxrwx. 1 root root 31 May 27 23:42 librocfft-device-0.so -> ../../lib/librocfft-device-0.so
drwxr-xr-x. 2 root root 132 May 28 01:12 cmake

After the change.


ls -lrt /opt/rocm-5.3.0-10195/rocfft/lib64/
total 0
lrwxrwxrwx. 1 root root 24 May 31 01:04 librocfft.so -> ../../lib64/librocfft.so
lrwxrwxrwx. 1 root root 33 May 31 01:04 librocfft-device-3.so -> ../../lib64/librocfft-device-3.so
lrwxrwxrwx. 1 root root 33 May 31 01:04 librocfft-device-2.so -> ../../lib64/librocfft-device-2.so
lrwxrwxrwx. 1 root root 33 May 31 01:04 librocfft-device-1.so -> ../../lib64/librocfft-device-1.so
lrwxrwxrwx. 1 root root 33 May 31 01:04 librocfft-device-0.so -> ../../lib64/librocfft-device-0.so
drwxr-xr-x. 2 root root 132 May 31 02:33 cmake

Thanks.

pramenku commented 2 years ago

Also, hope we are communicating to all MLSE Dev too.

This change has broken hipfort and few more apps which always looks for /opt/rocm-xx/lib. Please refer https://ontrack-internal.amd.com/browse/SWDEV-339728

Not sure what we are communicating to Application owner but everyone should be informed 1st before we should have merged the change.

http://compute-artifactory.amd.com/artifactory/rocm-qa-test-logs/test-logs/ixt-hq-54_521_2022-06-01_01:23/mlpmo_file_521_48.log

We need get everyone aligned and get all changes together or may be 1 day delay.

Thanks.

cgmb commented 2 years ago

There was no behaviour change intended. It's merely a bug showing up in integration testing. I've opened #96 to correct the problem.

Mystro256 commented 2 years ago

Yes please merge #96 to resolve this integration bug.

On June 1, 2022 5:55:36 a.m. EDT, Cory Bloor @.***> wrote:

There was no behaviour change intended. It's merely a bug showing up in integration testing. I've opened #96 to correct the problem.

-- Reply to this email directly or view it on GitHub: https://github.com/RadeonOpenCompute/rocm-cmake/pull/95#issuecomment-1143387699 You are receiving this because you were mentioned.

Message ID: @.***>