ROCm / rocALUTION

Next generation library for iterative sparse solvers for ROCm platform
https://rocm.docs.amd.com/projects/rocALUTION/en/latest/
MIT License
74 stars 36 forks source link

Files tagged with HIP_SOURCE_PROPERTY_FORMAT generate wrong HIP_CLANG_PATH #174

Open tpkessler opened 1 year ago

tpkessler commented 1 year ago

Hi! I'm packaging ROCm for Arch Linux. With HIP 5.5.1 I encountered the issue that HIP_CLANG_PATH was pointing to /opt/rocm/llvm (instead of the correct /opt/rocm/llvm/bin) as the env var with the same name is overwritten by the cmake macro HIP_PREPARE_TARGET_COMMANDS declared in FindHIP.cmake. If HIP_SOURCE_PROPERTY_FORMAT is set to TRUE, cmake modifies the env vars for hipcc in an inconsistent way. Removing the tags in src/CMakeLists.txt resolves this issue for me.

--- rocALUTION-rocm-5.5.1/src/CMakeLists.txt.bak    2023-05-27 18:12:40.853740486 +0200
+++ rocALUTION-rocm-5.5.1/src/CMakeLists.txt    2023-05-27 18:12:49.834025607 +0200
@@ -115,11 +115,6 @@

 # Create rocALUTION hip library
 if(SUPPORT_HIP)
-  # Flag source file as a hip source file
-  foreach(i ${HIP_SOURCES})
-    set_source_files_properties(${i} PROPERTIES HIP_SOURCE_PROPERTY_FORMAT TRUE)
-  endforeach()
-
   # HIP flags workaround while target_compile_options do not work
   # list(APPEND HIP_HIPCC_FLAGS "-O3 -march=native -Wno-unused-command-line-argument -fPIC -std=c++14")
   list(APPEND HIP_HIPCC_FLAGS "-O3 -Wno-unused-command-line-argument -fPIC -std=c++14")

I couldn't find any documentation for HIP_SOURCE_PROPERTY_FORMAT, and it isn't used in similar libraries like rocBLAS or rocSOLVER. It's only mentioned in a comment on building with rocBLAS with CUDA

# ########################################################################
# NOTE:  CUDA compiling path
# ########################################################################
# I have tried compiling rocBLAS library source with multiple methods,
# and ended up using the approach where we set the CXX compiler to hipcc.
# I didn't like using the HIP_ADD_LIBRARY or CUDA_ADD_LIBRARY approaches,
# for the reasons I list here.
# 1.  Adding header include directories is through HIP_INCLUDE_DIRECTORIES(), which
# is global to a directory and affects all targets
# 2.  You must add HIP_SOURCE_PROPERTY_FORMAT OBJ properties to .cpp files
# to get HIP_ADD_LIBRARY to recognize the file
# 3.  HIP_ADD_LIBRARY invokes a call to add_custom_command() to compile files,
# and rocBLAS does the same.  The order in which custom commands execute is
# undefined, and sometimes a file is attempted to be compiled before it has
# been generated.  The fix for this is to create 'PHONY' targets, which I
# don't desire.
cgmb commented 1 year ago

Hi @tpkessler, I'm not an expert on the rocALUTION build system, but my understanding is that it uses FindHIP.cmake because when MPI is enabled, rocALUTION needs to build some files as C++ and other files as HIP. The HIP_SOURCE_PROPERTY_FORMAT property is used in FindHIP.cmake to distinguish the two. If you remove those lines, I think you can build everything with hipcc, but only if you don't enable MPI.

To me, it sounds like the real problem is whatever the code that is setting that incorrect path to clang. When looking through this history of FindHIP.cmake, I noticed a commit titled "Change to get correct clang path in findhip". I wonder if backporting that change would fix the path to clang? https://github.com/ROCm-Developer-Tools/HIP/commit/e15925e91350b23f7f9a2ef46073f6910219e0cc

These are just my initial thoughts before we properly debug the problem. Once we reproduce the issue and debug, it should be possible to answer with more confidence.

tpkessler commented 1 year ago

Hi Cory, thanks for looking into my issue. The linked line is a good starting point. I recompiled rocalution with the patched hip but I get the same error messages.

I think you can build everything with hipcc, but only if you don't enable MPI.

I could build rocalution with MPI and the patch without problems.

ntrost57 commented 1 year ago

Closing this issue. Thanks @cgmb

cgmb commented 1 year ago

@ntrost57, I don't think I've resolved their problem.

tpkessler commented 1 year ago

No, the problem still exists.

cgmb commented 1 year ago

It's merely a workaround, but I've added an option to build without FindHIP.cmake in 5bd014fa8c71ed743876eef5ae4171a1cfdc715c. It relies on CMake's HIP language support instead, and can be enabled with -DUSE_HIPCXX=ON.

Note that you will need to specify your build flags differently if you use that option (e.g., CXXFLAGS vs HIPFLAGS) and some of the default CMake options may differ as well (e.g., AMDGPU_TARGETS vs. CMAKE_HIP_ARCHITECTURES). They will require some adjustments to your build scripts and that is why it is an opt-in flag.

ppanchad-amd commented 3 weeks ago

@tpkessler Please check if your issue still exists with the latest ROCM 6.1.2? If not, please close the ticket. Thanks!