ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

Move from hipcc to amdclang #1920

Closed ellosel closed 2 months ago

ellosel commented 3 months ago

The goal of this change set is to move the default compiler from hipcc to amdclang++. The most significant difference between hipcc and amdclang++ is the set of flags hipcc passes when it invokes the compiler it wraps. The table below delineates those differences by noting whether each flag is on by default (d), set manually (m), or unset (blank):

| Flag | hipcc | amdclang | Notes |
|---|---|---|---|
| -O3 | d | m | amdclang is O2 by default |
| -x hip | d | m | Failure to compile if unset |
| -D__HIP_HCC_COMPAT_MODE__=1 | d | m | DecisionTree test segfaults when unset |
| -mrelax-all | d | | |
| -mframe-pointer=all | d | | |
| -mframe-pointer=none | d | | |
| -Wno-format-nonliteral | d | | |
| -fallow-half-arguments-and-returns | d | | |
| -mllvm -amdgpu-early-inline-all=true | d | | |
| -mllvm -amdgpu-function-calls=false | d | | |
| --genco | m | | |
| --cuda-device-only | d | m | This is set for hipcc when genco is passed |

To reproduce one can do the following:

Configure the HostLibraryTests directory with one of the two compilers, using the same build directory (build/host) that the later steps refer to

# hipcc
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=hipcc -DTensile_CPU_THREADS=32 -DTensile_ROOT=`pwd`/Tensile -S HostLibraryTests -B build/host

# amdclang++
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -DTensile_CPU_THREADS=32 -DTensile_ROOT=`pwd`/Tensile -S HostLibraryTests -B build/host -DCMAKE_CXX_FLAGS="-D__HIP_HCC_COMPAT_MODE__=1"

Run the build

cmake --build build/host -j 32

Remove the DecisionTree test object file, rebuild verbosely, and find the compile command for that file in the build output

rm build/host/CMakeFiles/TensileTests.dir/DecisionTree_test.cpp.o
cmake --build build/host --verbose
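
Verbose build output can be long, so it may help to save it and filter for the one compile line of interest. The following is a minimal sketch, not part of the change set: the helper name find_compile_cmd and the log file name build.log are hypothetical, and it assumes that compile lines contain " -c " followed by the source path while link lines do not.

```shell
# Print the compile line(s) for one source file from a saved verbose build log.
# $1: path to the saved log, $2: source file name (e.g. DecisionTree_test.cpp).
# Assumption: compile lines contain " -c <dir>/<source>"; link lines have no " -c ".
find_compile_cmd() {
    grep -F -e " -c " "$1" | grep -F -e "/$2"
}

# Usage (build.log is a hypothetical capture of the verbose build):
#   cmake --build build/host --verbose > build.log 2>&1
#   find_compile_cmd build.log DecisionTree_test.cpp
```

The printed line is the command to rerun by hand in the next step.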

Rerun the compile command found in the verbose output, adding -v

cd build/host
/opt/rocm/bin/amdclang++ -v <args...> -MT CMakeFiles/TensileTests.dir/DecisionTree_test.cpp.o -MF CMakeFiles/TensileTests.dir/DecisionTree_test.cpp.o.d -o CMakeFiles/TensileTests.dir/DecisionTree_test.cpp.o -c /mnt/host/projects/tensile-wts/clang-migration/HostLibraryTests/DecisionTree_test.cpp

With -v, the compiler driver prints the detailed underlying compilation commands and their arguments. Comparing this output between hipcc and amdclang++ shows where the two compilations differ.
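
One way to do that comparison is to save the -v output from each compiler to a file and diff the sets of flags. The sketch below is an illustration under stated assumptions, not the method used in this change set: the function names extract_flags and flag_diff and the log file names are hypothetical, and it treats any whitespace-separated token that starts with "-" as a flag, so the value of a two-part flag such as -x hip is ignored.

```shell
# Print every distinct flag-like token (starting with '-') in a log, sorted.
# Double quotes are stripped because the driver quotes each argument in -v output.
extract_flags() {
    tr -s ' \t' '\n' < "$1" | tr -d '"' | grep -e '^-' | LC_ALL=C sort -u
}

# Show the flags that appear in only one of two saved -v logs.
# $1 and $2: paths to the two logs.
flag_diff() {
    fd_a=$(mktemp) && fd_b=$(mktemp) || return 1
    extract_flags "$1" > "$fd_a"
    extract_flags "$2" > "$fd_b"
    echo "only in $1:"
    LC_ALL=C comm -23 "$fd_a" "$fd_b"
    echo "only in $2:"
    LC_ALL=C comm -13 "$fd_a" "$fd_b"
    rm -f "$fd_a" "$fd_b"
}

# Usage (log names are hypothetical; -v output goes to stderr):
#   hipcc -v <args...> 2> hipcc.log
#   /opt/rocm/bin/amdclang++ -v <args...> 2> amdclang.log
#   flag_diff hipcc.log amdclang.log
```

Because the comparison is per token, a flag that differs only in its value, such as -O3 versus -O2, shows up once on each side.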