Open junliume opened 3 weeks ago
@illsilin our CI should have one stage with GPU_TARGETS of "gfx1100;gfx90a;gfx942" :)
GPU_TARGETS="gfx1100;gfx90a;gfx942"
How to reproduce:
CXX=/opt/rocm/bin/amdclang++ cmake -DCMAKE_PREFIX_PATH=/opt/rocm -DCMAKE_BUILD_TYPE=Release -DGPU_TARGETS="gfx1100;gfx90a;gfx942" ..
More problems are fixed in https://github.com/ROCm/composable_kernel/pull/1372/commits/4b81c7a5ae42985f3b07df6bfe3c5c9b8ddddb1a
Hence GPU_TARGETS MATCHES is very problematic: it only checks whether a match exists somewhere in the target list, and excluding other targets based on that match is very fragile.
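To illustrate the point, here is a minimal sketch (not the actual CK build logic) that contrasts a substring-style MATCHES check on the whole list with a per-entry check. The variable names XDLOPS_TARGET_REGEX, XDLOPS_TARGETS and OTHER_TARGETS are hypothetical, and the regex of xdlops-capable targets is an assumption for illustration only:

```cmake
# Sketch only; can be run standalone with `cmake -P <file>`.
cmake_minimum_required(VERSION 3.16)

# Example value taken from the reproduction command in this issue.
set(GPU_TARGETS "gfx1100;gfx90a;gfx942")

# ASSUMPTION: illustrative pattern for xdlops-capable targets,
# not copied from the CK CMake files.
set(XDLOPS_TARGET_REGEX "^gfx(908|90a|94[0-9])$")

# Whole-list check: the list is treated as one string, so this is
# true as soon as any entry contains "gfx9", even with gfx1100 present.
if(GPU_TARGETS MATCHES "gfx9")
  message(STATUS "MATCHES on the whole list: true (gfx1100 is still in the list)")
endif()

# Per-entry check: classify every target in the list individually.
set(XDLOPS_TARGETS "")
set(OTHER_TARGETS "")
foreach(target IN LISTS GPU_TARGETS)
  if(target MATCHES "${XDLOPS_TARGET_REGEX}")
    list(APPEND XDLOPS_TARGETS "${target}")
  else()
    list(APPEND OTHER_TARGETS "${target}")
  endif()
endforeach()

message(STATUS "xdlops-capable targets: ${XDLOPS_TARGETS}")
message(STATUS "other targets: ${OTHER_TARGETS}")

# A mixed list means xdlops-only sources cannot simply be enabled for every target.
if(XDLOPS_TARGETS AND OTHER_TARGETS)
  message(STATUS "Mixed target list: xdlops-only code needs a per-target guard")
endif()
```

With a per-entry classification like this, the build can at least detect a mixed list instead of silently enabling xdlops code for every target.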
FYI: additional issues are found when building client_example with multiple targets:
cd ${composable_kernel}/client_example/build
CXX=/opt/rocm/llvm/bin/clang++ cmake -DCMAKE_PREFIX_PATH="${composable_kernel}/install/;/opt/rocm/" -DCMAKE_BUILD_TYPE=release -DGPU_TARGETS="gfx1100;gfx90a" -DCMAKE_CXX_FLAGS=" -O3 " ..
make -j$(nproc)
This fails with the following error:
/data/driver/composable_kernel/install/include/ck/tensor_operation/gpu/block/blockwise_gemm_xdlops.hpp:171:57: note: expression evaluates to '256 == 128'
171 | static_assert(ThisThreadBlock::GetNumOfThread() == MWaves * NWaves * WaveSize,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1100.
This happens because the code above should not be compiled for gfx1100 targets.
Meanwhile,
https://github.com/ROCm/composable_kernel/blob/fix_1371/client_example/25_wrapper/CMakeLists.txt#L5-L10
is problematic because GPU_TARGETS can contain multiple targets, either by default or when set explicitly, e.g. GPU_TARGETS="gfx1100;gfx90a".
https://github.com/ROCm/composable_kernel/blob/a90bfa9857da5cc35a9c5dc1f068b538a1e64c9b/include/ck/utility/amd_smfmac.hpp#L19
in #1309: this instruction should be built only for gfx94 platforms.