Closed Madouura closed 1 year ago
@Madouura as you are using a custom build process, I am trying to diagnose potential issues. Am I correct in understanding that you use gfx803 arbitrarily, just to speed up the build, and don't actually run on a gfx803? gfx803 is not really supported, it is only a legacy arch, so I would suggest using gfx1030 as the test GPU arch; from your graphics cards it appears you could build and test with the clients using that arch if you desired. You are missing some flags, so it is always best to use install.sh for any new release version and match its cmake flags: -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LAZY_LIBRARY_LOADING=ON -DTensile_LIBRARY_FORMAT=msgpack, some of which were missing from your invocation. Then add any specific add-ons you find you need for your configuration, such as -DTENSILE_VENV_UPGRADE_PIP=ON.
Building your style, I do see ./Tensile/library/TensileLibrary_lazy_gfx803.dat and Kernels.so-000-gfx803.hsaco, but I would stop using gfx803. I suggest you try gfx1030 and test the build; otherwise, attach a build log if you use all the cmake arguments shown above and still find the build does not work.
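As a rough sketch, the flags listed above could be collected in one place so that none gets dropped between runs. The helper name `tensile_flags` and the `..` source directory are assumptions for illustration, and the command is only echoed, not executed:

```shell
# Print the Tensile-related cmake flags quoted above, one per line.
# tensile_flags is a hypothetical helper name, not part of rocBLAS.
tensile_flags() {
    printf '%s\n' \
        -DTensile_LOGIC=asm_full \
        -DTensile_CODE_OBJECT_VERSION=V3 \
        -DTensile_SEPARATE_ARCHITECTURES=ON \
        -DTensile_LAZY_LIBRARY_LOADING=ON \
        -DTensile_LIBRARY_FORMAT=msgpack
}

# Dry run: show the configure command instead of running it.
# ".." assumes the command is issued from a build/ subdirectory.
echo cmake $(tensile_flags) ..
```

Dropping the `echo` would run the configure step for real; any site-specific options would still need to be appended.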
Yes, the gfx803 is just to speed up. Before that I was doing it with gfx1030 (my actual GPU) for speedy results, and before that I was doing the default "all". With "all", I found issues with gfx90a (xnack on and off) giving out errors with DGEMM_something_ALDEBERAN_something, but I don't use that GPU and didn't find it relevant enough to make an issue about yet. The errors specifically IIRC were about gfx90a+ not working with gfx900. I'll rerun it if you find it relevant. I did those flags at first also with no observable difference from now, but here's a rerun: (7294a708d91a68e6179d6d1bc74b254686ee8030 5.3.0 due to the rest of the stack not being 5.3.1 on my system yet)
Also reran on the develop branch as a sanity check, same issue.
Do you have -DROCM_PATH=/nix/store/h31x77g46c1sp8fzq2dm61m85h5npj3l-hip-5.3.0 set?
Not sure if cmake is messed up, but use -DROCM_PATH and even export ROCM_PATH=/nix/store/h31x77g46c1sp8fzq2dm61m85h5npj3l-hip-5.3.0 if that is where your full ROCm is installed. Why is it called hip-5.3.0 and not rocm?
This should not be seen: "warning: ISA: (10, 3, 0) is not supported; overriding with (9, 0, 0)". I will verify the Python 3.10.7 support. Do you have your ROCm install path + /bin, path + /llvm/bin, and path + /hip/bin in your PATH variable? Is any other hipcc or clang installed earlier in your PATH? I would try install.sh again if it is easy to remove the lines that don't work on your OS; otherwise, follow the suggestions above. You may need VERBOSE=1 and to capture the large log file, but that should show why it doesn't generate code objects. I will ask someone on the Tensile side to advise further on what might cause the failure to generate code objects. You can just do the post-config build with make instead of the second cmake, in case cmake is somehow involved.
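A quick way to answer the PATH questions above is to ask the shell which hipcc and clang it would resolve first. This is a minimal sketch; the helper name `first_in_path` is made up for illustration:

```shell
# Report which binary the shell would pick up first from PATH.
# first_in_path is a hypothetical helper name.
first_in_path() {
    # command -v prints the resolved location, or nothing if the tool is absent
    command -v "$1" 2>/dev/null || echo "(not found)"
}

for tool in hipcc clang; do
    printf '%-6s -> %s\n' "$tool" "$(first_in_path "$tool")"
done
```

If either tool resolves to something outside the intended ROCm toolchain, that entry is coming earlier in PATH than the ROCm one.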
Tried just with make, same result.
Tried with -DROCM_PATH=/nix/store/h31x77g46c1sp8fzq2dm61m85h5npj3l-hip-5.3.0 and the export, same result.
Tried with copying every known rocBLAS dependency's package root to a folder, adding that to PATH as well as the cmake option, new result with new errors.
This error is due to the package's fix for a cmake issue, see: https://github.com/ROCm-Developer-Tools/hipamd/issues/55
It worked?! Finally!
Now we're at the errors I had while disabling Tensile.
If I figure anything out regarding that I'll post it here, or if it's not a nix compatibility issue, if you like, in a new issue.
I suspect it may be an issue that can be "fixed" in a similar manner as this: default.nix
Why is it called hip-5.3.0 and not rocm?
In nix, individual packages are created as derivations under (usually) /nix/store; the hip package does not contain all the other ROCm packages but may link to them with symlinks.
I forgot to switch back to the release revision. Same thing, but here's the log
The issue seems to be in library/src/include/utility.hpp -> ${hip}/include/hip/hip_runtime.h -> possibly ${hip}/include/hip/amd_detail/amd_hip_runtime.h.
Tried passing __HIP_PLATFORM_AMD__ and -D__HIP_PLATFORM_AMD__ to CXXFLAGS, no effect.
export ROCM_PATH="/nix/store/j22ngkly46a8ny0cmsdsd55iv0fq4s42-rocm-cmake-5.3.0:/nix/store/cyldr9i1kk0n0qjwf2zz15cw2vb1s8c5-rocm-runtime-5.3.0:/nix/store/kn3j6izgl1sjzlmslhp655h7md92cmhr-rocm-device-libs-5.3.0:/nix/store/fnw5z5qdcpywx4cjx4cj2wxn8bwr36xy-rocm-comgr-5.3.0:/nix/store/rizslsfck4q3lg1kx28mrnjpbh9lfvmi-rocminfo-5.3.0:/nix/store/h31x77g46c1sp8fzq2dm61m85h5npj3l-hip-5.3.0:/nix/store/7hwb5wpi1ig027yglfavm068ph6ijdlc-rocm-llvm-5.3.0"
The above allowed me to get to the same point without having to do the copy shenanigans.
After further testing, I only need to add /nix/store/7hwb5wpi1ig027yglfavm068ph6ijdlc-rocm-llvm-5.3.0 to -DROCM_PATH and add ${llvm}/bin, ${llvm}/hip/bin, and ${llvm}/llvm/bin to the beginning of the PATH.
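Spelled out as shell, the working setup described above might look like the following sketch. The store path is the one quoted in this comment and will differ per machine, the `..` source directory is an assumption, and the cmake command is only echoed:

```shell
# Store path copied from the comment above; it is specific to that machine.
llvm=/nix/store/7hwb5wpi1ig027yglfavm068ph6ijdlc-rocm-llvm-5.3.0

# Put the ROCm LLVM tool directories ahead of everything else on PATH.
export PATH="$llvm/bin:$llvm/hip/bin:$llvm/llvm/bin:$PATH"

# Dry run of the configure step; ".." assumes a build/ subdirectory.
echo cmake -DROCM_PATH="$llvm" ..
```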
Good news on the Aldebaran front (custom kernels, apparently), those are now generated.
Okay, yes, you are past all Tensile issues; I will look into the cause of:
In file included from /home/mado/Downloads/rocBLAS/library/src/blas_ex/rocblas_nrm2_ex.hpp:25:
/home/mado/Downloads/rocBLAS/library/src/blas_ex/../blas1/rocblas_nrm2.hpp:30:17: error: reference to host function 'norm
Here's the relevant BUILD_VERBOSE output:
-- rocfft_VERSION:
-- ==>CMAKE_BUILD_TYPE: Release
-- ==>BUILD_SHARED_LIBS: ON
-- ==>ROCM_PATH link: /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1
-- ==>CMAKE_INSTALL_PREFIX link: /nix/store/apbgkvlpgkpn6r96cnw903szbdj325xk-rocblas-2.45.0-5.3.1
-- ==>CMAKE_MODULE_PATH link: /build/source/cmake/nix/store/ld1rigs4zmccf2vhxb9kd1jc54wjk40y-rocm-cmake-5.3.1/share/rocm/cmake/build/source/tensile/lib/python3.10/site-packages/Tensile/Source/cmake//build/source/tensile/lib/python3.10/site-packages/Tensile/Source//build/source/cmake
-- ==>CMAKE_PREFIX_PATH link: /llvm/hip/var/empty/rocm/llvm/var/empty/rocm/var/empty/rocm/hip/build/source/tensile
-- ==>CPACK_PACKAGING_INSTALL_PREFIX link:
-- ==============
-- ==>CMAKE_CXX_COMPILER: -D__HIP_HCC_COMPAT_MODE__=1
-- ==>CMAKE_CXX_COMPILER debug: -g
-- ==>CMAKE_CXX_COMPILER release: -O3 -DNDEBUG
-- ==>CMAKE_CXX_COMPILER relwithdebinfo: -O2 -g -DNDEBUG
-- ==>CMAKE_EXE_LINKER_FLAGS:
-- ==>CMAKE_EXE_LINKER_FLAGS_RELEASE:
-- ==>CMAKE_SHARED_LINKER_FLAGS:
-- ==>CMAKE_SHARED_LINKER_FLAGS_RELEASE:
-- ==============
-- ==>CMAKE_SHARED_LIBRARY_C_FLAGS:
-- ==>CMAKE_SHARED_LIBRARY_CXX_FLAGS: -fPIC
-- ==>CMAKE_SHARED_LINKER_FLAGS:
-- ==>CMAKE_SHARED_LINKER_FLAGS_DEBUG:
-- ==>CMAKE_SHARED_LINKER_FLAGS_RELEASE:
I did try adding hip's include directory as the first entry to: https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/9882cea588902b75a6ce2cf2fdf58891ca6246b3/library/src/CMakeLists.txt#L416-L425 No dice, unfortunately. There are a lot of other things concerning cmake I've tried too, but I can't seem to get rid of this toolchain mixing. I did also make sure that llvm and clang were included, along with other things. I'll see what I can do with includes in the source.
Also tried this and making sure the HIP path was passed to the compiler with -I:
substituteInPlace library/src/include/utility.hpp \
  --replace "<hip/hip_runtime.h>" "\"${hip}/include/hip/hip_runtime.h\""
substituteInPlace library/src/include/handle.hpp \
  --replace "<hip/hip_runtime.h>" "\"${hip}/include/hip/hip_runtime.h\""
substituteInPlace library/include/internal/rocblas_bfloat16.h \
  --replace "<hip/hip_runtime.h>" "\"${hip}/include/hip/hip_runtime.h\""
substituteInPlace library/include/internal/rocblas-complex-types.h \
  --replace "<hip/hip_complex.h>" "\"${hip}/include/hip/hip_complex.h\"" \
  --replace "<hip/hip_runtime.h>" "\"${hip}/include/hip/hip_runtime.h\""
substituteInPlace library/src/blas1/rocblas_reduction.hpp \
  --replace "<hip/hip_runtime.h>" "\"${hip}/include/hip/hip_runtime.h\""
Interestingly, the internal includes get replaced with the original during the build phase, but even after editing those I get the same error.
It is not clear to me yet why hipcc isn't behaving the same in your case. Try export HIPCC_VERBOSE=7 and add VERBOSE=1 to the make command.
The include search path section of the output would be of interest, to see where the path /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/ comes in the search ordering. I think clang/x.y.z/include/cuda_wrappers has to come before your c++/11.3.0 so that there are host and device forms of the complex norm. This will produce too much information to paste, but it should indicate the include search paths and hint at the problem.
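Since the verbose log is too large to paste, the search-list sections can be cut out of it. A minimal sketch, assuming the build output was saved to a file named rocblas.log (for example with `HIPCC_VERBOSE=7 make VERBOSE=1 2>&1 | tee rocblas.log`); the helper name `include_search_lists` is hypothetical:

```shell
# Print every '#include <...>' search list found in a verbose build log.
# include_search_lists is a hypothetical helper name.
include_search_lists() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p' "$1"
}

# Only run against the log if it exists in the current directory.
if [ -f rocblas.log ]; then
    include_search_lists rocblas.log
fi
```

Each printed block runs from the "search starts here" marker to "End of search list.", so the position of the c++/11.3.0 entry relative to cuda_wrappers can be read off directly.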
I've refactored the code to hopefully avoid ambiguity for this template, but it won't be of use until some future release. You can find it in develop branch commit 44b99c6df26002139ca9ec68ee1fc8899c7b001f. You could apply it as a patch, as a test only, to see if this avoids the problem.
Unmodified with VERBOSE variables: rocblas.log
Sections of interest:
[ 51%] Building CXX object library/src/CMakeFiles/rocblas.dir/blas2/rocblas_ger_strided_batched.cpp.o
clang version 15.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/bin
Found candidate GCC installation: /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib/gcc/x86_64-unknown-linux-gnu/11.3.0
Found candidate GCC installation: /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0
Selected GCC installation: /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found HIP installation: /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1, version 5.3.22062
"/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/bin/clang-15" -cc1 -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-linux-gnu -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name rocblas_ger_kernels.cpp -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fno-rounding-math -mconstructor-aliases -aux-target-cpu x86-64 -aux-target-feature +f16c -fcuda-is-device -mllvm -amdgpu-internalize-symbols -fcuda-allow-variadic-functions -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/hip.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/ocml.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/ockl.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_daz_opt_off.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_unsafe_math_off.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_finite_only_off.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_wavefrontsize64_off.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_isa_version_1030.bc -mlink-builtin-bitcode /nix/store/vcgmg54a72rmz1kz46dy890bagkn1b7y-rocm-device-libs-5.3.1/amdgcn/bitcode/oclc_abi_version_400.bc -target-cpu gfx1030 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -resource-dir /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root 
-dependency-file CMakeFiles/rocblas.dir/blas2/rocblas_ger_kernels.cpp.o.d -MT library/src/CMakeFiles/rocblas.dir/blas2/rocblas_ger_kernels.cpp.o -sys-header-deps -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers -idirafter /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -include __clang_hip_runtime_wrapper.h -isystem /nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/lib/clang/15.0.0/include/.. -isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -isystem /nix/store/p33hpnqy7qqbd6lrfyysgfi2m3a7k68i-hip-5.3.1/include -idirafter /nix/store/ybkyabc23chdfy48n3h1zqwa57vp38wd-glibc-2.35-163-dev/include -isystem /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -isystem /nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/include -isystem /nix/store/1l5gp04v4jvc7wr8dyv12jy8mszma9w0-ncurses-6.3-p20220507-dev/include -isystem /nix/store/smfnf8ayl3473bqlhwizl9r18rphydjp-zlib-1.2.12-dev/include -isystem /nix/store/b5rhlkj0b25yp21gm7l5karjmxh6mqhk-rocm-comgr-5.3.1/include -isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/ydj66x49m6f4ikaw4b2hlkdfxd5vdf6s-rocm-thunk-5.3.1/include -isystem /nix/store/fkcl1wzq3106qqgl84bhgk1lp56q6bzg-python3-3.10.7/include -isystem /nix/store/5bj9nfnm8ql21zcsfca48z5rglhpbd4p-msgpack-3.3.0/include -isystem /nix/store/zfs5p0dxvqkyws7kvbl82b3lym44cg9i-libxml2-2.10.2-dev/include -isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0 -isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu -D BUILD_WITH_TENSILE=1 -D ROCBLAS_INTERNAL_API -D ROCM_USE_FLOAT16 -D TENSILE_DEFAULT_SERIALIZATION -D TENSILE_MSGPACK=1 -D TENSILE_USE_HIP -D 
__HIP_PLATFORM_AMD__=1 -D __HIP_PLATFORM_HCC__=1 -D rocblas_EXPORTS -I /build/source/library/include -I /build/source/library/include/internal -I /build/source/library/src/include -I /build/source/build/include/rocblas/internal -I /build/source/build/include/rocblas -I /build/source/build/include -I /build/source/library/src/blas3/Tensile -I /build/source/tensile/lib/python3.10/site-packages/Tensile/Source/lib/include -D __HIP_HCC_COMPAT_MODE__=1 -D NDEBUG -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0 -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/x86_64-unknown-linux-gnu -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0 -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/x86_64-unknown-linux-gnu -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include -internal-isystem /usr/local/include -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include -internal-isystem /usr/local/include -internal-isystem 
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -Wno-unused-result -Werror=vla -std=c++17 -fdeprecated-macro -fno-autolink -fdebug-compilation-dir=/build/source/build/library/src -ferror-limit 19 -fvisibility hidden -fvisibility-inlines-hidden -fhip-new-launch-api -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -cuid=7feb3acf29116291 -fcuda-allow-variadic-functions -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /build/rocblas_ger_kernels-de023d/rocblas_ger_kernels-gfx1030.o -x hip /build/source/library/src/blas2/rocblas_ger_kernels.cpp
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/usr/include"
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/usr/include"
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/usr/include"
ignoring duplicate directory "/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include"
ignoring duplicate directory "/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include"
ignoring duplicate directory "/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward"
ignoring duplicate directory "/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include"
ignoring duplicate directory "/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include"
ignoring duplicate directory "/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include"
#include "..." search starts here:
#include <...> search starts here:
/build/source/library/include
/build/source/library/include/internal
/build/source/library/src/include
/build/source/build/include/rocblas/internal
/build/source/build/include/rocblas
/build/source/build/include
/build/source/library/src/blas3/Tensile
/build/source/tensile/lib/python3.10/site-packages/Tensile/Source/lib/include
/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/lib/clang/15.0.0/include/..
/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include
/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include
/nix/store/p33hpnqy7qqbd6lrfyysgfi2m3a7k68i-hip-5.3.1/include
/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/include
/nix/store/1l5gp04v4jvc7wr8dyv12jy8mszma9w0-ncurses-6.3-p20220507-dev/include
/nix/store/smfnf8ayl3473bqlhwizl9r18rphydjp-zlib-1.2.12-dev/include
/nix/store/b5rhlkj0b25yp21gm7l5karjmxh6mqhk-rocm-comgr-5.3.1/include
/nix/store/ydj66x49m6f4ikaw4b2hlkdfxd5vdf6s-rocm-thunk-5.3.1/include
/nix/store/fkcl1wzq3106qqgl84bhgk1lp56q6bzg-python3-3.10.7/include
/nix/store/5bj9nfnm8ql21zcsfca48z5rglhpbd4p-msgpack-3.3.0/include
/nix/store/zfs5p0dxvqkyws7kvbl82b3lym44cg9i-libxml2-2.10.2-dev/include
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu
/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward
/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include
/nix/store/ybkyabc23chdfy48n3h1zqwa57vp38wd-glibc-2.35-163-dev/include
End of search list.
[ 53%] Building CXX object library/src/CMakeFiles/rocblas.dir/blas2/rocblas_hemv_batched.cpp.o
"/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/bin/lld" -flavor gnu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx1030 -plugin-opt=O3 -plugin-opt=no-opaque-pointers -plugin-opt=-amdgpu-early-inline-all=true -plugin-opt=-amdgpu-function-calls=false -o /build/rocblas_ger_kernels-81d002/rocblas_ger_kernels-gfx1030.out /build/rocblas_ger_kernels-de023d/rocblas_ger_kernels-gfx1030.o
"/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/bin/clang-offload-bundler" -type=o -bundle-align=4096 -targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx1030 -input=/dev/null -input=/build/rocblas_ger_kernels-81d002/rocblas_ger_kernels-gfx1030.out -output=/build/rocblas_ger_kernels-5d7861.hipfb
"/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/bin/clang-15" -cc1 -triple x86_64-unknown-linux-gnu -aux-triple amdgcn-amd-amdhsa -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name rocblas_ger_kernels.cpp -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fmath-errno -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -target-feature +f16c -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/build/source/build/library/src -resource-dir /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root -dependency-file CMakeFiles/rocblas.dir/blas2/rocblas_ger_kernels.cpp.o.d -MT library/src/CMakeFiles/rocblas.dir/blas2/rocblas_ger_kernels.cpp.o -sys-header-deps -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers -idirafter /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -include __clang_hip_runtime_wrapper.h -isystem /nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/lib/clang/15.0.0/include/.. 
-isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -isystem /nix/store/p33hpnqy7qqbd6lrfyysgfi2m3a7k68i-hip-5.3.1/include -idirafter /nix/store/ybkyabc23chdfy48n3h1zqwa57vp38wd-glibc-2.35-163-dev/include -isystem /nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include -isystem /nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/include -isystem /nix/store/1l5gp04v4jvc7wr8dyv12jy8mszma9w0-ncurses-6.3-p20220507-dev/include -isystem /nix/store/smfnf8ayl3473bqlhwizl9r18rphydjp-zlib-1.2.12-dev/include -isystem /nix/store/b5rhlkj0b25yp21gm7l5karjmxh6mqhk-rocm-comgr-5.3.1/include -isystem /nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include -isystem /nix/store/ydj66x49m6f4ikaw4b2hlkdfxd5vdf6s-rocm-thunk-5.3.1/include -isystem /nix/store/fkcl1wzq3106qqgl84bhgk1lp56q6bzg-python3-3.10.7/include -isystem /nix/store/5bj9nfnm8ql21zcsfca48z5rglhpbd4p-msgpack-3.3.0/include -isystem /nix/store/zfs5p0dxvqkyws7kvbl82b3lym44cg9i-libxml2-2.10.2-dev/include -isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0 -isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu -D BUILD_WITH_TENSILE=1 -D ROCBLAS_INTERNAL_API -D ROCM_USE_FLOAT16 -D TENSILE_DEFAULT_SERIALIZATION -D TENSILE_MSGPACK=1 -D TENSILE_USE_HIP -D __HIP_PLATFORM_AMD__=1 -D __HIP_PLATFORM_HCC__=1 -D rocblas_EXPORTS -I /build/source/library/include -I /build/source/library/include/internal -I /build/source/library/src/include -I /build/source/build/include/rocblas/internal -I /build/source/build/include/rocblas -I /build/source/build/include -I /build/source/library/src/blas3/Tensile -I /build/source/tensile/lib/python3.10/site-packages/Tensile/Source/lib/include -D __HIP_HCC_COMPAT_MODE__=1 -D NDEBUG -internal-isystem 
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0 -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/x86_64-unknown-linux-gnu -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0 -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/x86_64-unknown-linux-gnu -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include -internal-isystem /usr/local/include -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include -internal-isystem /usr/local/include -internal-isystem /nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -Wno-unused-result -Werror=vla -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/build/source/build/library/src -ferror-limit 19 -fvisibility hidden -fvisibility-inlines-hidden -fhip-new-launch-api -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops 
-vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcuda-include-gpubinary /build/rocblas_ger_kernels-5d7861.hipfb -cuid=7feb3acf29116291 -fcuda-allow-variadic-functions -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o CMakeFiles/rocblas.dir/blas2/rocblas_ger_kernels.cpp.o -x hip /build/source/library/src/blas2/rocblas_ger_kernels.cpp
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/usr/include"
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/usr/include"
ignoring duplicate directory "/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include"
ignoring duplicate directory "/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include"
ignoring duplicate directory "/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu"
ignoring duplicate directory "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward"
ignoring duplicate directory "/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include"
ignoring duplicate directory "/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include"
#include "..." search starts here:
#include <...> search starts here:
/build/source/library/include
/build/source/library/include/internal
/build/source/library/src/include
/build/source/build/include/rocblas/internal
/build/source/build/include/rocblas
/build/source/build/include
/build/source/library/src/blas3/Tensile
/build/source/tensile/lib/python3.10/site-packages/Tensile/Source/lib/include
/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/lib/clang/15.0.0/include/..
/nix/store/hl5h7vvpgzxw8938fkzxm9gxcxbv7n54-rocm-runtime-5.3.1/include
/nix/store/mijv88c7rfj5nna4pyb8dp9x4p9y9yfk-hip-5.3.1/include
/nix/store/p33hpnqy7qqbd6lrfyysgfi2m3a7k68i-hip-5.3.1/include
/nix/store/66nlmbkqslw49yk6v2751p427fmqgrgq-rocm-llvm-5.3.1/include
/nix/store/1l5gp04v4jvc7wr8dyv12jy8mszma9w0-ncurses-6.3-p20220507-dev/include
/nix/store/smfnf8ayl3473bqlhwizl9r18rphydjp-zlib-1.2.12-dev/include
/nix/store/b5rhlkj0b25yp21gm7l5karjmxh6mqhk-rocm-comgr-5.3.1/include
/nix/store/ydj66x49m6f4ikaw4b2hlkdfxd5vdf6s-rocm-thunk-5.3.1/include
/nix/store/fkcl1wzq3106qqgl84bhgk1lp56q6bzg-python3-3.10.7/include
/nix/store/5bj9nfnm8ql21zcsfca48z5rglhpbd4p-msgpack-3.3.0/include
/nix/store/zfs5p0dxvqkyws7kvbl82b3lym44cg9i-libxml2-2.10.2-dev/include
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0/x86_64-unknown-linux-gnu
/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers
/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/lib64/gcc/x86_64-unknown-linux-gnu/11.3.0/../../../../include/c++/11.3.0/backward
/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include
/nix/store/ybkyabc23chdfy48n3h1zqwa57vp38wd-glibc-2.35-163-dev/include
End of search list.
The GCC toolchain does seem to get included before some of the llvm/clang includes.
The llvm-wrapper is clang.
Otherwise, applying your commit as a patch does fix everything.
Well, the patch should be safe as it passed all internal tests, but /nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers seems late in the order. I'm not sure why the include comes from llvm-wrapper rather than rocm-llvm-5.3.1; did you have to modify the ROCm LLVM installation?
Here is a 5.3 search list from Ubuntu 20:
/var/tmp/rocBLAS/library/include
/var/tmp/rocBLAS/library/include/internal
/var/tmp/rocBLAS/library/src/include
/var/tmp/rocBLAS/build/release/include/rocblas/internal
/var/tmp/rocBLAS/build/release/include/rocblas
/var/tmp/rocBLAS/build/release/include
/opt/rocm-5.3.0/llvm/lib/clang/15.0.0/include/..
/opt/rocm-5.3.0/hsa/include
/opt/rocm-5.3.0/include
/opt/rocm-5.3.0/llvm/lib/clang/15.0.0/include/cuda_wrappers
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward
/opt/rocm-5.3.0/llvm/lib/clang/15.0.0/include
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
And this one is from a later, non-release build on Ubuntu 22, as a reference for the more expected search order:
/rocBLAS-internal/library/include
/rocBLAS-internal/library/include/internal
/rocBLAS-internal/library/src/include
rocBLAS-internal/build/release/include/rocblas/internal
/rocBLAS-internal/build/release/include/rocblas
/rocBLAS-internal/build/release/include
/opt/rocm-5.4.0-10923/llvm/lib/clang/15.0.0/include/..
/opt/rocm-5.4.0-10923/hsa/include
/opt/rocm-5.4.0-10923/include
/opt/rocm-5.4.0-10923/llvm/lib/clang/15.0.0/include/cuda_wrappers
/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11
/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11
/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/backward
/opt/rocm-5.4.0-10923/llvm/lib/clang/15.0.0/include
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
There is an "ignoring duplicates" step done by the compiler that may be affecting the order. You could see whether you can force your llvm-wrapper include earlier, in which case you may not need the patch (which would be expected in ROCm 5.5, no ETA).
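For anyone comparing search lists like the ones in this thread, here is a rough Python sketch of the ordering check being discussed: does the clang `cuda_wrappers` directory come before the GCC C++ header directories? The helper names (`first_index`, `wrappers_before_libstdcxx`) are made up for illustration and are not part of rocBLAS, hipcc, or nixpkgs, and the two lists are short excerpts of the paths quoted in this thread, not full search lists.

```python
from typing import List, Optional


def first_index(dirs: List[str], needle: str) -> Optional[int]:
    """Return the index of the first directory containing `needle`, or None."""
    for i, d in enumerate(dirs):
        if needle in d:
            return i
    return None


def wrappers_before_libstdcxx(dirs: List[str]) -> bool:
    """True if a cuda_wrappers dir precedes the first GCC C++ header dir."""
    wrappers = first_index(dirs, "cuda_wrappers")
    libstdcxx = first_index(dirs, "/include/c++/")
    if wrappers is None:
        return False
    return libstdcxx is None or wrappers < libstdcxx


# Excerpt of the Ubuntu 20 / ROCm 5.3 search list quoted in this thread.
ubuntu = [
    "/opt/rocm-5.3.0/include",
    "/opt/rocm-5.3.0/llvm/lib/clang/15.0.0/include/cuda_wrappers",
    "/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9",
    "/opt/rocm-5.3.0/llvm/lib/clang/15.0.0/include",
]

# Excerpt of the Nix search list quoted in this thread.
nix = [
    "/nix/store/acbklvmaxi32lj3f7k1m1y00017f89ix-gcc-11.3.0/include/c++/11.3.0",
    "/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include/cuda_wrappers",
    "/nix/store/69g2qyb4xgmiah4n8z7sfy34vg5fza89-rocm-llvm-wrapper-5.3.1/resource-root/include",
]

print(wrappers_before_libstdcxx(ubuntu))  # True
print(wrappers_before_libstdcxx(nix))     # False
```

This only compares positions in a list that has already been printed. It does not model how the compiler drops duplicate directories, so it can flag a suspicious order but not explain it.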
The latest llvm aside from my update PR is in https://github.com/NixOS/nixpkgs/blob/842a2c2399ae1557d4543a76d65b900d9ffe8d52/pkgs/development/compilers/llvm/rocm/llvm.nix and the directory it resides in.
I didn't make it so this may be a question better directed to acowley, lovesegfault, or Flakebi. I'll ping them on the tracking issue and link them here.
This may also solve some issues with rocThrust, which I suspect can be solved in a similar manner to your patch, or by correcting the ordering on our side.
Looks like it was a faulty toolchain include order on our side after all!
I rewrote the rocm-llvm derivation among other things and now we don't even need the patch.
Thank you for your help.
Describe the bug
rocBLAS or Tensile will not correctly generate .co and .dat files when building. (TensileCreateLibraryFiles)
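A quick way to check for this symptom after a build is sketched below in Python. It is not a rocBLAS or Tensile tool: `find_outputs` and `missing_suffixes` are hypothetical helpers, and the relative path `build/Tensile/library` assumes the script is run from the rocBLAS source root with the build directory used in this report.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_outputs(library_dir: str,
                 suffixes: Iterable[str] = (".co", ".dat")) -> Dict[str, List[str]]:
    """Map each suffix to the sorted names of matching files in library_dir."""
    found: Dict[str, List[str]] = {s: [] for s in suffixes}
    root = Path(library_dir)
    if root.is_dir():
        for entry in sorted(root.iterdir()):
            # Path.suffix is the last extension, e.g. ".dat".
            if entry.is_file() and entry.suffix in found:
                found[entry.suffix].append(entry.name)
    return found


def missing_suffixes(found: Dict[str, List[str]]) -> List[str]:
    """Return the suffixes for which no file was found."""
    return [suffix for suffix, names in found.items() if not names]


# Assumed location, taken from the expected-behavior text of this report.
found = find_outputs("build/Tensile/library")
for suffix in missing_suffixes(found):
    print(f"no {suffix} files found in build/Tensile/library")
```

A missing directory is treated the same as an empty one, so the check reports both suffixes as missing if Tensile never created the library folder.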
To Reproduce
7294a708d91a68e6179d6d1bc74b254686ee8030 (also latest develop and a few release versions back)
Steps to reproduce the behavior:
nix-shell -I nixpkgs=/home/mado/Documents/Development/nixpkgs -p stdenv cmake rocm-cmake rocm-runtime rocm-device-libs rocm-comgr hip llvmPackages.openmp llvmPackages.openmp python3 python3Packages.pyyaml python3Packages.msgpack boost.dev llvmPackages_rocm.clang llvmPackages_rocm.llvm msgpack libxml2 python3Packages.wheel gtest python3Packages.pandas
CXX=hipcc CC=hipcc FC=gfortran cmake -DCMAKE_C_COMPILER=hipcc -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_FC_COMPILER=gfortran -Dpython=python3 -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx803 -DTENSILE_VENV_UPGRADE_PIP=ON -DTensile_LAZY_LIBRARY_LOADING=ON ..
CXX=hipcc CC=hipcc FC=gfortran cmake --build . -j 32
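Since the thread compares this configure command against the Tensile flags recommended earlier (`-DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LAZY_LIBRARY_LOADING=ON -DTensile_LIBRARY_FORMAT=msgpack`), here is a small Python sketch that lists which of those are not set explicitly on a given command line. The helper names are made up for illustration, and the parsing is deliberately simple: it only understands `-DNAME=VALUE`, not `-DNAME:TYPE=VALUE`.

```python
import shlex
from typing import Dict

# Flags recommended earlier in this thread.
RECOMMENDED = {
    "Tensile_LOGIC": "asm_full",
    "Tensile_CODE_OBJECT_VERSION": "V3",
    "Tensile_SEPARATE_ARCHITECTURES": "ON",
    "Tensile_LAZY_LIBRARY_LOADING": "ON",
    "Tensile_LIBRARY_FORMAT": "msgpack",
}


def cmake_defines(command: str) -> Dict[str, str]:
    """Collect -DNAME=VALUE arguments from a cmake command line."""
    defines: Dict[str, str] = {}
    for arg in shlex.split(command):
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            defines[name] = value
    return defines


def missing_or_different(command: str) -> Dict[str, str]:
    """Return recommended flags that are absent or set to another value."""
    given = cmake_defines(command)
    return {k: v for k, v in RECOMMENDED.items() if given.get(k) != v}


# The configure command from the reproduction steps.
command = (
    "CXX=hipcc CC=hipcc FC=gfortran cmake -DCMAKE_C_COMPILER=hipcc "
    "-DCMAKE_CXX_COMPILER=hipcc -DCMAKE_FC_COMPILER=gfortran -Dpython=python3 "
    "-DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx803 "
    "-DTENSILE_VENV_UPGRADE_PIP=ON -DTensile_LAZY_LIBRARY_LOADING=ON .."
)

for name, value in missing_or_different(command).items():
    print(f"not set explicitly: -D{name}={value}")
```

A flag reported here is only absent from the command line. The project's defaults may still apply it, as the `Tensile_CREATE_COMMAND` line in the configure log suggests for several of these options.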
Expected behavior
When using BUILD_WITH_TENSILE, the .co and .dat files in build/Tensile/library should be present.
Log-files
Several variations of the cmake configure command and of the target architectures were tried; gfx803 is used here for rapid testing.
Configure:
```console rocBLAS/build (7294a70) [$] via △ v3.24.2 via ❄️ impure (shell) took 23s ❯ CXX=hipcc CC=hipcc FC=gfortran cmake -DCMAKE_C_COMPILER=hipcc -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_FC_COMPILER=gfortran -Dpython=python3 -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx803 -DTENSILE_VENV_UPGRADE_PIP=ON -DTensile_LAZY_LIBRARY_LOADING=ON .. -- Use hip-clang to build for amdgpu backend -- OS detected is nixos /nix/store/fkcl1wzq3106qqgl84bhgk1lp56q6bzg-python3-3.10.7/bin/python3 -m venv /home/mado/Downloads/rocBLAS/build/virtualenv --system-site-packages --clear virtualenv python version: /home/mado/Downloads/rocBLAS/build/virtualenv/bin/python3 Python 3.10.7 /home/mado/Downloads/rocBLAS/build/virtualenv/bin/python3 -m pip install --upgrade pip Requirement already satisfied: pip in ./virtualenv/lib/python3.10/site-packages (22.2.2) Collecting pip Using cached pip-22.3-py3-none-any.whl (2.1 MB) Installing collected packages: pip Attempting uninstall: pip Found existing installation: pip 22.2.2 Uninstalling pip-22.2.2: Successfully uninstalled pip-22.2.2 Successfully installed pip-22.3 /home/mado/Downloads/rocBLAS/build/virtualenv/bin/python3 -m pip install git+https://github.com/ROCmSoftwarePlatform/Tensile.git@b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 Collecting git+https://github.com/ROCmSoftwarePlatform/Tensile.git@b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 Cloning https://github.com/ROCmSoftwarePlatform/Tensile.git (to revision b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08) to /run/user/1000/pip-req-build-jrmhw2bh Running command git clone --filter=blob:none --quiet https://github.com/ROCmSoftwarePlatform/Tensile.git /run/user/1000/pip-req-build-jrmhw2bh Running command git rev-parse -q --verify 'sha^b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08' Running command git fetch -q https://github.com/ROCmSoftwarePlatform/Tensile.git b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 Running command git checkout -q b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 Resolved 
https://github.com/ROCmSoftwarePlatform/Tensile.git to commit b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 Running command git submodule update --init --recursive -q Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done' Requirement already satisfied: pyyaml in /nix/store/hfd2mlapz5ljh0y0qckkr0x560f2zdhf-python3.10-pyyaml-6.0/lib/python3.10/site-packages (from Tensile==4.34.0) (6.0) Requirement already satisfied: msgpack in /nix/store/570ldhh0fsq1fnfr9k0n6qn5kkx9nsnc-python3.10-msgpack-1.0.4/lib/python3.10/site-packages (from Tensile==4.34.0) (1.0.4) Building wheels for collected packages: Tensile Building wheel for Tensile (setup.py): started Building wheel for Tensile (setup.py): finished with status 'done' Created wheel for Tensile: filename=Tensile-4.34.0-py3-none-any.whl size=4624539 sha256=ce9066ab9234b353505905c66ef0fa0d6e73a9a43e7bf65b9123231850d342ec Stored in directory: /home/mado/.cache/pip/wheels/55/bb/d4/8e72757a3bad7dbfa969bc15f51aac656b67abc84c4b04a45a Successfully built Tensile Installing collected packages: Tensile Successfully installed Tensile-4.34.0 -- using GIT Tensile fork=ROCmSoftwarePlatform from branch=b33ca97af456cda14f7b1ec9bcc8aeab3ed6dd08 -- Adding /home/mado/Downloads/rocBLAS/build/virtualenv to CMAKE_PREFIX_PATH -- hip::amdhip64 is SHARED_LIBRARY -- hip::amdhip64 is SHARED_LIBRARY -- Using AMDGPU_TARGETS: gfx803 -- Tensile script: /home/mado/Downloads/rocBLAS/build/virtualenv/lib/python3.10/site-packages/Tensile/bin/TensileCreateLibrary -- Tensile_CREATE_COMMAND: /home/mado/Downloads/rocBLAS/build/virtualenv/lib/python3.10/site-packages/Tensile/bin/TensileCreateLibrary;--merge-files;--separate-architectures;--lazy-library-loading;--no-short-file-names;--no-library-print-debug;--code-object-version=V3;--cxx-compiler=hipcc;--library-format=msgpack;--architecture=gfx803;/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full;/home/mado/Downloads/rocBLAS/build/Tensile;HIP -- 
Tensile_MANIFEST_FILE_PATH: /home/mado/Downloads/rocBLAS/build/Tensile/library/TensileManifest.txt '/home/mado/Downloads/rocBLAS/build/virtualenv/lib/python3.10/site-packages/Tensile/bin/TensileCreateLibrary' '--merge-files' '--separate-architectures' '--lazy-library-loading' '--no-short-file-names' '--no-library-print-debug' '--code-object-version=V3' '--cxx-compiler=hipcc' '--library-format=msgpack' '--architecture=gfx803' '/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full' '/home/mado/Downloads/rocBLAS/build/Tensile' 'HIP' '--generate-manifest-and-exit' ################################################################################ # Tensile Create Library # Detected local GPU with ISA: gfx1030 # Detected local GPU with ISA: gfx1030 cap gfx000 gfx803 gfx900 gfx906 gfx908 gfx90a gfx1010 gfx1011 gfx1012 gfx1030 gfx1100 gfx1101 gfx1102 HasMFMA_bf16_1k 0 0 0 0 0 0 0 0 0 0 0 0 0 HasAddLshl 0 0 0 0 0 0 0 0 0 0 0 0 0 HasAtomicAdd 0 0 0 0 0 0 0 0 0 0 0 0 0 HasCodeObjectV3 0 0 0 0 0 0 0 0 0 0 0 0 0 HasDirectToLds 0 0 0 0 0 0 0 0 0 0 0 0 0 HasExplicitCO 0 0 0 0 0 0 0 0 0 0 0 0 0 HasExplicitNC 0 0 0 0 0 0 0 0 0 0 0 0 0 HasLshlOr 0 0 0 0 0 0 0 0 0 0 0 0 0 HasMFMA 0 0 0 0 0 0 0 0 0 0 0 0 0 HasSMulHi 0 0 0 0 0 0 0 0 0 0 0 0 0 MaxLgkmcnt 1 1 1 1 1 1 1 1 1 1 1 1 1 MaxVmcnt 0 0 0 0 0 0 0 0 0 0 0 0 0 SupportedISA 0 0 0 0 0 0 0 0 0 0 0 0 0 SupportedSource 1 1 1 1 1 1 1 1 1 1 1 1 1 v_dot2_f32_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_dot2c_f32_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fmac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_pk_fma_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_pk_fmac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_mix_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fmac_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mac_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mad_mix_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 HasMFMA_f64 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f64 0 0 0 0 0 0 0 0 0 0 0 0 0 VOP3v_dot4_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 
v_dot4_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 v_dot4c_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 ArchAccUnifiedRegs 0 0 0 0 0 1 0 0 0 0 0 0 0 CMPXWritesSGPR 1 1 1 1 1 1 0 0 0 0 0 0 0 HasAccCD 0 0 0 0 0 1 0 0 0 0 0 0 0 HasEccHalf 0 0 0 1 1 1 0 0 0 0 0 0 0 HasWave32 0 0 0 0 0 0 1 1 1 1 1 1 1 SeparateVscnt 0 0 0 0 0 0 1 1 1 1 1 1 1 Waitcnt0Disabled 0 0 0 0 1 1 0 0 0 0 0 0 0 # Found hipcc version 5.3.22061- # CodeObjectVersion from TensileCreateLibrary: V3 # CxxCompiler from TensileCreateLibrary: hipcc # Architecture from TensileCreateLibrary: gfx803 # LibraryFormat from TensileCreateLibrary: msgpack # LibraryLogicFiles: # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_DB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BSS_BH_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_ZB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BSS_BH_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BBS_BH.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_ZB_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bljk_SB.yaml Reading logic files: Launching 32 threads... Reading logic files: Done. [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||] 100% (0.7 secs elapsed) -- *** NOTE: blas2/rocblas_ger_kernels.cpp is compiled with the verbose flag -v for QC purposes. -- Configuring done -- Generating done -- Build files have been written to: /home/mado/Downloads/rocBLAS/build ```Build:
```console rocBLAS/build (7294a70) [$] via △ v3.24.2 via ❄️ impure (shell) took 22s ❯ CXX=hipcc CC=hipcc FC=gfortran cmake --build . -j 32 [ 0%] Built target rocblas_device_malloc [ 1%] Built target rocblas_proto_templates [ 1%] Generating Tensile Libraries ################################################################################ # Tensile Create Library # Detected local GPU with ISA: gfx1030 # Detected local GPU with ISA: gfx1030 cap gfx000 gfx803 gfx900 gfx906 gfx908 gfx90a gfx1010 gfx1011 gfx1012 gfx1030 gfx1100 gfx1101 gfx1102 HasMFMA_bf16_1k 0 0 0 0 0 0 0 0 0 0 0 0 0 HasAddLshl 0 0 0 0 0 0 0 0 0 0 0 0 0 HasAtomicAdd 0 0 0 0 0 0 0 0 0 0 0 0 0 HasCodeObjectV3 0 0 0 0 0 0 0 0 0 0 0 0 0 HasDirectToLds 0 0 0 0 0 0 0 0 0 0 0 0 0 HasExplicitCO 0 0 0 0 0 0 0 0 0 0 0 0 0 HasExplicitNC 0 0 0 0 0 0 0 0 0 0 0 0 0 HasLshlOr 0 0 0 0 0 0 0 0 0 0 0 0 0 HasMFMA 0 0 0 0 0 0 0 0 0 0 0 0 0 HasSMulHi 0 0 0 0 0 0 0 0 0 0 0 0 0 MaxLgkmcnt 1 1 1 1 1 1 1 1 1 1 1 1 1 MaxVmcnt 0 0 0 0 0 0 0 0 0 0 0 0 0 SupportedISA 0 0 0 0 0 0 0 0 0 0 0 0 0 SupportedSource 1 1 1 1 1 1 1 1 1 1 1 1 1 v_dot2_f32_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_dot2c_f32_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fmac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_pk_fma_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_pk_fmac_f16 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_mix_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fmac_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mac_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 v_mad_mix_f32 0 0 0 0 0 0 0 0 0 0 0 0 0 HasMFMA_f64 0 0 0 0 0 0 0 0 0 0 0 0 0 v_fma_f64 0 0 0 0 0 0 0 0 0 0 0 0 0 VOP3v_dot4_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 v_dot4_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 v_dot4c_i32_i8 0 0 0 0 0 0 0 0 0 0 0 0 0 ArchAccUnifiedRegs 0 0 0 0 0 1 0 0 0 0 0 0 0 CMPXWritesSGPR 1 1 1 1 1 1 0 0 0 0 0 0 0 HasAccCD 0 0 0 0 0 1 0 0 0 0 0 0 0 HasEccHalf 0 0 0 1 1 1 0 0 0 0 0 0 0 HasWave32 0 0 0 0 0 0 1 1 1 1 1 1 1 SeparateVscnt 0 0 0 0 0 0 1 1 1 1 1 1 1 Waitcnt0Disabled 0 0 0 
0 1 1 0 0 0 0 0 0 0 # Found hipcc version 5.3.22061- # CodeObjectVersion from TensileCreateLibrary: V3 # CxxCompiler from TensileCreateLibrary: hipcc # Architecture from TensileCreateLibrary: gfx803 # LibraryFormat from TensileCreateLibrary: msgpack # LibraryLogicFiles: # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_BSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HB_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_BSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_DB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Ailk_Bljk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_CB_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_AlikC_Bljk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_BjlkC_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_BSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_DB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bjlk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_4xi8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_4xi8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BBS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BBS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_BSS_BH_GB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_CB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_CB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_DB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HHS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HHS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HSS_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_HSS_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_I8II_BH.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_I8II_BH_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_SB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_ZB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/hip_Cijk_Alik_Bljk_ZB_GB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bljk_DB.yaml # 
/home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bljk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bjlk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bjlk_SB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bljk_DB.yaml # /home/mado/Downloads/rocBLAS/library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Alik_Bljk_SB.yaml
Reading logic files: Launching 32 threads...
Reading logic files: Done.
[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||] 100% (0.7 secs elapsed)
# Writing Custom CMake
# Writing Kernels...
Generating kernels: Launching 32 threads...
warning: ISA: (8, 0, 3) is not supported; overriding with (9, 0, 0)
... omitted duplicates ...
warning: ISA: (8, 0, 3) is not supported; overriding with (9, 0, 0)
Generating kernels: Done.
* Compiling source kernels: Launching 32 threads...
Compiling source kernels: Done.
# Kernel Building elapsed time = 12.2 secs
Traceback (most recent call last):
File "/home/mado/Downloads/rocBLAS/build/library/src/../../virtualenv/lib/python3.10/site-packages/Tensile/bin/TensileCreateLibrary", line 43, in

Environment
See: https://github.com/NixOS/nixpkgs/search?q=rocm&type=
environment.txt
Additional context
Trying to port as many ROCm packages to nixpkgs as I can: https://github.com/NixOS/nixpkgs/issues/197885

I don't think this is a Tensile issue, because looking at stable-diffusion-webui, using pytorch, I do see the correct generated .co and .dat files. I also tried a slightly modified `install.sh` with defaults to make sure it wasn't a special value being set; unfortunately, it gives me the same problem.

List of files that ARE generated:
```console
TensileManifest.txt
TensileLibrary_Type_DD_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_DD_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_SS_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_AlikC_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_AlikC_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Ailk_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_AlikC_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Alik_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_AlikC_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Ailk_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_AlikC_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_AlikC_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Alik_BjlkC_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_SS_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_DD_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_DD_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_SS_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_CC_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BS_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HS_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
Kernels.so-000-gfx803.hsaco
TensileLibrary_Type_HH_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_SS_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_ZZ_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BB_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HS_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BS_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HS_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BS_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_4xi8I_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_I8I_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HH_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_HS_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_4xi8I_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BB_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BB_HPA_Contraction_l_Ailk_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_4xi8I_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_4xi8I_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_I8I_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_I8I_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_BS_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
TensileLibrary_Type_I8I_HPA_Contraction_l_Alik_Bjlk_Cijk_Dijk_fallback_gfx803.hsaco
```
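For anyone comparing their own output directory against the list above, here is a rough sketch of how the check could be scripted. It only counts files by suffix and reports whether any `.dat` or `.co` files are present. The helper name `summarize_tensile_outputs` is made up for illustration, and treating `.dat` and `.co` as the expected suffixes is an assumption taken from this thread, not from Tensile's documentation; which files a build should produce may depend on the CMake flags used.

```python
from collections import Counter
from pathlib import Path


def summarize_tensile_outputs(library_dir):
    """Count files in library_dir by suffix and list expected suffixes with no files.

    Hypothetical helper: the expected suffixes (.dat, .co) are an assumption
    based on the files mentioned in this issue, not an official Tensile contract.
    """
    counts = Counter(
        p.suffix for p in Path(library_dir).iterdir() if p.is_file()
    )
    # Counter returns 0 for suffixes it has never seen.
    missing = [ext for ext in (".dat", ".co") if counts[ext] == 0]
    return counts, missing
```

Pointing it at the directory that holds the generated library files (for example the `Tensile/library` directory of your own build tree, wherever that is on your system) returns the per-suffix counts plus the list of suffixes with no files at all.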