hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionality beyond that of a traditional BLAS library.
Summary:
This PR adds a compression layer to all final code objects, thereby generating smaller libraries at the expense of build time. Includes minor refactoring.
Outcomes:
A new clang-offload-bundler invocation is added after assembly object linking.
getAssemblyCodeObjectFiles has been renamed to buildAssemblyCodeObjectFiles to match the naming of the corresponding source kernel functions.
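For reviewers unfamiliar with the bundler step, the extra invocation can be pictured with the minimal sketch below. It is not the code in this PR: the helper name `compress_bundle_args`, the file names, the host triple, and the `hipv4-amdgcn-amd-amdhsa--<arch>` target ID are all assumptions chosen for illustration. Only the bundler options themselves (`--compress`, `--type`, `--targets`, `--input`, `--output`) come from the clang-offload-bundler documentation.

```python
def compress_bundle_args(code_object, output, arch,
                         bundler="clang-offload-bundler"):
    """Build an argument list that re-bundles one linked code object
    with compression enabled.

    Hypothetical helper: the name, target strings, and paths are
    illustrative assumptions, not taken from this PR.
    """
    return [
        bundler,
        "--compress",   # enable bundle compression
        "--type=o",     # inputs are object files
        # One target per input; the host entry is a placeholder.
        f"--targets=host-x86_64-unknown-linux-gnu,hipv4-amdgcn-amd-amdhsa--{arch}",
        "--input=/dev/null",        # empty host input (assumption)
        f"--input={code_object}",   # the linked assembly code object
        f"--output={output}",
    ]


# Hypothetical file names, for illustration only.
args = compress_bundle_args("kernels_gfx90a.linked.co",
                            "kernels_gfx90a.co", "gfx90a")
print(" ".join(args))
```

The list form is meant to be handed to `subprocess.run(args, check=True)` so that no shell quoting is involved. The sketch only builds the command and does not run the bundler.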
| Build | Time | Build Size (build/library/ directory) |
|---|---|---|
| Feature, gfx90a | 8m38.902s | 269M |
| Develop, gfx90a | 8m3.921s | 812M |
| Feature, gfx90a | | 2.4G |
| Develop, gfx90a | | 11G |
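To make the trade-off concrete, the figures reported above can be turned into relative numbers. The short calculation below uses only the reported values. The parsing helpers are hypothetical conveniences, and the `M`/`G` suffixes are assumed to be binary units, which does not affect the ratios because each comparison stays within one unit.

```python
import re


def parse_duration(text):
    """Convert a `time`-style string such as '8m38.902s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    return int(match.group(1)) * 60 + float(match.group(2))


def parse_size(text):
    """Convert a size such as '269M' or '2.4G' to mebibytes."""
    units = {"M": 1, "G": 1024}
    return float(text[:-1]) * units[text[-1]]


feature_time = parse_duration("8m38.902s")
develop_time = parse_duration("8m3.921s")
time_overhead_pct = 100 * (feature_time - develop_time) / develop_time

# First pair of size rows (269M vs 812M).
size_reduction_pct = 100 * (1 - parse_size("269M") / parse_size("812M"))
# Second pair of size rows (2.4G vs 11G).
size_reduction_large_pct = 100 * (1 - parse_size("2.4G") / parse_size("11G"))

print(f"build time overhead: {time_overhead_pct:.1f}%")        # 7.2%
print(f"size reduction:      {size_reduction_pct:.1f}%")       # 66.9%
print(f"size reduction:      {size_reduction_large_pct:.1f}%") # 78.2%
```

On these numbers the feature branch takes roughly 35 seconds longer to build and produces a library directory about one third (first pair) to about one fifth (second pair) of the develop size.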
Testing and Environment:
Docker: Ubuntu 24.04, ROCm 6.4 RC stack, AMD clang version 18.0.0, AMD clang-offload-bundler version 18.0.0
Tested with hipBLASLt bench and test clients
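One further sanity check, not part of the testing described above, is to confirm that a final code object really was written as a compressed bundle. The sketch below assumes the magic strings given in the clang-offload-bundler documentation (`CCOB` for compressed bundles, `__CLANG_OFFLOAD_BUNDLE__` for uncompressed ones). The helper name `bundle_kind` and the demo files are hypothetical stand-ins, not real build outputs.

```python
import os
import tempfile

COMPRESSED_MAGIC = b"CCOB"                        # compressed offload bundle
UNCOMPRESSED_MAGIC = b"__CLANG_OFFLOAD_BUNDLE__"  # plain offload bundle
ELF_MAGIC = b"\x7fELF"                            # unbundled code object


def bundle_kind(path):
    """Classify a file by its leading magic bytes (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(len(UNCOMPRESSED_MAGIC))
    if head.startswith(COMPRESSED_MAGIC):
        return "compressed"
    if head == UNCOMPRESSED_MAGIC:
        return "uncompressed"
    if head.startswith(ELF_MAGIC):
        return "elf"
    return "unknown"


# Synthetic stand-in files; a real check would point at build/library/.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "compressed.co": COMPRESSED_MAGIC + b"\x00" * 16,
        "plain.co": UNCOMPRESSED_MAGIC + b"\x00" * 16,
        "raw.co": ELF_MAGIC + b"\x00" * 16,
    }
    kinds = {}
    for name, payload in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(payload)
        kinds[name] = bundle_kind(path)

print(kinds)
```

Reading only the first few bytes keeps the check cheap even for multi-gigabyte library directories.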