ROCm / hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionality beyond that of a traditional BLAS library.
https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html
MIT License

Code object compression via bundling #1374

Open bstefanuk opened 5 days ago


Summary:

This PR adds a compression layer to all final code objects, thereby generating smaller libraries at the expense of build time. Includes minor refactoring.
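The description does not show how the compression step is invoked. As a rough sketch only, assuming it is done with the `--compress` option of `clang-offload-bundler` (the tool listed in the test environment), a build script might assemble a command like the one below. The helper name, the file names, and the target triple are hypothetical and not taken from this PR, and the flag spellings should be checked against the installed bundler version.

```python
import shlex


def bundler_compress_command(code_object, output, gpu_arch="gfx90a",
                             bundler="clang-offload-bundler"):
    """Build (but do not run) a bundler command line that compresses one code object.

    Assumption: the bundler accepts --type, --targets, --input, --output and
    --compress as described in the Clang offload bundler documentation.
    """
    # Hypothetical bundle entry ID for a HIP device code object.
    target = f"hipv4-amdgcn-amd-amdhsa--{gpu_arch}"
    return [
        bundler,
        "--type=o",
        f"--targets={target}",
        f"--input={code_object}",
        f"--output={output}",
        "--compress",
    ]


# Hypothetical file names, for illustration only.
cmd = bundler_compress_command("Kernels.co", "Kernels.bundled.co")
print(shlex.join(cmd))
```

The command is only constructed and printed here, so the sketch runs without a ROCm installation; passing the list to `subprocess.run` would execute it.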

Outcomes:

Branch, target     Build Time        Build Size (build/library/ directory)
Feature, gfx90a    8m38.902s         269M
Develop, gfx90a    8m3.921s          812M
Feature, gfx90a    (not reported)    2.4G
Develop, gfx90a    (not reported)    11G
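To put the trade-off in relative terms, the reported figures can be turned into percentages. The sketch below only reuses the numbers listed above, and assumes the sizes are `du -h` style values where `G` is 1024 times `M`. It works out to roughly 7% more build time for outputs that are about 67% and 78% smaller.

```python
import re


def parse_time(text):
    """Convert a shell `time`-style string such as '8m38.902s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    return int(match.group(1)) * 60 + float(match.group(2))


def parse_size(text):
    """Convert a size such as '269M' or '2.4G' to mebibytes (assumed du -h units)."""
    units = {"M": 1.0, "G": 1024.0}
    return float(text[:-1]) * units[text[-1]]


def reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return 1.0 - after / before


feature_time = parse_time("8m38.902s")
develop_time = parse_time("8m3.921s")
time_overhead = feature_time / develop_time - 1.0

small = reduction(parse_size("812M"), parse_size("269M"))
large = reduction(parse_size("11G"), parse_size("2.4G"))

print(f"build time overhead: {time_overhead:.1%}")
print(f"size reduction (812M -> 269M): {small:.1%}")
print(f"size reduction (11G -> 2.4G): {large:.1%}")
```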

Testing and Environment:

Docker: Ubuntu 24.04, ROCm 6.4 RC stack, AMD clang version 18.0.0, AMD clang-offload-bundler version 18.0.0

Tested with the hipBLASLt bench and test clients.