ROCm / hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionality beyond that of a traditional BLAS library.
https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html
MIT License

[Issue]: The semantics of `hipblasLtMatmulAlgoGetHeuristic` are ambiguous and cause nondeterministic run time. #1113

Closed whchung closed 2 months ago

whchung commented 2 months ago

Problem Description

The documentation of hipblasLtMatmulAlgoGetHeuristic does not make clear how long the API call takes to complete.

We have observed empirically that if returnAlgoCount exceeds the number of pre-determined solutions, the internal function getAllSolutions is invoked, which causes nondeterministic run time in production applications:

https://github.com/ROCm/hipBLASLt/blob/develop/library/src/amd_detail/rocblaslt/src/rocblaslt_auxiliary.cpp#L1364-L1369

Please consider revising the documentation or the implementation so that the run time of the API is more deterministic.
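
To make the reported behavior concrete, here is a toy C++ model of it. This is not hipBLASLt code: `ToySolutionLibrary`, `get_heuristic`, and `full_scans` are hypothetical names invented for illustration, and the real logic lives at the rocblaslt_auxiliary.cpp link above. The sketch assumes only what the description states: a request that fits within the pre-determined solutions is served directly, while a larger request triggers a scan of every solution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a solution library (not a hipBLASLt type).
struct ToySolutionLibrary {
    std::vector<int> heuristic_solutions;  // small pre-determined list (fast path)
    std::vector<int> all_solutions;        // full catalog (slow path)
    std::size_t full_scans = 0;            // how many times the slow path ran

    // Returns up to `requested` solution ids.
    std::vector<int> get_heuristic(std::size_t requested) {
        if (requested <= heuristic_solutions.size()) {
            // Fast path: the pre-determined list alone satisfies the request.
            auto last = heuristic_solutions.begin() +
                        static_cast<std::ptrdiff_t>(requested);
            return std::vector<int>(heuristic_solutions.begin(), last);
        }
        // Slow path: the request exceeds the pre-determined list, so the
        // whole catalog is walked. Its cost grows with the catalog size.
        ++full_scans;
        std::vector<int> out = heuristic_solutions;
        for (int id : all_solutions) {
            if (out.size() == requested) break;
            if (std::find(out.begin(), out.end(), id) == out.end()) {
                out.push_back(id);
            }
        }
        assert(out.size() <= requested);
        return out;
    }
};
```

In this model, a caller that needs predictable latency keeps the requested count at or below the size of the pre-determined list. Whether the same threshold applies to a given hipBLASLt release is an assumption to verify against the linked source.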

Operating System

Ubuntu 22.04

CPU

x86

GPU

AMD Instinct MI250X

Other

No response

ROCm Version

ROCm 5.7.1

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

KKyang commented 2 months ago

https://github.com/ROCm/hipBLASLt/pull/1116