ROCm / hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionality beyond that of a traditional BLAS library.
https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html
MIT License

Tune Aquavanjaram942X Grid sizes for HHS TN #1096

Closed: aferoz21 closed this 2 weeks ago

aferoz21 commented 2 weeks ago

1) Overall ~4% improvement in weighted time across the ~10 tuned sizes.
2) No new kernels were added; the kernels for existing grid points were updated.
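For readers unfamiliar with the metric, a "weighted time" improvement of this kind can be sketched as follows. This is only an illustration: the PR does not state how the weights were chosen, so the assumption here is that each problem size's kernel time is multiplied by a weight reflecting how often that size occurs. The function names and all numbers are hypothetical and are not taken from the PR.

```python
def weighted_time(times_us, weights):
    """Sum of per-size kernel times, each scaled by its weight (assumed definition)."""
    if len(times_us) != len(weights):
        raise ValueError("times_us and weights must have the same length")
    return sum(t * w for t, w in zip(times_us, weights))


def improvement_pct(before_us, after_us, weights):
    """Percentage reduction in weighted time after tuning."""
    base = weighted_time(before_us, weights)
    tuned = weighted_time(after_us, weights)
    return 100.0 * (base - tuned) / base


# Hypothetical timings (microseconds) and weights for two problem sizes.
before_us = [100.0, 200.0]
after_us = [90.0, 180.0]
weights = [3, 1]

pct = improvement_pct(before_us, after_us, weights)
print(f"weighted-time improvement: {pct:.1f}%")
```

With a metric like this, a heavily weighted size dominates the result, so a small gain on a frequent size can outweigh a larger gain on a rare one.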

nakajee commented 2 weeks ago

If you apply the noCI label, please run hipblaslt-test on your local node (with the change) and paste the result here.

nakajee commented 2 weeks ago

I think 128x256 (or 256x128) is better... This may be due to a bug in the generator. Would you try again with the latest one? Or manually add 128x256 and 256x128 (or double the M or N side for anything other than 128x128).

jichangjichang commented 2 weeks ago

If this PR is necessary for 6.3, please remember to file a PR to release-staging/rocm-rel-6.3 as well.

aferoz21 commented 2 weeks ago

> I think 128x256 (or 256x128) is better... This may be due to a bug in the generator. Would you try again with the latest one? Or manually add 128x256 and 256x128 (or double the M or N side for anything other than 128x128).

Yes. 128x256 or 256x128 was picked for most sizes, and there is a small improvement.

aferoz21 commented 2 weeks ago

> If you apply the noCI label, please run hipblaslt-test on your local node (with the change) and paste the result here.

[----------] 1 test from ExtOpTest/ExtOpAMaxWithScaleUnsupportedDatatypeTest
[ RUN      ] ExtOpTest/ExtOpAMaxWithScaleUnsupportedDatatypeTest.amaxWithScaleFailureUnsupportedDatatype/0
[       OK ] ExtOpTest/ExtOpAMaxWithScaleUnsupportedDatatypeTest.amaxWithScaleFailureUnsupportedDatatype/0 (0 ms)
[----------] 1 test from ExtOpTest/ExtOpAMaxWithScaleUnsupportedDatatypeTest (0 ms total)

[----------] Global test environment tear-down
[==========] 48206 tests from 13 test suites ran. (2096949 ms total)
[  PASSED  ] 48206 tests.
hipBLASLt version: 1000
command line: ./hipblaslt-test
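When a long test log like the one above is pasted into a PR, the closing summary lines can be checked mechanically. The following is a minimal sketch, not part of hipBLASLt: `summarize_gtest` is a hypothetical helper that assumes the standard GoogleTest summary format shown in the log, and it tolerates collapsed whitespace inside the bracketed tags.

```python
import re


def summarize_gtest(log: str) -> dict:
    """Extract totals from the closing summary lines of a GoogleTest log."""
    ran = re.search(
        r"\[=+\]\s*(\d+) tests? from (\d+) test (?:suites?|cases?) ran\.\s*\((\d+) ms total\)",
        log,
    )
    if ran is None:
        raise ValueError("no GoogleTest summary line found")
    passed = re.search(r"\[\s*PASSED\s*\]\s*(\d+) tests?", log)
    failed = re.search(r"\[\s*FAILED\s*\]\s*(\d+) tests?", log)

    total = int(ran.group(1))
    n_passed = int(passed.group(1)) if passed else 0
    n_failed = int(failed.group(1)) if failed else 0
    return {
        "total": total,
        "suites": int(ran.group(2)),
        "passed": n_passed,
        "failed": n_failed,
        "minutes": round(int(ran.group(3)) / 60000, 1),
        "all_passed": n_failed == 0 and n_passed == total,
    }


# Summary lines copied from the log pasted in this thread.
SAMPLE = """[==========] 48206 tests from 13 test suites ran. (2096949 ms total)
[  PASSED  ] 48206 tests.
"""

summary = summarize_gtest(SAMPLE)
print(summary)
```

For this log the helper reports 48206 of 48206 tests passing across 13 suites, with a wall time of roughly 35 minutes.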