ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

HOTFIX: updating aquavanjaram942X libraries #1391

Closed babakpst closed 6 months ago

babakpst commented 6 months ago

merging library tuning from develop

TorreZuk commented 6 months ago

@babakpst @amcamd It looks like Tensile went ahead with all commits, not just the fixes, so there is a mismatch. If you have to bring all of rocBLAS forward to make this work, we just need to make the PR and see whether anything needs to be removed, other than the versions that need to be backed out after bringing it forward. The rocBLAS dev-to-staging promotion passed, so that is a good sign, but other components are the risk, as we had a solver break created and fixed just this week.

babakpst commented 6 months ago

We had to merge all Tensile commits from develop, but for rocBLAS, I only merged the library. The rocBLAS tag in this branch now matches the Tensile tag. Is there anything else that needs to change?

TorreZuk commented 6 months ago

Well, the compilation seems to be failing in Tensile. I am not familiar with the errors. Everything has passed on develop, which is why I assumed there is now an inconsistency in this mixture. We can discuss on Monday.

nakajee commented 6 months ago

It looks like the following commit is not merged. https://github.com/ROCm/rocBLAS/commit/4ed12a268b492c8c008dfaeba8fd7131601536bd

Not sure if this is the only missing commit, but missing it at least causes a rocBLAS build error with the latest tensile_tag.

nakajee commented 6 months ago

I am not sure of the requirement, but don't we need to merge the changes in the aquavanjaram942 folder, which is used for 942X? For example, the following commit is listed, but its F8 change in the aquavanjaram942 folder was not merged in: aquavanjaram942 tuning for BBS+HHS+F8 NT (#2222)

nakajee commented 6 months ago

Also this one: aquavanjaram942X re-tuning for BBS+HHS NN/NT/TT, F8 NT (#2298)

Another F8 change that is not listed: aquavanjaram942 tuning for BBS+HHS+F8 NN+TN (#2219)

nakajee commented 6 months ago

There are some differences in the TF32 logic files in the aquavanjaram942 folder as well.

nakajee commented 6 months ago

tensile_tag should be updated to this commit: https://github.com/ROCm/Tensile/commit/d0314ce13203028e017ee0712295916be1c9a44a

babakpst commented 6 months ago

@nakajee we only need f16 and f32 for 942X for this PR.

babakpst commented 6 months ago

I updated the tag.

babakpst commented 6 months ago

I merged the other conflicting fix.