ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

[Feature]: support for gfx1103 #1922

Open NeoChen1024 opened 3 months ago

NeoChen1024 commented 3 months ago

Suggestion Description

Will there be support for gfx1103 any time soon?

Operating System

Arch Linux

GPU

Radeon 780M

ROCm Component

rocBLAS and Tensile

cgmb commented 1 month ago

I think Fedora is carrying patches that enable rocBLAS and Tensile on gfx1103. However, it's not officially supported by AMD and I'm not sure how well it works.

cgmb commented 1 month ago

Tensile Patch: https://src.fedoraproject.org/rpms/python-tensile/blob/6f308b0956b7736ae874b07f8ebc9f404fa2fae5/f/0001-enable-gfx1103-for-Tensile.patch

rocBLAS Patch: https://src.fedoraproject.org/rpms/rocblas/blob/74df24057a4579f507a50431aaa96ae7484d1567/f/0001-add-gfx1103-support-for-rocBLAS.patch

lamikr commented 1 month ago

rocm_sdk_builder now also has initial support for gfx1103. I have tested it on a Framework 16 laptop that has both a gfx1102 and a gfx1103 GPU.

https://github.com/lamikr/rocm_sdk_builder

The project also carries rocBLAS patches that add support for some other GPUs.

lamikr commented 1 month ago

On rocm_sdk_builder we are currently working on tuning to improve the logic, but I am running into a problem, at least when using Tensile from the rocm-6.1.2 release.

If I run the following command (the same happens with example_vega10_tuning.yaml):

    Tensile/Tensile/bin/Tensile example_gfx1035_tuning.yaml . > tuning1.out 2>&1

I get the following error:

terminate called after throwing an instance of 'std::invalid_argument'
what(): stoi

So far I have traced the exceptions to ResultFileReporter.cpp, and I can get rid of them by commenting out these three stoi conversions:

            else if(key == ResultKey::GfxFrequency)
            {
                //m_currGfxClock = static_cast<uint16_t>(std::stoi(valueStr));
                m_currGfxClock = 0;
            }
            else if(key == ResultKey::Power)
            {
                //m_currPower = static_cast<uint16_t>(std::stoi(valueStr));
                m_currPower = 0;
            }
            else if(key == ResultKey::TemperatureHot)
            {
                //m_currTemperatureHot = static_cast<uint16_t>(std::stoi(valueStr));
                m_currTemperatureHot = 0;
            }

Is there an easy way to print valueStr to stdout or stderr from this code, which is run on the GPU?

I searched and found that a similar-looking stoi error was reported in a comment on pull request https://github.com/ROCm/Tensile/pull/1888
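
On the question of printing valueStr: std::stoi and the std::invalid_argument it throws are ordinary host-side C++, so writing to std::cerr from that branch should work, assuming ResultFileReporter.cpp is compiled as regular host code in the Tensile client. Below is a minimal sketch of a guarded conversion that logs the offending string instead of terminating. The helper name parseUint16OrZero and its parameters are hypothetical and not part of Tensile; falling back to 0 simply mirrors the workaround quoted above.

```cpp
#include <cstdint>
#include <iostream>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Tensile): converts valueStr to uint16_t.
// On failure it prints the key name and the raw string to stderr and
// returns 0, mirroring the "= 0" workaround from the comment above.
inline uint16_t parseUint16OrZero(const std::string& keyName, const std::string& valueStr)
{
    try
    {
        // std::stoi throws std::invalid_argument when no conversion is
        // possible (for example an empty or non-numeric string) and
        // std::out_of_range when the value does not fit in an int.
        const int value = std::stoi(valueStr);
        if(value < 0 || value > std::numeric_limits<uint16_t>::max())
            throw std::out_of_range("value does not fit in uint16_t");
        return static_cast<uint16_t>(value);
    }
    catch(const std::invalid_argument&)
    {
        std::cerr << "ResultFileReporter: cannot parse " << keyName << " value '"
                  << valueStr << "' as an integer" << std::endl;
    }
    catch(const std::out_of_range&)
    {
        std::cerr << "ResultFileReporter: " << keyName << " value '" << valueStr
                  << "' is out of range" << std::endl;
    }
    return 0;
}
```

With a helper like this, each branch could become something along the lines of m_currGfxClock = parseUint16OrZero("GfxFrequency", valueStr); and the stderr output (captured by the 2>&1 redirect into tuning1.out) would show exactly which string stoi is rejecting.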