xuhuisheng / rocm-build

build scripts for ROCm
Apache License 2.0
181 stars 35 forks

navi14 installing tips #18

Closed thefrol closed 2 years ago

thefrol commented 2 years ago

Finally compiled all the relevant scripts for NAVI14 successfully! Thank you.

For MIOpen I needed some additional packages: libsqlite3-dev libboost-all-dev texlive-latex-recommended libbz2-dev half libghc-half-dev

So I am wondering whether rocminfo and clinfo should show my 5500 XT as an agent. Right now I can't see my GPU there. Maybe I should install drivers first or something?

Thank you for your work, and please explain what I should see in rocminfo and clinfo.

xuhuisheng commented 2 years ago

rocminfo should show something like gfx1012. Here is my gfx803 rocminfo output. In your case the Name field should be gfx1012 and the Marketing Name may be RX 5500; I am not sure, because I don't have an RX 5500 card.

*******                  
Agent 3                  
*******                  
  Name:                    gfx803                             
  Uuid:                    GPU-XX                             
  Marketing Name:          Radeon RX 580 Series               
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          4096(0x1000)                       
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB
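
To check this without reading the whole dump by eye, here is a minimal Python sketch that pulls the Name of every GPU agent out of rocminfo text. It assumes the "Agent N" header and "Key: value" layout shown in the excerpt above; the function names and the SAMPLE string are made up for illustration, and real output may contain more sections than this handles.

```python
# Sketch only: assumes rocminfo prints an "Agent N" header followed by
# "Key: value" lines, as in the excerpt above.
SAMPLE = """
*******
Agent 3
*******
  Name:                    gfx803
  Marketing Name:          Radeon RX 580 Series
  Device Type:             GPU
"""


def parse_agents(rocminfo_text):
    """Split rocminfo output into one dict of fields per agent."""
    agents = []
    current = None
    for line in rocminfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Agent ") and stripped[6:].strip().isdigit():
            current = {}
            agents.append(current)
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            # Keep the first occurrence so that a later nested "Name:" line
            # does not overwrite the agent's own Name.
            current.setdefault(key.strip(), value.strip())
    return agents


def gpu_targets(rocminfo_text):
    """Return the Name field of every agent whose Device Type is GPU."""
    return [a.get("Name") for a in parse_agents(rocminfo_text)
            if a.get("Device Type") == "GPU"]


print(gpu_targets(SAMPLE))  # prints ['gfx803']
```

On a working NAVI14 setup the list would be expected to contain gfx1012; an empty list means no GPU agent was registered.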

You may run into the PCIe atomics problem. According to the ROCm team, Navi GPUs need PCIe atomics support from both the CPU and the motherboard. https://github.com/RadeonOpenCompute/ROCm#supported-cpus

You can check the kernel log with dmesg | grep kfd. The message looks like:

kfd: skipped device 1002:7300, PCI rejects atomics
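
If you want to check this from a script, here is a rough Python sketch that scans saved dmesg text for that message and extracts the PCI vendor:device IDs. The regular expression is only an assumption based on the two log lines quoted in this thread; other kernel versions may word the message differently. The names KFD_SKIP and rejected_devices are made up for illustration.

```python
import re

# Pattern inferred from the log lines in this thread; treat it as an assumption.
KFD_SKIP = re.compile(
    r"kfd.*skipped device ([0-9a-fA-F]{4}):([0-9a-fA-F]{4}), PCI rejects atomics"
)


def rejected_devices(dmesg_text):
    """Return (vendor_id, device_id) for each device kfd skipped over atomics."""
    found = []
    for line in dmesg_text.splitlines():
        match = KFD_SKIP.search(line)
        if match:
            found.append(match.groups())
    return found


log = """kfd: skipped device 1002:7300, PCI rejects atomics
[ 2.694307] kfd kfd: amdgpu: skipped device 1002:7340, PCI rejects atomics"""
print(rejected_devices(log))
```

An empty result only means this particular message was not found; it does not prove that atomics work.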

If it is the atomics support issue, the solution is to check whether your CPU and motherboard support PCIe atomics operations. For example, the CPU must be Intel v4 (Haswell) or newer. If your motherboard has more than one PCIe slot, you can move the GPU to another slot and try again, because on most motherboards only one PCIe slot supports atomics operations. And don't use a PCIe bridge/switch; they cannot support atomics operations.

If your CPU is older than Intel v4 (Haswell), I am afraid you will have to upgrade your hardware.

UPDATE: I have to clarify that ROCm most likely cannot support Navi right now. I suggest we just wait for official support on navi21, and then use the navi21 codebase to try to compile for the navi10 target. And yes, navi21 needs PCIe atomics, too.

But I still don't have a Navi card to test with.
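
One way to look at the hardware side is the AtomicOpsCap line that recent pciutils versions print in verbose lspci output (for example sudo lspci -vvv). The Python sketch below parses such a line into flags. The exact line format is an assumption about lspci, the sample line is hypothetical rather than taken from this thread, and the thread does not say which of these flags ROCm actually requires.

```python
import re


def atomic_caps(lspci_text):
    """Return {capability: bool} from the first AtomicOpsCap line, else None."""
    for line in lspci_text.splitlines():
        if "AtomicOpsCap:" in line:
            flags = line.split("AtomicOpsCap:", 1)[1]
            # Each flag is printed as a name followed by '+' (set) or '-' (clear).
            return {name: sign == "+"
                    for name, sign in re.findall(r"(\w+)([+-])", flags)}
    return None


# Hypothetical sample line, not real output from this thread.
sample = "\t\t AtomicOpsCap: Routing- 32bit+ 64bit+ 128bitCAS-"
print(atomic_caps(sample))
```

Running it against the output for the slot the GPU sits in, and again after moving the card, would show whether the slot makes a difference.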

thefrol commented 2 years ago

[ 2.694307] kfd kfd: amdgpu: skipped device 1002:7340, PCI rejects atomics

Well, a big pity ;(

I was hoping older generations wouldn't need atomics, as mentioned here: "Beginning with ROCm 1.8, GFX9 GPUs (such as Vega 10) no longer require PCIe atomics. We have similarly opened up more options for number of PCIe lanes. GFX9 GPUs can now be run on CPUs without PCIe atomics and on older PCIe generations, such as PCIe 2.0. This is not supported on GPUs below GFX9, e.g. GFX8 cards in the Fiji and Polaris families."

Well, great thanks for your work and support <3

xuhuisheng commented 2 years ago

The truth is that only gfx9 doesn't require the PCIe atomics feature. (gfx7 is too old.) And gfx9 is the only truly officially supported series: gfx900, gfx906, gfx908, gfx90a, a.k.a. Vega 56/64, Radeon VII, MI100, MI200.

Older cards like gfx803 and newer cards like gfx101x and gfx103x all need the PCIe atomics feature. Right now, I don't know what the atomics feature actually affects. I would appreciate it if someone could tell me how to make cards like gfx803 ignore the atomics requirement. :D
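
To sum up the rule as I understand it, here is a tiny Python sketch. It only encodes what is said in this thread (gfx9 does not need atomics; gfx8, gfx101x and gfx103x do) and returns None for anything else; the function name is made up, and this is not an official compatibility table.

```python
def needs_pcie_atomics(target):
    """Rule of thumb from this thread only, not an official ROCm table."""
    t = target.lower()
    if t.startswith("gfx9"):
        return False  # gfx900, gfx906, gfx908, gfx90a
    if t.startswith("gfx8") or t.startswith("gfx10"):
        return True   # gfx803, gfx101x, gfx103x
    return None       # not covered in this thread (e.g. gfx7)


for name in ("gfx803", "gfx906", "gfx1012"):
    print(name, needs_pcie_atomics(name))
```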

thefrol commented 2 years ago

Well, looks like I need to find some atomics. Thank you!

xuhuisheng commented 2 years ago

It has been 2 months since the last post, so I will close this issue. Please reopen it if there are any updates.