Closed by zjin-lcf 2 years ago
Could you show me the output of lspci -n | grep 1002?
I suspect the problem is somehow related to the fact that we incorrectly reported Renoir APUs as gfx902 in the Thunk for quite a while (https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/6745ced5dd3ed850c2a0f449fbb839f1dfc4eeb3). However, if you're running an old Thunk, I would imagine that rocminfo would report the wrong thing (since it goes through ROCr and the Thunk to get this information), while rocm_agent_enumerator would report the right thing (since it checks the HSA topology directly).
What kernel version, Thunk version, and ROCr version are you running?
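For anyone following along, here is a rough Python sketch of what "checking the HSA topology directly" could look like. It is not the actual rocm_agent_enumerator code. It assumes the KFD driver exposes each node under /sys/class/kfd/kfd/topology/nodes/<N>/properties with a gfx_target_version field, and that the field packs major, minor, and stepping as decimal digit pairs (so 90012 would decode to gfx90c). The function names are made up for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the KFD topology; may differ on your system.
KFD_NODES = Path("/sys/class/kfd/kfd/topology/nodes")


def decode_gfx_target_version(version: int) -> str:
    """Decode a packed version (assumed layout: MMmmss in decimal) to a gfx name."""
    major = (version // 10000) % 100
    minor = (version // 100) % 100
    stepping = version % 100
    # Major is printed in decimal; minor and stepping in lowercase hex.
    return f"gfx{major}{minor:x}{stepping:x}"


def read_gfx_targets(nodes_dir: Path = KFD_NODES) -> list:
    """Collect gfx names from every topology node that reports a nonzero version."""
    targets = []
    if not nodes_dir.is_dir():
        return targets  # no KFD topology available on this machine
    for node in sorted(nodes_dir.iterdir()):
        try:
            text = (node / "properties").read_text()
        except OSError:
            continue  # unreadable or missing properties file
        for line in text.splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[0] == "gfx_target_version":
                version = int(parts[1])
                if version:  # CPU-only nodes are assumed to report 0
                    targets.append(decode_gfx_target_version(version))
    return targets


if __name__ == "__main__":
    print(read_gfx_targets())
```

If the list printed here disagrees with what rocminfo shows, that would point at the ROCr/Thunk path rather than the kernel topology, which is why the Thunk and kernel versions matter.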
Since rocminfo reports the right thing, do you think that rocm_agent_enumerator needs to be updated?
Thanks.
I'm still trying to debug this issue, hopefully with your help. :)
Could you please show me the output of lspci -n | grep 1002? What Linux kernel version, Thunk version, and ROCr version are you running?
Sorry. I will close the issue. It is not reproducible. Thanks.
I installed rocminfo from source. However, the results from rocminfo and the Python script rocm_agent_enumerator are not consistent for a device:
rocm_agent_enumerator shows the agent is gfx902.
rocminfo shows the agent is gfx90c.
Thank you for your answer.