Open cgmb opened 3 days ago
There was a similar problem on Rembrandt (gfx1035), which I addressed with a patch. It doesn't seem to have helped with gfx1033, though.
diff --git a/src/image/image_manager_kv.cpp b/src/image/image_manager_kv.cpp
index 5d3750e..8c3f2ec 100755
--- a/src/image/image_manager_kv.cpp
+++ b/src/image/image_manager_kv.cpp
@@ -97,6 +97,11 @@ hsa_status_t ImageManagerKv::Initialize(hsa_agent_t agent_handle) {
agent_, static_cast<hsa_agent_info_t>(HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID), &family_type_);
assert(status == HSA_STATUS_SUCCESS);
+ uint32_t chip_revision;
+ status = HSA::hsa_agent_get_info(
+ agent_, static_cast<hsa_agent_info_t>(HSA_AMD_AGENT_INFO_ASIC_REVISION), &chip_revision);
+ assert(status == HSA_STATUS_SUCCESS);
+
HsaGpuTileConfig tileConfig = {0};
unsigned int tc[40];
unsigned int mtc[40];
@@ -125,7 +130,7 @@ hsa_status_t ImageManagerKv::Initialize(hsa_agent_t agent_handle) {
}
addr_create_input.chipFamily = family_type_;
- addr_create_input.chipRevision = 0; // TODO(bwicakso): find how to get this.
+ addr_create_input.chipRevision = chip_revision;
ADDR_CREATE_FLAGS create_flags = {};
create_flags.value = 0;
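One side note on the patch above: `chip_revision` is declared without an initializer, and the query result is checked only with `assert`, which compiles away when `NDEBUG` is defined. Below is a minimal sketch of a more defensive pattern. It deliberately does not use the real HSA headers: `Status`, `QueryRevisionFn`, and `get_chip_revision` are hypothetical stand-ins for `hsa_status_t` and the `hsa_agent_get_info` call, and the fallback of 0 simply mirrors the value the pre-patch code hard-coded.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for hsa_status_t; not the real HSA type.
enum class Status { Success, Error };

// Hypothetical stand-in for a call shaped like
// hsa_agent_get_info(agent, attribute, &out_value).
using QueryRevisionFn = std::function<Status(uint32_t*)>;

// Returns the queried revision, or `fallback` when the query fails.
// The failure path is a normal branch, so it behaves the same in
// release builds where assert() is compiled out.
uint32_t get_chip_revision(const QueryRevisionFn& query, uint32_t fallback = 0) {
  assert(query && "query callback must be set");
  uint32_t revision = fallback;  // initialized, so a failed query never yields garbage
  if (query(&revision) != Status::Success) {
    return fallback;  // ignore whatever the failed query may have written
  }
  return revision;
}
```

Whether falling back to 0 is acceptable is a separate question, since 0 is the value that was in use before the patch; returning an error status from `Initialize` may be the better choice in the real code.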
Problem Description
There seems to be a failing assertion in libhsa-runtime64 when running on the Valve Steam Deck (gfx1033). This behaviour was observed in ROCm 5.7.1 and ROCm 6.1.2, but was not seen in ROCm 5.2.3.
This assertion failure occurs when running the test suites for all ROCm libraries. For example, when running hipsolver-test:
That corresponds to the "Unknown chip revision" assertion in the code:
https://ci.rocm.debian.net/packages/h/hipsolver/unstable/amd64+gfx1033/43073/
Operating System
Debian 13
CPU
AMD Custom APU 0405
GPU
AMD Custom GPU 0405
ROCm Version
ROCm 6.1.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
Additional Information
No response