gcongiu closed 1 month ago
I verified this with the rocm-5.5.0 stable release, and the problem is present there as well.
@ammarwa any update on this?
Added reproducer issue-113.tar.gz
@gcongiu Apologies for the lack of response. Can you please check if your issue still exists with the latest ROCm 6.2? If so, we will further investigate the issue. Thanks!
Hi @gcongiu, I have not been able to reproduce this on ROCm 6.2 using 2 MI210s and the reproducer code you provided. Can you try upgrading to ROCm 6.2 and check if the issue persists?
@gcongiu I'm going to close this issue due to inactivity. If the issue is still present, feel free to re-open the ticket and we can look into it further.
Running the intercept_multi_thread_monitoring test in PAPI (located in papi/src/components/rocm/tests/intercept_multi_thread_monitoring) deadlocks. The test runs a matrix-matrix multiplication kernel and uses multiple threads, each driving and monitoring its own dedicated GPU. The backtrace from rocgdb follows:

The version of ROCm used to reproduce this problem is 5.5.0 RC5. The test was run on two MI210s.
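For readers unfamiliar with the test, the thread layout described above (one host thread per GPU, each launching its own matrix multiplication) can be sketched roughly as follows. This is not the PAPI test source. The GPU kernel is replaced by a CPU stand-in, the PAPI and HIP calls are only marked by comments, and `worker_arg_t`, `worker`, `run_per_device_threads`, `N`, and `MAX_DEVICES` are hypothetical names chosen for illustration.

```c
/* Sketch only: one host thread per device, each running its own matrix
 * multiplication. GPU and PAPI calls are replaced by CPU stand-ins. */
#include <assert.h>
#include <pthread.h>

#define N 4            /* hypothetical matrix dimension */
#define MAX_DEVICES 8  /* hypothetical upper bound on devices */

typedef struct {
    int device_id;     /* device this thread would drive and monitor */
    double checksum;   /* sum of all elements of C = A * B */
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *arg = p;
    double a[N][N], b[N][N], c[N][N];

    /* The real test would select the device and start monitoring here. */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            a[i][j] = 1.0;
            b[i][j] = (double)(arg->device_id + 1);
            c[i][j] = 0.0;
        }
    }

    /* CPU stand-in for the matrix multiplication kernel. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];

    arg->checksum = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            arg->checksum += c[i][j];

    /* The real test would stop monitoring and read its counters here. */
    return NULL;
}

/* Starts one thread per device, waits for all of them, and stores each
 * thread's checksum in checksums[device]. Returns 0 on success, -1 on
 * a thread creation or join failure. */
int run_per_device_threads(int num_devices, double *checksums)
{
    pthread_t threads[MAX_DEVICES];
    worker_arg_t args[MAX_DEVICES];
    int created = 0, status = 0;

    assert(num_devices > 0 && num_devices <= MAX_DEVICES);

    for (int d = 0; d < num_devices; d++) {
        args[d].device_id = d;
        args[d].checksum = 0.0;
        if (pthread_create(&threads[d], NULL, worker, &args[d]) != 0) {
            status = -1;
            break;
        }
        created++;
    }

    /* Join every thread that was started, even after a failure. */
    for (int d = 0; d < created; d++) {
        if (pthread_join(threads[d], NULL) != 0)
            status = -1;
        checksums[d] = args[d].checksum;
    }
    return status;
}
```

Calling `run_per_device_threads(2, sums)` mirrors the two-MI210 setup. In the actual test, a deadlock would show up as one of the joins never returning.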