Closed hxxiaoming closed 5 months ago
Thank you @hxxiaoming for reporting this issue. I will create an internal PR with a fix and post an update when it becomes available in the develop branch.
Thanks iassiour!
Hi @hxxiaoming, as it turns out, dispatching multiple packets is not actually fully supported, so I have removed this option from dispatchGenericAqlPacket (i.e. size is now always equal to 1).
The change is in the develop branch: https://github.com/ROCm/clr/commit/d7f352dbed62c61390f8ddc363abb641964ad60c
I am closing this issue. Thank you for raising it.
while ((index - hsa_queue_load_read_index_scacquire(gpuqueue)) >= sw_queue_size) { amd::Os::yield(); }
Here, the loop only checks whether the ring buffer has at least one empty slot; it does not determine how many empty slots are available. If more than one packet is to be written, packets that have not yet been processed may be overwritten.
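
To illustrate the concern, here is a minimal standalone sketch of a wait condition that accounts for the number of packets being written. It is not the code in ROCm/clr (the actual fix was to restrict the size to 1); `mustWait` and its parameters are hypothetical names, and the indices are modeled as plain monotonically increasing 64-bit counters rather than values read from an HSA queue.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ROCm/clr): returns true while the producer
// must keep waiting before writing `count` packets starting at slot `index`.
// `read_index` is the consumer's current read index and `queue_size` is the
// number of slots in the ring buffer. Assumes count >= 1 and index >= read_index.
bool mustWait(uint64_t index, uint64_t read_index, uint64_t queue_size, uint64_t count) {
  // The last packet lands at index + count - 1. It is safe to write only if
  // that slot is fewer than queue_size entries ahead of the read index.
  return (index + count - 1 - read_index) >= queue_size;
}

// The single-slot check from the issue, expressed in the same form.
bool mustWaitSingle(uint64_t index, uint64_t read_index, uint64_t queue_size) {
  return (index - read_index) >= queue_size;
}
```

With a 4-slot queue, a read index of 0, and a write index of 3, the single-slot check lets the producer proceed, but writing two packets would wrap around into slot 0, which has not been consumed yet; the count-aware check keeps waiting in that case.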