Closed hcman2 closed 1 week ago
This is just an example.
CI: gfx94x Ubuntu passed.
---- generated xml file: /meng/hcman/hipBLASLt/tensilelite/python_tests.xml ----
========== 23 passed, 83 skipped, 372 warnings in 2019.89s (0:33:39) ===========
Local gfx90a passed.
Originally, we always waited for all of the PLR to complete in the 1st iteration. This optimization splits that single wait into 2 waitcnt instructions, which relieves the pressure when the kernel is blocked by the 1st waitcnt.
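
To illustrate why splitting the wait can help, here is a rough toy model in Python. It is not the tensilelite code generator and does not reflect real hardware timing: the function names, the cycle counts, and the assumption that the first math block only needs the first half of the reads are all hypothetical, chosen just to show the idea.

```python
# Toy timing model (hypothetical, for illustration only).
# Reads are issued in order and complete in order after a fixed latency.

def completion_times(num_reads, issue_interval, latency):
    """Cycle at which each read completes."""
    return [i * issue_interval + latency for i in range(num_reads)]


def waitcnt(done_times, max_outstanding, now):
    """Model of a waitcnt: block until at most `max_outstanding` reads
    are still in flight, then return the resulting cycle."""
    must_be_done = len(done_times) - max_outstanding
    if must_be_done <= 0:
        return now
    return max(now, done_times[must_be_done - 1])


def single_wait_finish(num_reads=8, issue_interval=4, latency=64, math_cycles=16):
    """Old behaviour: one wait for every read, then both math blocks."""
    done = completion_times(num_reads, issue_interval, latency)
    now = num_reads * issue_interval      # all reads have been issued
    now = waitcnt(done, 0, now)           # wait for everything
    return now + 2 * math_cycles


def split_wait_finish(num_reads=8, issue_interval=4, latency=64, math_cycles=16):
    """New behaviour: wait for the first half, compute, then wait for the rest.
    Assumes the first math block only consumes the first half of the reads."""
    done = completion_times(num_reads, issue_interval, latency)
    now = num_reads * issue_interval
    half = num_reads // 2
    now = waitcnt(done, num_reads - half, now)  # 1st waitcnt: first half only
    now += math_cycles                          # math overlaps remaining reads
    now = waitcnt(done, 0, now)                 # 2nd waitcnt: everything else
    return now + math_cycles


if __name__ == "__main__":
    print("single waitcnt:", single_wait_finish())
    print("split waitcnt: ", split_wait_finish())
```

In this model the split version never finishes later than the single wait, because the first math block runs while the remaining reads are still in flight. The actual gain on a real kernel depends on the instruction schedule and memory latency.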