CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

[L0 Performance] zeCommandQueueSynchronize causes 10x slowdown on certain unit tests #641

Status: Closed (pvelesko closed this issue 5 months ago)

pvelesko commented 11 months ago

Unit_hipMalloc_LoopRegressionAllocFreeCycles takes about 200 s when zeCommandQueueSynchronize is used, versus about 20 s otherwise.

The relevant change is in CHIPQueueLevel0::finish():

diff --git a/src/backend/Level0/CHIPBackendLevel0.cc b/src/backend/Level0/CHIPBackendLevel0.cc
index 7ae965b0..5d5fb902 100644
--- a/src/backend/Level0/CHIPBackendLevel0.cc
+++ b/src/backend/Level0/CHIPBackendLevel0.cc
@@ -1486,16 +1486,16 @@ void CHIPQueueLevel0::finish() {
   LOCK(Backend->DubiousLockLevel0)
 #endif

-  // if (static_cast<CHIPBackendLevel0 *>(Backend)->getUseImmCmdLists()) {
+  if (static_cast<CHIPBackendLevel0 *>(Backend)->getUseImmCmdLists()) {
     auto Event = getLastEvent();
     auto EventLZ = std::static_pointer_cast<CHIPEventLevel0>(Event);
     if (EventLZ) {
       auto EventHandle = EventLZ->peek();
       zeEventHostSynchronize(EventHandle, UINT64_MAX);
     }
-  // } else {
-  //   zeCommandQueueSynchronize(ZeCmdQ_, UINT64_MAX);
-  // }
+  } else {
+    zeCommandQueueSynchronize(ZeCmdQ_, UINT64_MAX);
+  }

   return;
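
To make the branch in the diff easier to follow, here is a minimal, self-contained sketch of the control flow with the `+` lines applied: with immediate command lists the queue waits on its last event, otherwise it synchronizes the whole command queue. `FakeDriver`, `SketchQueue`, and `demoFinishPaths` are hypothetical stand-ins invented for illustration. They only record which Level Zero entry point would be called and are not chipStar or Level Zero code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the Level Zero driver: it records the name of
// the call that the real code would make instead of touching a device.
struct FakeDriver {
  std::vector<std::string> Calls;
  void eventHostSynchronize(uint64_t /*Timeout*/) {
    Calls.push_back("zeEventHostSynchronize");
  }
  void commandQueueSynchronize(uint64_t /*Timeout*/) {
    Calls.push_back("zeCommandQueueSynchronize");
  }
};

// Hypothetical queue mirroring the if/else shape of the diff above.
struct SketchQueue {
  bool UseImmCmdLists; // stands in for getUseImmCmdLists()
  bool HasLastEvent;   // stands in for a non-null getLastEvent()
  FakeDriver &Driver;

  void finish() {
    if (UseImmCmdLists) {
      // Immediate command lists: wait only on the last recorded event.
      if (HasLastEvent)
        Driver.eventHostSynchronize(UINT64_MAX);
    } else {
      // Regular command lists: synchronize the whole command queue.
      Driver.commandQueueSynchronize(UINT64_MAX);
    }
  }
};

// Exercises the three cases and checks which stand-in call was recorded.
inline void demoFinishPaths() {
  FakeDriver D;

  SketchQueue Imm{true, true, D};
  Imm.finish();
  assert(D.Calls.size() == 1 && D.Calls.back() == "zeEventHostSynchronize");

  SketchQueue Reg{false, true, D};
  Reg.finish();
  assert(D.Calls.size() == 2 && D.Calls.back() == "zeCommandQueueSynchronize");

  // Immediate mode with no last event: nothing to wait on.
  SketchQueue NoEvent{true, false, D};
  NoEvent.finish();
  assert(D.Calls.size() == 2);
}
```

Calling `demoFinishPaths()` from a test or `main` checks that each configuration takes the expected path. The sketch says nothing about timing; the 200 s vs 20 s figures come from the real backend only.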
make check results:

dgpu_opencl_make_check_result.txt: PASS
igpu_opencl_make_check_result.txt: FAIL
    892 - TestAssert (Failed)
    893 - TestAssertFail (Failed)
    899 - abort (Failed)
    937 - PrintfSimple (Failed)
    938 - PrintfNOP (Failed)
    939 - PrintfDynamic (Failed)
igpu_level0_reg_make_check_result.txt: FAIL
    890 - TestAssert (Failed)
    891 - TestAssertFail (Failed)
    897 - abort (Failed)
    938 - PrintfSimple (Failed)
    939 - PrintfNOP (Failed)
    940 - PrintfDynamic (Failed)
dgpu_level0_reg_make_check_result.txt: FAIL
    885 - TestAssert (Failed)
    886 - TestAssertFail (Failed)
    892 - abort (Failed)
    929 - PrintfSimple (Failed)
    930 - PrintfNOP (Failed)
    931 - PrintfDynamic (Failed)
dgpu_level0_imm_make_check_result.txt: PASS

pvelesko commented 5 months ago

Fixed in https://github.com/CHIP-SPV/chipStar/pull/817