NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Use a single elect sync ite for all transactions #3314

Closed: zasdfgbnm closed this 3 weeks ago

zasdfgbnm commented 3 weeks ago

Before:

if (elect-sync) {
  arrive-expect-tx1;
  tma1;
}
if (elect-sync) {
  arrive-expect-tx2;
  tma2;
}

After:

if (elect-sync) {
  arrive-expect-tx1;
  tma1;
  arrive-expect-tx2;
  tma2;
}
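
The before/after shapes above can be modeled as merging adjacent blocks that share a predicate. The following is a toy sketch only, not nvFuser's lowering code: `IfBlock` and `merge_adjacent` are hypothetical names, and it assumes the two blocks are directly adjacent and guarded by the same predicate.

```python
from dataclasses import dataclass, field


@dataclass
class IfBlock:
    """Toy stand-in for `if (predicate) { body... }` (hypothetical type)."""
    predicate: str
    body: list = field(default_factory=list)


def merge_adjacent(blocks):
    """Fuse consecutive blocks whose predicates are identical.

    Returns a new list and leaves the input blocks unmodified.
    """
    merged = []
    for block in blocks:
        if merged and merged[-1].predicate == block.predicate:
            # Same guard as the previous block: append the statements to it.
            merged[-1].body.extend(block.body)
        else:
            # Different guard (or first block): start a new block with a copied body.
            merged.append(IfBlock(block.predicate, list(block.body)))
    return merged


before = [
    IfBlock("elect-sync", ["arrive-expect-tx1", "tma1"]),
    IfBlock("elect-sync", ["arrive-expect-tx2", "tma2"]),
]
after = merge_adjacent(before)

for block in after:
    print(f"if ({block.predicate}) {{")
    for stmt in block.body:
        print(f"  {stmt};")
    print("}")
```

Running it prints a single `if (elect-sync)` block containing all four statements in their original order, matching the "After" shape.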

Perf:

 Time (%)  Total Time (ns)  Instances  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)                                                  Name
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------------------------------------------------------------------------------------
     36.0           151775          1  151775.0  151775.0    151775    151775          0.0  <unnamed>::nvfuser_none_f0_c0_r0_g0(<unnamed>::Tensor<<unnamed>::__half, (int)3, (int)3>, <unnamed>…
     20.7            87135          1   87135.0   87135.0     87135     87135          0.0  nvjet_hsh_256x128_64x4_1x2_h_bz_coopA_NTT

nvFuser/cuBLAS = 57.4% (cuBLAS kernel time divided by nvFuser kernel time: 87135 ns / 151775 ns).
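
For reference, the ratio can be recomputed from the two total times in the table. This small sketch assumes the percentage is cuBLAS time over nvFuser time, which is the reading that reproduces 57.4%; the variable names are illustrative.

```python
# Total kernel times (ns) taken from the profiler table above.
nvfuser_ns = 151775  # nvfuser_none_f0_c0_r0_g0
cublas_ns = 87135    # nvjet_hsh_256x128_64x4_1x2_h_bz_coopA_NTT

# Relative performance of nvFuser: a shorter cuBLAS time gives a ratio below 100%.
ratio = cublas_ns / nvfuser_ns
print(f"nvFuser/cuBLAS = {ratio:.1%}")  # prints "nvFuser/cuBLAS = 57.4%"
```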

zasdfgbnm commented 3 weeks ago

!build