Closed by etaf 2 months ago
@alexbaden can you help to check?
@etaf Please verify. Do you also need this change to be backported to the release branch?
@whitneywhtsang Sorry for the late update, and thanks for your work. There is currently no need to backport this to the release branch.
Hi, we found two implementation issues in Triton's `do_bench` when enabling the max-autotune feature in PyTorch.

- `do_bench` should return the time in `ms`, but it currently returns `ns`. https://github.com/intel/intel-xpu-backend-for-triton/blob/1b2f15840e0d70eec50d84c7a0575cb835524def/python/triton/testing.py#L12-L21
- In `do_bench`, if `USE_WALL_TIME` is set, there should be a `synchronize()` between the start and end time records, but currently there is none. https://github.com/intel/intel-xpu-backend-for-triton/blob/1b2f15840e0d70eec50d84c7a0575cb835524def/python/triton/testing.py#L139-L148
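
To make the two points concrete, here is a minimal sketch of what we would expect from the wall-time path. This is not the actual `testing.py` code: `bench_wall_time_ms`, its parameters, and `NS_PER_MS` are hypothetical names used only for illustration, and `synchronize` stands for whatever device synchronization function the backend provides.

```python
import time

NS_PER_MS = 1_000_000  # 1 ms = 1,000,000 ns


def bench_wall_time_ms(fn, synchronize, n_repeat=10):
    """Hypothetical helper: per-call wall-clock times in milliseconds."""
    times_ms = []
    for _ in range(n_repeat):
        # Drain previously queued work so it is not counted in this sample.
        synchronize()
        start_ns = time.perf_counter_ns()
        # On an asynchronous device this may only enqueue the kernel.
        fn()
        # Wait for the device to finish before reading the end timestamp;
        # without this, only the launch overhead would be measured.
        synchronize()
        end_ns = time.perf_counter_ns()
        # Convert nanoseconds to milliseconds before returning.
        times_ms.append((end_ns - start_ns) / NS_PER_MS)
    return times_ms
```

The `synchronize()` after `fn()` addresses the second point, and the division by `NS_PER_MS` addresses the first, so callers that expect milliseconds (such as an autotuner comparing kernel timings) receive values in the unit they assume.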