intel / torch-xpu-ops

Performance: Reduction: Worse host overhead compared with IPEX #970

Open fengyuan14 opened 3 weeks ago

fengyuan14 commented 3 weeks ago

🐛 Describe the bug

CPU time for aten::sum is shown below:

Variant        Name         Self CPU %    Self CPU    CPU total %    CPU total    CPU time avg
override       aten::sum         2.52%    24.765ms          4.07%     39.978ms        60.849us
non-override   aten::sum         4.49%    46.832ms          5.74%     59.905ms        91.180us
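
To put the two rows side by side, here is a minimal sketch that derives the per-call gap from the numbers above. It assumes the columns follow the usual PyTorch profiler table order (Self CPU %, Self CPU, CPU total %, CPU total, CPU time avg) and that "CPU time avg" is CPU total divided by the number of calls. The dictionary keys and the helper name `call_count` are illustrative only.

```python
# Values copied from the two profiler rows in this report.
# Assumed column meaning: self CPU (ms), CPU total (ms), CPU time avg (us).
rows = {
    "override": {"self_ms": 24.765, "total_ms": 39.978, "avg_us": 60.849},
    "non-override": {"self_ms": 46.832, "total_ms": 59.905, "avg_us": 91.180},
}


def call_count(row):
    """Estimate the number of calls as CPU total / CPU time avg (assumption)."""
    return round(row["total_ms"] * 1000.0 / row["avg_us"])


calls = {name: call_count(row) for name, row in rows.items()}

fast, slow = rows["override"], rows["non-override"]
avg_ratio = slow["avg_us"] / fast["avg_us"]      # per-call CPU time ratio
self_ratio = slow["self_ms"] / fast["self_ms"]   # self CPU time ratio
extra_us = slow["avg_us"] - fast["avg_us"]       # extra CPU time per call

print("estimated calls:", calls)
print(f"per-call CPU time ratio: {avg_ratio:.2f}x")
print(f"self CPU time ratio:     {self_ratio:.2f}x")
print(f"extra CPU time per call: {extra_us:.3f} us")
```

Under these assumptions both rows work out to 657 calls, so the comparison is like for like: the non-override path spends about 1.5x the CPU time per aten::sum call (roughly 30 us extra), and about 1.9x the self CPU time.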

Versions

Latest torch-xpu-ops vs IPEX 2.3 implementation.

majing921201 commented 1 week ago

Same root cause; see https://github.com/intel/llvm/issues/15824