intel / torch-xpu-ops

Apache License 2.0

[E2E]TorchBench Bf16 yolov3 fails #414

Open fengyuan14 opened 3 months ago

fengyuan14 commented 3 months ago

🐛 Describe the bug

https://github.com/intel/torch-xpu-ops/actions/runs/9542385293/job/26297127499?pr=407

xpu,yolov3,4,fail_accuracy,533,2,6,4,0,0,2

num_total: 16 num_passed: 15 num_failed: 1 pass_rate: 93.75%
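For anyone triaging similar CI output, here is a minimal sketch of how the result row and summary above could be read programmatically. The column names (`device`, `model`, `batch_size`, `status`) are assumptions inferred from the values in the row, not something this issue confirms, and `parse_result_row` and `pass_rate` are hypothetical helpers rather than part of the benchmark tooling. The remaining numeric fields are kept as-is because their meaning is not stated here.

```python
# Hypothetical helpers for reading a benchmark result row.
# ASSUMPTION: the first four fields are device, model, batch size, and status.
ASSUMED_COLUMNS = ("device", "model", "batch_size", "status")


def parse_result_row(row: str) -> dict:
    """Split a comma-separated result row into named fields plus leftovers."""
    fields = row.strip().split(",")
    parsed = dict(zip(ASSUMED_COLUMNS, fields))
    # Fields whose meaning is not given in the issue are kept uninterpreted.
    parsed["extra"] = fields[len(ASSUMED_COLUMNS):]
    return parsed


def pass_rate(num_passed: int, num_total: int) -> float:
    """Return the pass rate as a percentage."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return 100.0 * num_passed / num_total


row = parse_result_row("xpu,yolov3,4,fail_accuracy,533,2,6,4,0,0,2")
print(row["model"], row["status"])   # yolov3 fail_accuracy
print(f"{pass_rate(15, 16):.2f}%")   # 93.75%
```

The computed rate matches the reported summary: 15 of 16 models passed, and yolov3 is the single accuracy failure.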

Versions

Latest PyTorch main and torch-xpu-ops main

fengyuan14 commented 3 months ago

Skipped the model in pre-CI. Please re-enable it once the bug is fixed.

etaf commented 3 months ago

This bug was introduced by https://github.com/pytorch/pytorch/pull/128269. I'll continue to investigate.

etaf commented 3 months ago

The accuracy issue was introduced by https://github.com/pytorch/pytorch/pull/128269, which triggered a Triton accuracy bug. We'll update Triton to resolve this issue and https://github.com/intel/torch-xpu-ops/issues/408.

chuanqi129 commented 2 months ago

@mengfei25 please help double-check this. If the issue has been fixed, please close it.

etaf commented 2 months ago

Please keep this issue open; we haven't updated Triton yet.