intel / torch-xpu-ops

Apache License 2.0

[E2E] Torchbench detectron2_fasterrcnn_r_101_c4 amp_fp16 inference accuracy failed #728

Open mengfei25 opened 3 months ago

mengfei25 commented 3 months ago

🐛 Describe the bug

torchbench_amp_fp16_inference

WARNING:common:fp64 golden ref were not generated for detectron2_fasterrcnn_r_101_c4. Setting accuracy check to cosine
WARNING:common:current_device=xpu; error:dets should have the same type as scores
W0803 06:43:58.821000 3103334 torch/_dynamo/utils.py:1499] Similarity score=0.8771064281463623
E0803 06:43:58.822000 3103334 torch/_dynamo/utils.py:1450] Accuracy failed for key name pred_classes
E0803 06:43:58.823000 3103334 torch/_dynamo/utils.py:1450] Accuracy failed for key name instances
fail_accuracy
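
Since no fp64 golden reference was generated, the harness falls back to a cosine-similarity comparison between eager and compiled outputs. The sketch below is not the torchbench checker itself, only a minimal illustration of how a similarity score around 0.877 ends up reported as fail_accuracy; the 0.99 threshold and the `cosine_close` helper are assumptions for illustration.

```python
# Minimal sketch of a cosine-similarity accuracy check (assumed 0.99 threshold,
# hypothetical helper; not the actual torch._dynamo.utils implementation).
import torch
import torch.nn.functional as F

def cosine_close(ref: torch.Tensor, res: torch.Tensor, threshold: float = 0.99) -> bool:
    # Flatten both outputs and compare them as single vectors.
    score = F.cosine_similarity(ref.flatten().float(), res.flatten().float(), dim=0)
    print(f"Similarity score={score.item()}")
    return score.item() >= threshold

# A perturbed tensor stands in for the XPU result; a score near 0.877,
# as in the log above, falls below the threshold and fails the check.
ref = torch.randn(1000)
res = ref + 0.5 * torch.randn(1000)
print("accuracy ok" if cosine_close(ref, res) else "fail_accuracy")
```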

loading model: 0it [00:00, ?it/s][W803 06:44:07.346454027 RegisterXPU.cpp:7580] Warning: Aten Op fallback from XPU to CPU happends. This may have performance implications. If need debug the fallback ops please set environment variable PYTORCH_DEBUG_XPU_FALLBACK=1 (function operator())

loading model: 0it [00:58, ?it/s]
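
The warning above points to setting PYTORCH_DEBUG_XPU_FALLBACK=1 to see which ATen ops fall back from XPU to CPU. A minimal way to do that when reproducing is sketched below; the tiny matmul workload is a placeholder, not the actual detectron2_fasterrcnn_r_101_c4 amp_fp16 inference run, and the assumption is that the variable is read when torch initializes, so it is set before the import.

```python
# Sketch: enable fallback debug output before torch is imported (assumption:
# the variable is picked up at initialization; exporting it in the shell
# before launching the benchmark works as well).
import os
os.environ["PYTORCH_DEBUG_XPU_FALLBACK"] = "1"

import torch

# Placeholder workload; the real reproducer is the torchbench amp_fp16
# inference run for detectron2_fasterrcnn_r_101_c4.
if torch.xpu.is_available():
    x = torch.randn(8, 8, device="xpu")
    y = x @ x  # any op that falls back to CPU should now be logged by name
    torch.xpu.synchronize()
```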

Versions

torch-xpu-ops: https://github.com/intel/torch-xpu-ops/commit/1d70431c072db889d9a47ea4956049fe340a426d
pytorch: d224857b3af5c9d5a3c7a48401475c09d90db296
device: pvc 1100
bundle: 0.5.3
driver: 803.61

retonym commented 3 months ago

Low priority, since this model is not included in the Meta PyTorch dashboard.

mengfei25 commented 3 months ago

A100 also fails, due to a detectron2 installation failure.

retonym commented 5 days ago

Several failing models, such as detectron2, are not included in the Meta dashboard. We may want to discuss whether to keep tracking the status of these models.

retonym commented 5 days ago

These models are not included in the Meta dashboard and are not targeted for PT2.6.