[E2E_baseline] Dynamo Benchmark E2E Accuracy test torchbench has some fail_accuracy models

chuanqi129 commented 6 months ago

🐛 Describe the bug

Some torchbench models fail the accuracy check. The detailed model list is in the table below.

precision	mode	model
bfloat16	inference	pytorch_stargan
bfloat16	inference	hf_Whisper
bfloat16	inference	BERT_pytorch
bfloat16	inference	hf_distil_whisper
bfloat16	inference	squeezenet1_1
bfloat16	inference	hf_BigBird
bfloat16	inference	mnasnet1_0
bfloat16	inference	shufflenet_v2_x1_0
bfloat16	inference	hf_Reformer
bfloat16	inference	timm_efficientnet
bfloat16	inference	timm_nfnet
bfloat16	inference	timm_regnet
bfloat16	inference	timm_resnest
bfloat16	training	BERT_pytorch
bfloat16	training	hf_Whisper
bfloat16	training	squeezenet1_1
bfloat16	training	maml_omniglot
bfloat16	training	fastNLP_Bert
bfloat16	training	mnasnet1_0
bfloat16	training	basic_gnn_gin
bfloat16	training	shufflenet_v2_x1_0
bfloat16	training	hf_Reformer
bfloat16	training	timm_efficientnet
bfloat16	training	timm_nfnet
bfloat16	training	timm_regnet
bfloat16	training	timm_resnest
float16	inference	pytorch_stargan
float16	inference	BERT_pytorch
float16	inference	hf_Whisper
float16	inference	moondream
float16	inference	densenet121
float16	inference	hf_distil_whisper
float16	inference	squeezenet1_1
float16	inference	hf_BigBird
float16	inference	resnet18
float16	inference	mnasnet1_0
float16	inference	pyhpc_equation_of_state
float16	inference	mobilenet_v2
float16	inference	shufflenet_v2_x1_0
float16	inference	hf_Reformer
float16	inference	timm_efficientnet
float16	inference	timm_nfnet
float16	inference	timm_regnet
float16	inference	timm_resnest
float16	training	hf_Whisper
float16	training	BERT_pytorch
float16	training	speech_transformer
float16	training	moondream
float16	training	squeezenet1_1
float16	training	mnasnet1_0
float16	training	shufflenet_v2_x1_0
float16	training	hf_Reformer
float16	training	timm_efficientnet
float16	training	timm_nfnet
float16	training	timm_regnet
float16	training	timm_resnest
float32	inference	detectron2_maskrcnn_r_50_fpn
float32	inference	hf_Longformer
float32	inference	detectron2_maskrcnn
float32	inference	detectron2_maskrcnn_r_101_c4
float32	training	resnet50_quantized_qat
float32	training	mobilenet_v2_quantized_qat
float32	training	shufflenet_v2_x1_0
float32	training	timm_resnest
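
To help check for a common issue, the table can be tallied by configuration and by model. The following is a minimal sketch, not part of the test script: the function name `summarize_failures` and the `sample` variable are hypothetical, and the sample holds only a few rows copied from the table above.

```python
from collections import Counter, defaultdict


def summarize_failures(tsv_text):
    """Tally fail_accuracy rows by (precision, mode) and by model."""
    per_config = Counter()
    per_model = defaultdict(set)
    for line in tsv_text.strip().splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0] == "precision":
            continue  # skip the header row and any malformed row
        precision, mode, model = fields
        per_config[(precision, mode)] += 1
        per_model[model].add((precision, mode))
    return per_config, per_model


# A few rows copied from the table above; paste the full table to summarize everything.
sample = """
precision mode model
bfloat16 inference timm_resnest
bfloat16 training timm_resnest
float16 inference timm_resnest
float16 training timm_resnest
float32 training timm_resnest
float32 inference hf_Longformer
"""

per_config, per_model = summarize_failures(sample)

# List models by how many (precision, mode) configurations they fail in.
for model, configs in sorted(per_model.items(), key=lambda kv: -len(kv[1])):
    print(model, len(configs))
```

Models that fail in many configurations (for example, ones that also appear in the float32 rows) may be the more likely candidates for a shared root cause, though that would still need to be confirmed by triage.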

Versions

PyTorch: git clone -b e2e-baseline https://github.com/etaf/pytorch-inductor-xpu pytorch
Test script: inductor_xpu_test.sh

riverliuintel commented 6 months ago

@etaf, please do a first-round check to see whether there is a common issue. Yunfei will cover the detailed per-model failure triage.

etaf commented 6 months ago

No common issue has been found so far.

chuanqi129 commented 3 months ago

Closing this, as we have refreshed the baseline.

intel / torch-xpu-ops

[E2E_baseline] Dynamo Benchmark E2E Accuracy test torchbench has some fail_accuracy models #110
