[inductor][cpu] inductor_max_autotune models accuracy failure in 2024-08-10 nightly release

zxd1997066 commented 1 month ago

🐛 Describe the bug

fp32 static shape default wrapper

| suite | name | thread | accuracy | perf | reason (reference only) |
|---|---|---|---|---|---|
| huggingface | DebertaV2ForQuestionAnswering | multiple | X | √ | DebertaV2ForQuestionAnswering, fail_accuracy |
| timm_models | jx_nest_base | multiple | X | √ | jx_nest_base, fail_accuracy |
| timm_models | swin_base_patch4_window7_224 | multiple | X | X | swin_base_patch4_window7_224, KeyError: m_start |
| timm_models | twins_pcpvt_base | multiple | X | √ | twins_pcpvt_base, fail_accuracy |

fp32 dynamic shape default wrapper

| suite | name | thread | accuracy | perf | reason (reference only) |
|---|---|---|---|---|---|
| huggingface | DebertaV2ForQuestionAnswering | multiple | X | √ | DebertaV2ForQuestionAnswering, fail_accuracy |

```
E0814 16:28:00.551836 56121 torch/_dynamo/utils.py:1541] RMSE (res-fp64): nan, (ref-fp64): 0.00000 and shape=torch.Size([8, 1000]). res.dtype: torch.float32, multiplier: 2.000000, tol: 0.001000
fail_accuracy
```

### Versions

SW info

| name | target_branch | target_commit | refer_branch | refer_commit |
|---|---|---|---|---|
| torchbench | main | 23512dbe | main | 23512dbe |
| torch | main | 6ec4af6865dd884f984c9dbcb273ae26e3825481 | main | 1d1d074072ecb0aa6ca95e3f43221d2275e16d74 |
| torchvision | main | 0.19.0a0+d23a6e1 | main | 0.19.0a0+d23a6e1 |
| torchtext | main | 0.16.0a0+b0ebddc | main | 0.16.0a0+b0ebddc |
| torchaudio | main | 2.4.0a0+b3f6f51 | main | 2.4.0a0+69b2a0a |
| torchdata | main | 0.7.0a0+11bb5b8 | main | 0.7.0a0+11bb5b8 |
| dynamo_benchmarks | main | nightly | main | fea73cb |

Repro: [inductor_single_run.sh](https://github.com/chuanqi129/inductor-tools/blob//weizhuoz/enable_max_autotune_for_guilty/scripts/modelbench/inductor_single_run.sh)

bash inductor_single_run.sh multiple inference accuracy **suite** **model** float32 first static/dynamic default 0 inductor_max_autotune

Suspected guilty commit: https://github.com/pytorch/pytorch/commit/7911b7bfb770e71a87a007addb6de819ac911c4f

[huggingface-DebertaV2ForQuestionAnswering-inference-float32-dynamic-default-multiple-accuracy-crash_guilty_commit.log](https://github.com/user-attachments/files/16615956/huggingface-DebertaV2ForQuestionAnswering-inference-float32-dynamic-default-multiple-accuracy-crash_guilty_commit.log)

cc @ezyang @chauhang @penguinwu @WeizhuoZhang-intel @chuanqi129

zxd1997066 commented 1 month ago

convnext_base (fp32, static shape, default wrapper) shows the same error message, but I cannot reproduce the pass status.

E0814 17:47:54.617337 58277 torch/_dynamo/utils.py:1541] RMSE (res-fp64): nan, (ref-fp64): 0.00000 and shape=torch.Size([8, 1000]). res.dtype: torch.float32, multiplier: 2.000000, tol: 0.001000
fail_accuracy
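To illustrate why an RMSE of `nan` ends in `fail_accuracy`, here is a minimal pure-Python sketch of an RMSE-based tolerance check. The function names and the exact threshold formula (`multiplier * ref_error + tol / 10`) are assumptions for illustration and are not copied from `torch/_dynamo/utils.py`; the default values mirror the `multiplier` and `tol` shown in the log line.

```python
import math


def rmse(ref, res):
    """Root-mean-square error between two equal-length sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref))


def passes_accuracy(fp64_ref, ref, res, multiplier=2.0, tol=1e-3):
    """Assumed form of the check: compare both results against an fp64 reference."""
    ref_error = rmse(fp64_ref, ref)  # error of the eager fp32 run vs fp64
    res_error = rmse(fp64_ref, res)  # error of the compiled run vs fp64
    # Any comparison involving NaN is False, so a NaN res_error always fails.
    return res_error <= multiplier * ref_error + tol / 10.0


if __name__ == "__main__":
    fp64_ref = [1.0, 2.0, 3.0]
    print(passes_accuracy(fp64_ref, fp64_ref, [1.0, 2.0, 3.0]))
    print(passes_accuracy(fp64_ref, fp64_ref, [1.0, float("nan"), 3.0]))
```

Under these assumptions, a single `nan` in the compiled output makes the RMSE `nan`, and the comparison fails regardless of `multiplier` or `tol`.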

chunyuan-w commented 3 weeks ago

Most of these failures are fixed by https://github.com/pytorch/pytorch/pull/133070 and https://github.com/pytorch/pytorch/pull/133073. The one remaining issue is jx_nest_base.

chunyuan-w commented 3 days ago

jx_nest_base will be fixed by https://github.com/pytorch/pytorch/pull/135661

[inductor][cpu] inductor_max_autotune models accuracy failure in 2024-08-10 nightly release #133465
