pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
BSD 3-Clause "New" or "Revised" License

2 regressions on coat_lite_mini in TIMM #1833

Closed ngimel closed 1 year ago

ngimel commented 1 year ago

https://github.com/pytorch/pytorch/pull/87650 regressed inductor time from 0.118 to 0.14 (speedup 1.6x -> 1.35x). https://github.com/pytorch/pytorch/pull/87669 regressed both eager and inductor time (eager 0.19 -> 0.22, inductor 0.14 -> 0.17). cc @eellison, @eqy

Command line to repro:

python benchmarks/dynamo/timm_models.py --training --performance --device cuda --inductor --float32 --only=coat_lite_mini

@anijain2305 are we reporting absolute times on the dashboard already? Those would be very useful. @desertfire we really should start some perf testing in CI to avoid regressions like these. It might be noisy, but it should be possible to catch relatively large changes such as this one.
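As a sanity check on the numbers reported above, the absolute times and speedup ratios are mutually consistent, assuming (not stated explicitly in the thread) that the speedup is eager_time / inductor_time against an eager baseline that #87650 did not change:

```python
# Cross-check the reported numbers from the comment above.
# Assumption: speedup = eager_time / inductor_time, eager baseline unchanged.

inductor_before, inductor_after = 0.118, 0.14  # seconds, from the report
speedup_before, speedup_after = 1.6, 1.35      # reported speedups

# Implied eager baseline from each (time, speedup) pair; they should agree.
eager_from_before = inductor_before * speedup_before
eager_from_after = inductor_after * speedup_after
print(round(eager_from_before, 3), round(eager_from_after, 3))  # -> 0.189 0.189

# The slowdown factor measured two ways also matches:
print(round(inductor_after / inductor_before, 2))  # -> 1.19
print(round(speedup_before / speedup_after, 2))    # -> 1.19
```

Both derivations give an eager baseline of about 0.189 s and a ~1.19x inductor slowdown, so the two ways of stating the regression agree.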

eqy commented 1 year ago

What hardware is the regression observed on?

ngimel commented 1 year ago

A100, CUDA 11.6. My cuDNN is a bit old (8.3, I think); is it better with the newer one?

anijain2305 commented 1 year ago

@williamwen42 can you look at adding absolute latency numbers in the dashboard?

eellison commented 1 year ago

Fix for the inductor time here: https://github.com/pytorch/pytorch/pull/88534

ngimel commented 1 year ago

For convnext, the cuDNN v8 APIs break something big time. With the v8 API off, I'm getting an eager time of 0.189 for the following command:

TORCH_CUDNN_V8_API_DISABLED=1 python benchmarks/dynamo/timm_models.py --training --inductor --only convnext_base --devices=cuda --float16 --batch_size 128 --performance --disable-cudagraphs

With the cuDNN v8 API on, eager time is 0.52! And this is not fixed even when I set torch.backends.cudnn.benchmark=True. @eqy I think we have to revert this PR.
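For reference, a minimal sketch of the two toggles mentioned in this thread. The torch lines are commented out so the snippet runs without a CUDA build; in practice the environment variable is set in the shell before launching the benchmark, as in the command above:

```python
import os

# Disable PyTorch's cuDNN v8 API path. The thread sets this in the shell
# before launching the process; setting it here before importing torch
# serves the same purpose.
os.environ["TORCH_CUDNN_V8_API_DISABLED"] = "1"

# import torch
# torch.backends.cudnn.benchmark = True  # autotune conv algorithms per input shape

print(os.environ["TORCH_CUDNN_V8_API_DISABLED"])  # -> 1
```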

eqy commented 1 year ago

Sure, will look into this regression as well. The previous CoAtNet regression has already been forwarded to cuDNN.

eqy commented 1 year ago

https://github.com/pytorch/pytorch/pull/88699 seems to also address the CoAtNet regression.