pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
BSD 3-Clause "New" or "Revised" License

Minifier fails for wav2vec #1959

Closed: wconstab closed this issue 1 year ago

wconstab commented 1 year ago

🐛 Describe the bug

Building on https://github.com/pytorch/pytorch/issues/93464, the whole-model repro for wav2vec posted there fails to run with the minifier for me on master @ 2ea32f41f4b4c3d0cbb9834186fdfe404e0d4c2a.

My w2v.py is copied from pytorch/pytorch#93464:

 from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
 from datasets import load_dataset
 import torch

 # load model and processor
 processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-large-960h-lv60-self")
 model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-large-960h-lv60-self")
 model = torch.compile(model)

 # load dummy dataset and read soundfiles
 ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")

 # tokenize
 input_values = processor(ds[0]["audio"]["array"], return_tensors="pt", padding="longest").input_values

 # retrieve logits
 logits = model(input_values).logits

 # take argmax and decode
 predicted_ids = torch.argmax(logits, dim=-1)
 transcription = processor.batch_decode(predicted_ids)

Error logs

No response

Minified repro

TORCHDYNAMO_REPRO_AFTER="aot" python w2v.py produces a minifier_launcher.py, but running it with

python torchdynamo_debug/run_2022_12_05_23_06_50_594983/minifier/minifier_launcher.py

claims that the input graph does not fail:

Traceback (most recent call last):
  File "torchdynamo_debug/run_2022_12_05_23_06_50_594983/minifier/minifier_launcher.py", line 205, in <module>
    minifier(
  File "/scratch/whc/work/pytorch/torch/_functorch/fx_minifier.py", line 96, in minifier
    raise RuntimeError("Input graph did not fail the tester")
RuntimeError: Input graph did not fail the tester

TORCHDYNAMO_REPRO_AFTER="dynamo" python w2v.py actually segfaults for me

wconstab commented 1 year ago

oops, i forgot to use TORCHDYNAMO_REPRO_LEVEL=4. But that mode fails too

[2022-12-05 23:25:44,800] torch._dynamo.debug_utils: [WARNING] While minifying the program in accuracy minification mode,ran into a runtime exception which is likely an unrelated issue. Skipping this graph.

full log: https://gist.github.com/wconstab/4fc3caf44b679e3994622ddbc191dfc0

anijain2305 commented 1 year ago

You don't need level 4. Level 4 is for accuracy. The command that you have is correct.

Your observation is still correct. There is a minifier_launcher.py but it does not fail.

I can assign it to myself.
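
The invocations discussed so far differ only in two environment variables. A minimal sketch of building that environment from Python, assuming only the values mentioned in this thread ("dynamo" or "aot" for TORCHDYNAMO_REPRO_AFTER, and level 4 for accuracy minification); the helper name `repro_env` is hypothetical:

```python
import os
from typing import Dict, Optional


def repro_env(after: str, level: Optional[int] = None) -> Dict[str, str]:
    """Return a copy of the current environment with the repro variables set."""
    # Only the two values used in this thread are accepted here.
    if after not in ("dynamo", "aot"):
        raise ValueError(f"unexpected TORCHDYNAMO_REPRO_AFTER value: {after!r}")
    env = dict(os.environ)
    env["TORCHDYNAMO_REPRO_AFTER"] = after
    if level is None:
        # Leave the level unset so the run uses its default behaviour.
        env.pop("TORCHDYNAMO_REPRO_LEVEL", None)
    else:
        env["TORCHDYNAMO_REPRO_LEVEL"] = str(level)
    return env


# Example use (not executed here):
#   subprocess.run([sys.executable, "w2v.py"], env=repro_env("aot"))
#   subprocess.run([sys.executable, "w2v.py"], env=repro_env("dynamo", level=4))
```

Per the comment above, the level only needs to be set when chasing an accuracy mismatch; for a crash or compile error, setting TORCHDYNAMO_REPRO_AFTER alone is enough.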

anijain2305 commented 1 year ago

@wconstab I tried TORCHDYNAMO_REPRO_AFTER="dynamo", and that worked. (The graph is larger than it would be if repro_after "aot" worked, so I will fix that one as well.)


from math import inf
import torch
from torch import tensor, device
import torch.fx as fx
import torch._dynamo
from torch._dynamo.testing import rand_strided
from torch._dynamo.debug_utils import run_fwd_maybe_bwd
from torch._dynamo.debug_utils import same_two_models

# REPLACEABLE COMMENT FOR TESTING PURPOSES

args = [((1, 93680), (93680, 1), torch.float32, 'cpu', False)]
args = [rand_strided(sh, st, dt, dev).requires_grad_(rg) for (sh, st, dt, dev, rg) in args]

from torch.nn import *
class Repro(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.self_self_feature_extractor_conv_layers_0_conv = Conv1d(1, 512, kernel_size=(10,), stride=(5,))

    def forward(self, input_values : torch.Tensor):
        getitem = input_values[(slice(None, None, None), None)];  input_values = None
        self_self_feature_extractor_conv_layers_0_conv = self.self_self_feature_extractor_conv_layers_0_conv(getitem);  getitem = None
        return (self_self_feature_extractor_conv_layers_0_conv,)

mod = Repro()
opt_mod = torch._dynamo.optimize("inductor")(mod)

with torch.cuda.amp.autocast(enabled=False):
    ref = run_fwd_maybe_bwd(mod, args)
    res = run_fwd_maybe_bwd(opt_mod, args)
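
As a sanity check on the minified graph, the output length of the single Conv1d layer can be worked out by hand with the output-size formula from the torch.nn.Conv1d documentation. The helper below is an illustrative sketch in plain Python (the name `conv1d_out_len` is hypothetical); padding 0 and dilation 1 are the Conv1d defaults, which the repro does not override.

```python
def conv1d_out_len(l_in: int, kernel_size: int, stride: int = 1,
                   padding: int = 0, dilation: int = 1) -> int:
    """Output length of a 1-D convolution, per the torch.nn.Conv1d formula."""
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The repro feeds a (1, 93680) tensor, indexed to (1, 1, 93680), into
# Conv1d(1, 512, kernel_size=(10,), stride=(5,)).
print(conv1d_out_len(93680, kernel_size=10, stride=5))  # 18735
```

So both `ref` and `res` should hold a tensor of shape (1, 512, 18735).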

anijain2305 commented 1 year ago

Real bug for the model - https://github.com/pytorch/pytorch/issues/90260

Why does the minifier (repro_after="aot") not work? For the same reason as the issue above: a small difference in code path between the minified repro and the compiler led to this rare divergence.