huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
MIT License

ddp ERROR #71

Open · liyingjie1991 opened this issue 5 months ago

liyingjie1991 commented 5 months ago

Hi, when I run the training code I hit the following error. Can you give me some advice?

```
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
    r = func(*args, **kwargs)
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 675, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: compile_fn raised TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'

Set torch._dynamo.config.verbose=True for more information

You can suppress this exception and fall back to eager by setting:
    torch._dynamo.config.suppress_errors = True

Traceback (most recent call last):
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 670, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.fake_example_inputs())
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/backends/distributed.py", line 203, in compile_fn
    return self.backend_compile_fn(gm, example_inputs)
TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'
```

Version: torch 2.0.1+cu117
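For what it's worth, the error output itself points to a fallback that can be enabled at the top of the training script. This is only a minimal sketch of that workaround: it makes dynamo fall back to eager execution when the backend fails to compile, and prints more detail, but it does not fix the underlying DDP + torch.compile issue.

```python
import torch._dynamo

# Print the full backend compilation error (as suggested in the message above).
torch._dynamo.config.verbose = True

# Fall back to eager execution instead of raising BackendCompilerFailed.
# Note: this only hides the compile failure, it does not resolve it.
torch._dynamo.config.suppress_errors = True
```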

sanchit-gandhi commented 5 months ago

Hey @liyingjie1991 - are you using torch compile while training? I didn't test training with this configuration myself, but I would expect it to work for the training step, since the shapes there are static. The generate step during evaluation probably won't work, though: Transformers uses a dynamic k/v cache during generation, so the shapes are dynamic. If you're using torch compile, could you try disabling it for evaluation?
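For reference, a minimal sketch of that split, using a toy module rather than the actual distil-whisper training script: keep the torch.compile wrapper for the training forward/backward pass, where shapes are static, and call the original eager module for evaluation or generation, where shapes change from step to step.

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(16, 16)

    def forward(self, x):
        return self.proj(x)

model = TinyModel()
compiled_model = torch.compile(model)  # used only for the static-shape training step

# Training step: run through the compiled wrapper.
x = torch.randn(4, 16)
loss = compiled_model(x).sum()
loss.backward()

# Evaluation / generation: call the original module directly so dynamic-shape
# code paths (e.g. a growing k/v cache during generate) stay in eager mode.
with torch.no_grad():
    eval_out = model(x)
```

Since `torch.compile` wraps the same parameters, the uncompiled `model` sees all updates made through `compiled_model`, so switching between the two for train and eval is safe.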