Fine-tuning wav2vec 2.0 with `torch.compile`

w11wo commented 1 year ago

System Info

transformers version: 4.28.1
Platform: Linux-4.19.0-23-cloud-amd64-x86_64-with-glibc2.28
Python version: 3.9.0
Huggingface_hub version: 0.13.3
Safetensors version: not installed
PyTorch version (GPU?): 2.0.0+cu117 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Who can help?

No response

Information

[x] The official example scripts
[ ] My own modified scripts

Tasks

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

python run_audio_classification.py \
    --model_name_or_path facebook/wav2vec2-base \
    --dataset_name superb \
    --dataset_config_name ks \
    --output_dir wav2vec2-base-ft-keyword-spotting \
    --overwrite_output_dir \
    --remove_unused_columns False \
    --do_train \
    --do_eval \
    --fp16 \
    --learning_rate 3e-5 \
    --max_length_seconds 1 \
    --attention_mask False \
    --warmup_ratio 0.1 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 32 \
    --gradient_accumulation_steps 4 \
    --per_device_eval_batch_size 32 \
    --dataloader_num_workers 4 \
    --logging_strategy steps \
    --logging_steps 10 \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --load_best_model_at_end True \
    --metric_for_best_model accuracy \
    --save_total_limit 3 \
    --seed 0 \
+   --torch_compile True

Expected behavior

I followed the example to fine-tune wav2vec 2.0 for audio classification, with the exception of using torch.compile, aiming to get faster training. However, I ran to an issue as follows

Error Log

``` [INFO|trainer.py:1769] 2023-04-19 05:28:50,832 >> ***** Running training ***** [INFO|trainer.py:1770] 2023-04-19 05:28:50,832 >> Num examples = 51,094 [INFO|trainer.py:1771] 2023-04-19 05:28:50,832 >> Num Epochs = 5 [INFO|trainer.py:1772] 2023-04-19 05:28:50,832 >> Instantaneous batch size per device = 32 [INFO|trainer.py:1773] 2023-04-19 05:28:50,832 >> Total train batch size (w. parallel, distributed & accumulation) = 128 [INFO|trainer.py:1774] 2023-04-19 05:28:50,833 >> Gradient Accumulation steps = 4 [INFO|trainer.py:1775] 2023-04-19 05:28:50,833 >> Total optimization steps = 1,995 [INFO|trainer.py:1776] 2023-04-19 05:28:50,834 >> Number of trainable parameters = 90,371,212 0%| | 0/1995 [00:00 main() File "/home/wilson_bookbotkids_com/run_audio_classification.py", line 392, in main train_result = trainer.train(resume_from_checkpoint=checkpoint) File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/trainer.py", line 1662, in train return inner_training_loop( File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/trainer.py", line 1929, in _inner_training_loop tr_loss_step = self.training_step(model, inputs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/trainer.py", line 2699, in training_step loss = self.compute_loss(model, inputs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/trainer.py", line 2731, in compute_loss outputs = model(**inputs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl return forward_call(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 82, in forward return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 209, in _fn return fn(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 1817, in forward outputs = self.wav2vec2( File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl return forward_call(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 1316, in forward hidden_states = self._mask_hidden_states( File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 1249, in _mask_hidden_states if not getattr(self.config, "apply_spec_augment", True): File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 1259, in mask_time_indices = _compute_mask_indices( File "/opt/conda/envs/torch/lib/python3.9/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 1266, in mask_time_indices = torch.tensor(mask_time_indices, device=hidden_states.device, dtype=torch.bool) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors return callback(frame, cache_size, hooks) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame result = inner_convert(frame, cache_size, hooks) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 104, in _fn return fn(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert return _compile( File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper r = func(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 324, in _compile out_code = transform_code_object(code, transform) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object transformations(instructions, code_options) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 311, in transform tracer.run() File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run super().run() File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 576, in run and self.step() File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 540, in step getattr(self, inst.opname)(inst) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 1792, in RETURN_VALUE self.output.compile_subgraph( File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 517, in compile_subgraph self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 588, in compile_and_call_fx_graph compiled_fn = self.call_user_compiler(gm) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper r = func(*args, **kwargs) File "/opt/conda/envs/torch/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 675, in call_user_compiler raise BackendCompilerFailed(self.compiler_fn, e) from e torch._dynamo.exc.BackendCompilerFailed: debug_wrapper raised DynamicOutputShapeException: aten.index.Tensor Set torch._dynamo.config.verbose=True for more information You can suppress this exception and fall back to eager by setting: torch._dynamo.config.suppress_errors = True ```

I suspect that wav2vec 2.0 is not yet supported in PyTorch 2.0 and needs some modification to ensure compatibility when running torch.compile. The same error occurred when fine-tuning for automatic speech recognition.

amyeroberts commented 1 year ago

cc @sanchit-gandhi

amyeroberts commented 1 year ago

Hi @w11wo, thanks for raising this issue!

Please note that whilst we aim to support a wide variety of use cases with our examples, torch_compile is an experimental flag and not one we guarantee will work for for all of our models as the support is progressively rolled in in PyTorch.

w11wo commented 1 year ago

Hi @amyeroberts, no worries and thanks for the heads up. Looking forward to seeing wav2vec 2.0 supported. Cheers.