[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)
Reproduction
import torch
from transformers import AutoProcessor, WhisperForConditionalGeneration
from datasets import load_dataset, Audio
processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
model.cuda()
# load audios > 30 seconds
ds = load_dataset("distil-whisper/meanwhile", "default")["test"]
# resample to 16kHz
ds = ds.cast_column("audio", Audio(sampling_rate=16000))
# take first 8 audios and retrieve array
audio = ds[:8]["audio"]
audio = [x["array"] for x in audio]
# make sure to NOT truncate the input audio, to return the `attention_mask` and to pad to the longest audio
inputs = processor(audio, return_tensors="pt", truncation=False, padding="longest", return_attention_mask=True, sampling_rate=16_000)
inputs = inputs.to("cuda", torch.float32)
# transcribe audio to ids
generated_ids = model.generate(
    **inputs,
)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
transcription[0]
Expected behavior
When an `attention_mask` is passed to `generate()`, no warning about a missing attention mask should be raised. Instead, the following warning appears, claiming that the `attention_mask` was not set:
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
System Info
transformers version: 4.44.0.dev0
Who can help?
@sanchit-gandhi
I think the warning appears because `attention_mask` is not actually passed down to `generate_with_fallback`, so it doesn't get passed to the underlying `super().generate()` call.
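To make the suspected cause concrete, here is a toy sketch of the pattern. It is not the actual transformers code: `Base`, `Dropping`, `Forwarding`, and `count_warnings` are hypothetical names, and the real `generate_with_fallback` takes different arguments. It only shows how a mask accepted at the top level can get lost before the parent `generate()` sees it, and how forwarding it avoids the warning.

```python
import warnings


class Base:
    """Hypothetical parent class: warns when no mask reaches generate()."""

    def generate(self, input_features, attention_mask=None):
        if attention_mask is None:
            warnings.warn("The attention mask is not set")
        return list(input_features)


class Dropping(Base):
    """Accepts attention_mask at the top level but never forwards it."""

    def generate(self, input_features, attention_mask=None):
        # The mask is available here but is not handed to the helper.
        return self.generate_with_fallback(input_features)

    def generate_with_fallback(self, segment_input):
        # The parent call therefore never sees a mask and warns.
        return super().generate(segment_input)


class Forwarding(Base):
    """Same structure, but the mask is threaded through the helper."""

    def generate(self, input_features, attention_mask=None):
        return self.generate_with_fallback(input_features, attention_mask)

    def generate_with_fallback(self, segment_input, attention_mask=None):
        return super().generate(segment_input, attention_mask=attention_mask)


def count_warnings(model):
    """Call generate() with a mask and count the warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.generate([1, 2, 3], attention_mask=[1, 1, 1])
    return len(caught)
```

With this sketch, `count_warnings(Dropping())` reports a warning even though a mask was supplied, while `count_warnings(Forwarding())` reports none. If the diagnosis above is right, the fix would presumably follow the `Forwarding` shape.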