huggingface / speechbox

Apache License 2.0

'GenerationConfig' object has no attribute 'no_timestamps_token_id' #17

Closed PeterGilles closed 1 year ago

PeterGilles commented 1 year ago

When using a fine-tuned Whisper model, running ASRDiarizationPipeline throws an error:

from speechbox import ASRDiarizationPipeline

pipeline = ASRDiarizationPipeline.from_pretrained("pgilles/whisper-large-v2-lb_cased_03", device=device, use_auth_token="***")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-17b62c26ae1a> in <module>
----> 1 out = pipeline(audio_file, group_by_speaker=True)
      2 pd.DataFrame(out)

8 frames
/usr/local/lib/python3.8/dist-packages/transformers/generation/logits_process.py in __init__(self, generate_config)
    934     def __init__(self, generate_config):  # support for the kwargs
    935         self.eos_token_id = generate_config.eos_token_id
--> 936         self.no_timestamps_token_id = generate_config.no_timestamps_token_id
    937         self.timestamp_begin = generate_config.no_timestamps_token_id + 1
    938 

AttributeError: 'GenerationConfig' object has no attribute 'no_timestamps_token_id'

patrickvonplaten commented 1 year ago

cc @sanchit-gandhi

sanchit-gandhi commented 1 year ago

Hey @PeterGilles! Sorry about the delayed reply. Cool to see that you're using the diarization pipeline! May I ask what version of transformers you're using? You can retrieve this with the command:

transformers-cli env

bjelkenhed commented 1 year ago

Hi @sanchit-gandhi, I get the same issue when using a fine-tuned model. A standard model like openai/whisper-medium works fine, but a fine-tuned one seems to be missing some attributes. The model was fine-tuned as in the Whisper event.

These attributes seem to be needed for it to work:

model.generation_config.no_timestamps_token_id = 50363
model.generation_config.forced_decoder_ids = [[1, None], [2, 50359]]
model.generation_config.max_initial_timestamp_index = 1
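
A minimal sketch of checking for these attributes before running the pipeline. `missing_generation_attrs` is a hypothetical helper and the `SimpleNamespace` object is only a stand-in; neither is part of transformers or speechbox. With a real model you would presumably pass `model.generation_config` instead. The token ids are the ones quoted in this comment and may differ for other checkpoints.

```python
from types import SimpleNamespace

# Attributes named in this thread as needed for Whisper timestamp decoding.
REQUIRED_ATTRS = (
    "no_timestamps_token_id",
    "forced_decoder_ids",
    "max_initial_timestamp_index",
)


def missing_generation_attrs(generation_config, required=REQUIRED_ATTRS):
    """Return the names in `required` that the config object does not define."""
    return [name for name in required if not hasattr(generation_config, name)]


# Stand-in for a fine-tuned model's generation config (assumption: the
# timestamp-related fields were never written; eos_token_id is a placeholder).
config = SimpleNamespace(eos_token_id=0)
print(missing_generation_attrs(config))

# Apply the values quoted in this comment, then check again.
config.no_timestamps_token_id = 50363
config.forced_decoder_ids = [[1, None], [2, 50359]]
config.max_initial_timestamp_index = 1
print(missing_generation_attrs(config))
```

The check relies only on `hasattr`, so an attribute that exists but is set to `None` would not be reported as missing.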

Tested with transformers version: 4.27.0.dev0

The issue is the same here, so it is not specific to ASRDiarizationPipeline:

pipe = pipeline(task="automatic-speech-recognition"

mmichelli commented 1 year ago

I'm getting the same issue. Transformers version: 4.27.0.dev0

File transformers/generation/logits_process.py:935, in WhisperTimeStampLogitsProcessor.__init__(self, generate_config)
    self.no_timestamps_token_id = generate_config.no_timestamps_token_id
AttributeError: 'GenerationConfig' object has no attribute 'no_timestamps_token_id'

mmichelli commented 1 year ago

Is this the cause of the issue? https://github.com/huggingface/transformers/issues/21220

sanchit-gandhi commented 1 year ago

Indeed, the first point on that issue seems to be the cause of the problem!

I think @bjelkenhed's fix is the quickest here. Otherwise, if you want to copy a pre-trained Whisper generation config one-for-one, you could load a generation config from pre-trained:

from transformers import GenerationConfig

generation_config = GenerationConfig.from_pretrained("openai/whisper-tiny")

And then push it to your model repo:

generation_config.push_to_hub("hf_id/my_model_name")
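
If you would rather keep your existing config and fill in only the missing fields, something along these lines might work. This is a sketch on plain stand-in objects: `copy_missing_attrs` is a hypothetical helper, not a transformers API, and the values are the ones quoted earlier in this thread. In practice `reference` would be the `GenerationConfig` loaded from the pre-trained checkpoint and `target` the fine-tuned model's `generation_config`.

```python
import copy
from types import SimpleNamespace


def copy_missing_attrs(target, reference, names):
    """Copy each attribute in `names` from `reference` to `target` if absent.

    Values are deep-copied so the two configs do not share mutable lists.
    Returns the list of names that were actually copied.
    """
    copied = []
    for name in names:
        if not hasattr(target, name) and hasattr(reference, name):
            setattr(target, name, copy.deepcopy(getattr(reference, name)))
            copied.append(name)
    return copied


# Stand-ins for the two generation configs (assumed shapes, not real objects).
reference = SimpleNamespace(
    no_timestamps_token_id=50363,
    forced_decoder_ids=[[1, None], [2, 50359]],
    max_initial_timestamp_index=1,
)
target = SimpleNamespace(max_initial_timestamp_index=1)

copied = copy_missing_attrs(
    target,
    reference,
    ["no_timestamps_token_id", "forced_decoder_ids", "max_initial_timestamp_index"],
)
print(copied)
```

Attributes already present on `target` are left untouched, so any values set during fine-tuning are preserved.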

PeterGilles commented 1 year ago

Thanks, this fix is working for me too now.

sanchit-gandhi commented 1 year ago

Awesome! Closing this issue as fixed 🤗