SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

Segmentation fault on word_timestamps using custom model #58

Closed giannhskp closed 1 year ago

giannhskp commented 1 year ago

I have fine-tuned a Whisper model (large) on my own dataset. The model works well at inference time with the original openai/whisper code, and it also produces word timestamps with the word_timestamps feature recently added to the openai/whisper repo.

I can successfully use this model with faster-whisper without word_timestamps. When I enable word_timestamps, I get: Segmentation fault (core dumped). With the original Whisper large model, word_timestamps works as expected.

Any idea why this is happening?

guillaumekln commented 1 year ago

Thanks for reporting the issue.

Is it possible for you to share this model and input audio?

If not, can you locate exactly where the crash is happening in the code? I guess it's happening in this function:

https://github.com/guillaumekln/faster-whisper/blob/0ab8db2b3790b27feb4ff0b220d606390bb8cda7/faster_whisper/transcribe.py#L595-L601

If that's the case, can you try to edit the code and print the argument values for this function?
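
A minimal sketch of one way to gather that information, assuming the transcription is driven from a Python script. It relies only on the standard library: `faulthandler` dumps the Python traceback when the process receives a segmentation fault, and a small decorator prints a function's arguments before the call. The helper names `enable_crash_traceback` and `log_call` are hypothetical and are not part of faster-whisper.

```python
import faulthandler
import functools
import sys


def enable_crash_traceback():
    """Ask Python to dump the traceback of all threads if the process segfaults."""
    try:
        faulthandler.enable(file=sys.stderr, all_threads=True)
    except (AttributeError, ValueError):
        # sys.stderr may have been replaced by an object without a file descriptor.
        faulthandler.enable(file=sys.__stderr__, all_threads=True)


def log_call(func):
    """Print a function's arguments before calling it.

    The output is flushed immediately so it is not lost if the call crashes.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(
            f"{func.__name__} args={args!r} kwargs={kwargs!r}",
            file=sys.stderr,
            flush=True,
        )
        return func(*args, **kwargs)

    return wrapper
```

Calling `enable_crash_traceback()` at the top of the script (or running it with `python -X faulthandler`) should show which Python frame was active when the crash occurred. Temporarily wrapping the suspected function with `log_call` then records the values it received just before the crash.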

guillaumekln commented 1 year ago

Closing for now as the crash was not reported by anyone else. Feel free to reopen if you can provide more information.

blaueente commented 2 months ago

The very same thing happened to me, and I can share the model and input audio.

The model was converted from Hugging Face: https://huggingface.co/primeline/distil-whisper-large-v3-german

$ ct2-transformers-converter --model primeline/distil-whisper-large-v3-german --output_dir whisper-large-v3-ct2-german --copy_files preprocessor_config.json --quantization int8

This is the input audio: https://cdn.media.ccc.de/events/froscon/2024/opus/froscon2024-3116-deu-20_Jahre_OpenStreetMap_opus.opus

The crash happens here:

$ ./bin/whisper-ctranslate2 --language de --model_directory  whisper-large-v3-ct2-german  --print_colors TRUE froscon2024-3116-deu-20_Jahre_OpenStreetMap_opus.opus 
Print colors requires word-level time stamps. Generated files in output directory will have word-level timestamps
Detected language 'German' with probability 1.000000
Segmentation fault (core dumped)

When I remove --print_colors, there is no crash and transcription completes. With a standard model (e.g., medium or large-v3), there is no crash even with --print_colors, and transcription completes as well.
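
To narrow this down further, the same comparison could be run against the faster-whisper Python API directly, bypassing the whisper-ctranslate2 CLI. The following is a sketch, not a verified reproduction: it assumes `model` behaves like `faster_whisper.WhisperModel`, whose `transcribe` method accepts `language` and `word_timestamps` and returns a lazy iterator of segments plus an info object. The helper names are hypothetical.

```python
def transcribe_and_count(model, audio_path, word_timestamps, language="de"):
    """Run one transcription and return (segment_count, word_count)."""
    segments, _info = model.transcribe(
        audio_path, language=language, word_timestamps=word_timestamps
    )
    segment_count = 0
    word_count = 0
    # `segments` is lazy: decoding (and any crash) happens while iterating.
    for segment in segments:
        segment_count += 1
        # `words` is expected to be None when word timestamps are disabled.
        word_count += len(segment.words or [])
    return segment_count, word_count


def compare_word_timestamps(model, audio_path, language="de"):
    """Transcribe without and then with word timestamps, reporting progress."""
    results = {}
    for flag in (False, True):
        # Flush so the last line printed identifies the run that crashed.
        print(f"word_timestamps={flag} ...", flush=True)
        results[flag] = transcribe_and_count(model, audio_path, flag, language)
        print(f"  finished: {results[flag]}", flush=True)
    return results


# Example usage (assumes faster-whisper is installed and the converted
# model directory from the command above exists):
#
#   from faster_whisper import WhisperModel
#   model = WhisperModel("whisper-large-v3-ct2-german")
#   compare_word_timestamps(
#       model, "froscon2024-3116-deu-20_Jahre_OpenStreetMap_opus.opus"
#   )
```

If the second run crashes here as well, the problem lies in faster-whisper or CTranslate2 rather than in the CLI wrapper.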