Closed giannhskp closed 1 year ago
Thanks for reporting the issue.
Is it possible for you to share this model and input audio?
If not, can you locate exactly where the crash is happening in the code? I guess it's happening in this function:
If that's the case, can you try to edit the code and print the argument values for this function?
Closing for now as the crash was not reported by anyone else. Feel free to reopen if you can provide more information.
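One way to narrow down where the crash happens, as asked above, is to enable Python's built-in `faulthandler` before running the transcription. This is only a sketch under assumptions: it uses the public faster-whisper API (`WhisperModel`, `transcribe(..., word_timestamps=True)`), the helper name `transcribe_with_word_timestamps` is hypothetical, and the model directory and audio path are placeholders passed on the command line. `faulthandler` reports only the Python-level stack, so it shows which Python call was active when the native code crashed, not the failing native function itself.

```python
import faulthandler
import sys

# Dump the Python stack to stderr if the process receives SIGSEGV.
faulthandler.enable()


def transcribe_with_word_timestamps(model_dir, audio_path):
    """Hypothetical helper: run faster-whisper with word-level timestamps."""
    # Imported lazily so the crash handler is installed before the native library loads.
    from faster_whisper import WhisperModel

    # device and compute_type are assumptions; adjust them to match your setup.
    model = WhisperModel(model_dir, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    # transcribe() returns a lazy generator; consuming it runs the actual work.
    return [(s.start, s.end, s.text) for s in segments]


if __name__ == "__main__" and len(sys.argv) == 3:
    for start, end, text in transcribe_with_word_timestamps(sys.argv[1], sys.argv[2]):
        print(f"[{start:.2f} -> {end:.2f}] {text}")
```

If the process segfaults, the traceback printed by `faulthandler` should end in the Python frame that was calling into the native library at the time.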
The very same thing happened to me as well, and I can share the model and input audio.
The model has been converted from Hugging Face: https://huggingface.co/primeline/distil-whisper-large-v3-german
$ ct2-transformers-converter --model primeline/distil-whisper-large-v3-german --output_dir whisper-large-v3-ct2-german --copy_files preprocessor_config.json --quantization int8
This is the input audio: https://cdn.media.ccc.de/events/froscon/2024/opus/froscon2024-3116-deu-20_Jahre_OpenStreetMap_opus.opus
Here the crash happens:
$ ./bin/whisper-ctranslate2 --language de --model_directory whisper-large-v3-ct2-german --print_colors TRUE froscon2024-3116-deu-20_Jahre_OpenStreetMap_opus.opus
Print colors requires word-level time stamps. Generated files in output directory will have word-level timestamps
Detected language 'German' with probability 1.000000
Segmentation fault (core dumped)
When I remove --print_colors, there is no crash and the process works. When I use a standard model (e.g., medium or large-v3), there is no crash even with --print_colors, and the process works as well.
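Since the crash appears only when word-level timestamps are requested, the comparison can be scripted so that a segfault in one run does not take down the script itself. The following is a minimal sketch, assuming a POSIX system and that faster-whisper (which whisper-ctranslate2 builds on) is installed. The helper names `run_child` and `crashed_with_segfault` are hypothetical, and the model directory and audio file are passed as placeholder arguments.

```python
import signal
import subprocess
import sys

# Code run in a child process. Assumes faster-whisper is installed and that
# argv carries: model directory, audio path, and "1"/"0" for word timestamps.
CHILD = """
import sys
from faster_whisper import WhisperModel
model_dir, audio, word_ts = sys.argv[1], sys.argv[2], sys.argv[3] == "1"
model = WhisperModel(model_dir, device="cpu", compute_type="int8")
segments, info = model.transcribe(audio, language="de", word_timestamps=word_ts)
for segment in segments:  # lazy generator: iterating triggers the actual work
    print(segment.start, segment.end, segment.text)
"""


def run_child(code, *args):
    """Run a Python snippet in a separate interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code, *args], capture_output=True, text=True
    )


def crashed_with_segfault(proc):
    """On POSIX, a child killed by a signal has returncode == -signal_number."""
    return proc.returncode == -signal.SIGSEGV


if __name__ == "__main__" and len(sys.argv) == 3:
    model_dir, audio = sys.argv[1], sys.argv[2]
    for flag in ("0", "1"):
        proc = run_child(CHILD, model_dir, audio, flag)
        print(f"word_timestamps={flag}: segfault={crashed_with_segfault(proc)}")
```

Running it once against the converted model and once against a stock model would show whether the segfault depends on the model, on `word_timestamps`, or on both.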
I have fine-tuned a Whisper model (large) using my own dataset. The model works great during inference with the original openai/whisper code. It also produces word-level timestamps with the new word_timestamps feature in the openai/whisper repo.
I can successfully use this model with faster-whisper without word_timestamps. When I try to use word_timestamps, I receive:
Segmentation fault (core dumped)
If I use the original Whisper large model, word_timestamps works as expected. Any idea why this is happening?