Vaibhavs10 / insanely-fast-whisper

Apache License 2.0

Get !!!!! in output.json file #163

Open kamil6x opened 10 months ago

kamil6x commented 10 months ago

Running on a 2019 MacBook Pro (8-core Intel) with an AMD Radeon RX 580 eGPU, macOS 12.6.7.

insanely-fast-whisper --model-name distil-whisper/distil-large-v2 --device-id mps --batch-size 4 --file-name 'voicemail_greeting_recorging.wav'
/usr/local/Caskroom/miniconda/base/envs/insanely-fast-whisper/lib/python3.11/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
/usr/local/Caskroom/miniconda/base/envs/insanely-fast-whisper/lib/python3.11/site-packages/torch_audiomentations/utils/io.py:27: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:44
Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:44
Voila!✨ Your file has been transcribed go check it out over here 👉 output.json

Contents of output.json:

{"speakers": [], "chunks": [{"timestamp": [null, null], "text": "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"}], "text": "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"}

czyz commented 2 months ago

Same here. I'm on a Mac and set the PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable, because --device-id mps otherwise just fails with an error on machines with AMD Radeon Pro Vega II 32 GB graphics.

insanely-fast-whisper --language English --file-name mywav.wav --device-id mps

results in a long run of transcriptions that are each a single exclamation point, and then a couple of very long strings of exclamation points followed by "Thank you." at the end of output.json.
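For anyone who wants to check whether their own output.json shows the same symptom, here is a minimal sketch that counts chunks whose text is nothing but exclamation points. It assumes the JSON layout shown earlier in this thread (a top-level "text" plus a "chunks" list with a "text" field each). The helper names is_degenerate, summarize and load_result are made up for illustration and are not part of insanely-fast-whisper; this only detects the problem and does not fix it.

```python
import json


def is_degenerate(text: str) -> bool:
    """Return True if text is non-empty and made up only of '!' characters."""
    stripped = text.strip()
    return bool(stripped) and set(stripped) == {"!"}


def summarize(result: dict) -> dict:
    """Count how many chunks in a parsed output.json are all exclamation points."""
    chunks = result.get("chunks", [])
    bad = [c for c in chunks if is_degenerate(c.get("text", ""))]
    return {
        "total_chunks": len(chunks),
        "degenerate_chunks": len(bad),
        "full_text_degenerate": is_degenerate(result.get("text", "")),
    }


def load_result(path: str = "output.json") -> dict:
    """Read the transcription result from disk (the path is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Small inline example shaped like the output reported above.
sample = {
    "speakers": [],
    "chunks": [
        {"timestamp": [None, None], "text": "!!!!!!!!"},
        {"timestamp": [0.0, 1.0], "text": " Thank you."},
    ],
    "text": "!!!!!!!! Thank you.",
}
print(summarize(sample))
```

To run it against a real file, call summarize(load_result("output.json")). If degenerate_chunks is equal or close to total_chunks, you are most likely seeing the same behaviour described in this issue.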