paulz opened this issue 6 months ago
same
Same for me, how can I solve this? @Vaibhavs10
+1
Running this:

```
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```

should solve the issue, but the following error occurs:

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyannote-audio 3.2.0 requires torchaudio>=2.2.0, but you have torchaudio 2.2.0.dev20240529 which is incompatible.
```
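For context, here is a small sketch using the `packaging` library (which pip itself builds on) showing why the resolver flags the nightly: under PEP 440, a `.devN` build sorts *before* the corresponding release, so `2.2.0.dev20240529` genuinely does not satisfy `torchaudio>=2.2.0`.

```python
# Sketch: why pip reports the nightly as incompatible with >=2.2.0.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

nightly = Version("2.2.0.dev20240529")
print(nightly < Version("2.2.0"))                 # True: dev builds precede the release
print(SpecifierSet(">=2.2.0").contains(nightly))  # False: hence pip's conflict warning
```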
Sorry for the delay in responding to this. Given the current requirement constraints, AFAIK we'll need to wait for the next stable torch release (which should be soon).
This bug is fixed on the PyTorch nightly build 2.4.0. Try uninstalling torch and reinstalling with:

```
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
```
I ran

```
pip uninstall torch torchvision torchaudio
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
```

and reattempted, but still encountered this error.
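If the reinstall doesn't seem to take effect, one quick sanity check (plain torch introspection, nothing specific to this project) is to confirm which build the interpreter actually imports:

```python
# Quick check that the nightly actually got picked up; a stale venv
# or a second Python on PATH can shadow the reinstall.
import torch
import torchaudio

print(torch.__version__)       # expect a 2.4.0.dev... nightly
print(torchaudio.__version__)
print(torch.__file__)          # shows which site-packages is actually in use
```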
I got this working within Miniconda (https://docs.anaconda.com/miniconda/miniconda-install/):

```
conda create -n insane-whisper python=3.12 -y
conda activate insane-whisper
pip3 uninstall torch torchvision torchaudio
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
pip3 install insanely-fast-whisper
```
Now it works well within this conda env. Note that even though I also still have it installed outside of conda, the correct version is run inside the env:

```
$ which -a insanely-fast-whisper
~/miniconda3/envs/insane-whisper/bin/insanely-fast-whisper
~/.local/bin/insanely-fast-whisper
```
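A quick way to confirm the env isolation is doing its job (a minimal sketch; the env name comes from the commands above):

```python
# Sanity check from inside the activated env: the interpreter, the CLI on
# PATH, and torch should all resolve under the env prefix.
import shutil
import sys

import torch

print(sys.executable)                         # ~/miniconda3/envs/insane-whisper/bin/python
print(shutil.which("insanely-fast-whisper"))  # first match on PATH
print(torch.__version__)
```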
This error used to happen for me on macOS, but I just retried it, and it seems to work fine now. I was/am running the following command:

```
$ insanely-fast-whisper --device-id mps --file-name foo.wav
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:00You have passed task=transcribe, but also have set `forced_decoder_ids` to [[1, None], [2, 50360]] which creates a conflict. `forced_decoder_ids` will be ignored in favor of task=transcribe.
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:01Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:01The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
🤗 Transcribing... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:12
Voila!✨ Your file has been transcribed go check it out over here 👉 output.json
```
I'm running insanely-fast-whisper 0.0.15, but I'm not sure what version I was running when it failed.
That said, while it is indeed running on the GPU, it's still slightly slower than whisper.cpp with large-v3. Not sure if that means something's going wrong.
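For reference, insanely-fast-whisper is built on the transformers ASR pipeline, so the same transcription can be reproduced directly in Python. A minimal sketch (model name, dtype, and chunk length are illustrative choices, not necessarily what the CLI uses), passing the task explicitly rather than relying on `forced_decoder_ids`, the source of the conflict warning in the log above:

```python
import torch
from transformers import pipeline

# Minimal sketch: Whisper via the transformers ASR pipeline on Apple's MPS
# backend, with the task passed through generate_kwargs.
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",  # illustrative; pick the checkpoint you need
    torch_dtype=torch.float16,
    device="mps",
)
result = pipe("foo.wav", chunk_length_s=30, generate_kwargs={"task": "transcribe"})
print(result["text"])
```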
Also, using `PYTORCH_ENABLE_MPS_FALLBACK=1` causes very slow performance.
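On the fallback point: `PYTORCH_ENABLE_MPS_FALLBACK=1` routes any op the MPS backend doesn't implement to the CPU, and those silent device-to-host round trips are a plausible cause of the slowdown. A quick check with plain torch (standard API, nothing project-specific):

```python
import torch

# With the fallback env var unset, an op the MPS backend doesn't implement
# raises NotImplementedError instead of silently running on the CPU, which
# makes it easier to see what is actually falling back.
print(torch.backends.mps.is_built())      # PyTorch compiled with MPS support?
print(torch.backends.mps.is_available())  # MPS device usable on this machine?

x = torch.ones(3, device="mps")
print((x * 2).cpu())                      # tensor([2., 2., 2.])
```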