Closed. kalstein00 closed this 2 months ago.
I had the same problem. For me it was a problem with the ffmpeg installation.
conda install -c conda-forge ffmpeg
fixed it.
I don't use conda. How can I resolve this then?
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on MacOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
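After installing, it can help to confirm that Python can actually find ffmpeg on its PATH, since a freshly installed binary is often not visible until the terminal is reopened. A minimal sketch, assuming the transcription backend invokes an executable named `ffmpeg` by name (the helper `find_executable` is hypothetical and not part of AlwaysReddy or WhisperX):

```python
import shutil


def find_executable(name="ffmpeg"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("ffmpeg was not found on PATH; reopen the terminal or check the install.")
    else:
        print(f"ffmpeg found at: {path}")
```

Run this from the same environment (virtualenv, terminal) that launches the app, because PATH can differ between shells.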
That did not help me. Is anyone else having this issue?
f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\areddy\lib\site-packages\pyannote\audio\core\io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.3. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint C:\Users\pscho\.cache\torch\whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.3.0+cu118. Bad things might happen unless you revert torch to 1.x.
Using WhisperX model: tiny and device: cuda
Press 'ctrl + shift + space' to start recording, press again to stop and transcribe.
Double tap to give the AI access to read your clipboard.
Press 'ctrl + alt + x' to cancel recording.
Press 'ctrl + alt + f12' to clear the chat history.
use_clipboard: False
Starting recording...
Recording started...
use_clipboard: False
Stopping recording...
Recording saved to audio_files\temp_recording.wav
Transcribing audio file: audio_files\temp_recording.wav
Transcription:
Who are you today?
Running TTS: I am your mischievous assistant today.
Adding to queue
Error reading file f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\audio_files\tmp0igjac9k.wav: Error opening 'f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\audio_files\tmp0igjac9k.wav': Format not recognised.
@sirPhoebus Looks like you are facing a different issue. Your transcription is fine - the issue is with TTS. Assuming you are using piper, I'd check if everything is ok with it. Try running a command to check if it's generating audio files properly:
echo 'Hello. This is a test.' | ./piper --model en_US-amy-medium.onnx --output_file test.wav
If it generates the output file and doesn't throw any error, open the wav file and see if it plays the audio.
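If you would rather check the file from Python than by ear, the standard-library `wave` module can try to parse the header. This is only a rough sketch: the helper `describe_wav` is hypothetical, and `wave` handles plain PCM WAV files, so a `None` result means "not a PCM WAV that `wave` can read", which is not exactly the same check the app's audio library performs.

```python
import wave


def describe_wav(path):
    """Return basic header info for a PCM WAV file, or None if it cannot be parsed."""
    try:
        with wave.open(path, "rb") as wav:
            return {
                "channels": wav.getnchannels(),
                "sample_rate": wav.getframerate(),
                "frames": wav.getnframes(),
            }
    except (wave.Error, EOFError, FileNotFoundError):
        # wave.Error: not a RIFF/WAVE file or unsupported encoding
        # EOFError: file is empty or truncated
        return None


if __name__ == "__main__":
    print(describe_wav("test.wav"))
```

A result with zero frames, or `None`, would suggest the TTS step is writing an empty or malformed file.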
Thanks @kaminoer! In the meantime I found the post on alltalk_tts. I tried it and it is working like a charm, and I am very happy with it! I would advise trying it too ;)
Happy to hear alltalk is working for you! I've been using it ever since I opened that issue and I'm also really satisfied with the results. Not as quick as piper, but the improved audio quality makes up for it imo. If you are running an NVIDIA GPU, I recommend enabling DeepSpeed in alltalk to shave a couple of seconds off generation time.
Oooh wow, thanks for the tip! Will look into it right away!
My environment: Windows 10, RTX 4090, 32 GB RAM
Error Message:
Recording saved to audio_files\temp_recording.wav
Transcribing audio file: audio_files\temp_recording.wav
An error occurred during transcription: The audio file temp_recording.wav was not found.
I completed setup steps 1-8 and set all configs to local (TRANSCRIPTION_API = "whisperx", COMPLETIONS_API = "ollama"). My recording is saved correctly in 'AlwaysReddy\audio_files\temp_recording.wav', so I'm not sure why this error message appears.
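One way to narrow this down is to check what the path from the log resolves to when the script runs. A sketch, assuming the relative path `audio_files\temp_recording.wav` is resolved against the current working directory (the helper `check_recording` is hypothetical, not part of AlwaysReddy):

```python
import os


def check_recording(path):
    """Classify a recording path as 'missing', 'empty', or 'ok'."""
    full_path = os.path.abspath(path)  # resolved against the current working directory
    if not os.path.isfile(full_path):
        return "missing"
    if os.path.getsize(full_path) == 0:
        return "empty"
    return "ok"


if __name__ == "__main__":
    target = os.path.join("audio_files", "temp_recording.wav")
    print("Working directory:", os.getcwd())
    print("Resolved path:", os.path.abspath(target))
    print("Status:", check_recording(target))
```

If this reports "ok" when run from the AlwaysReddy folder, the "not found" message probably refers to something other than the recording itself, for example a missing ffmpeg executable, as the first reply in this thread found.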