ILikeAI / AlwaysReddy

AlwaysReddy is a LLM voice assistant that is always just a hotkey away.
MIT License

"The audio file temp_recording.wav was not found." error message has occurred #18

Closed kalstein00 closed 2 months ago

kalstein00 commented 2 months ago

My environment: Windows 10, RTX 4090, 32 GB RAM

Error message:

```
Recording saved to audio_files\temp_recording.wav
Transcribing audio file: audio_files\temp_recording.wav
An error occurred during transcription: The audio file temp_recording.wav was not found.
```

I performed setup steps 1-8 and set all configs to local (TRANSCRIPTION_API = "whisperx", COMPLETIONS_API = "ollama"). What I said is saved correctly to 'AlwaysReddy\audio_files\temp_recording.wav'. I'm not sure why that error message appears.
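For context, an assumption worth checking (not confirmed from the AlwaysReddy source): whisper-style loaders typically decode audio by shelling out to ffmpeg, so a missing ffmpeg binary can surface as a misleading "audio file not found" error even though the wav exists on disk. A minimal diagnostic sketch, with a hypothetical `diagnose` helper:

```python
import os
import shutil

def diagnose(audio_path: str) -> str:
    """Distinguish a genuinely missing wav file from a missing ffmpeg binary."""
    if not os.path.exists(audio_path):
        return "audio file really is missing"
    if shutil.which("ffmpeg") is None:
        return "ffmpeg is not on PATH (a common cause of this misleading error)"
    return "file and ffmpeg both present; the problem is elsewhere"
```

If this reports ffmpeg missing, installing ffmpeg (as suggested later in this thread) should fix the transcription error.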

wogagr commented 2 months ago

I had the same problem. For me it was a problem with the ffmpeg installation.

```
conda install -c conda-forge ffmpeg
```

fixed it.

kalstein00 commented 2 months ago

> I had the same problem. For me it was a problem with the ffmpeg installation.
>
> `conda install -c conda-forge ffmpeg`
>
> fixed it.

I don't use conda. How can I resolve it then?

kaminoer commented 2 months ago
```shell
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
```
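Whichever installer you use, a quick sanity check is to confirm the binary is actually reachable from the same shell that launches AlwaysReddy. A sketch for POSIX shells (`command -v` is the portable lookup; on Windows cmd the equivalent is `where ffmpeg`):

```shell
# have <cmd> — succeed if <cmd> is reachable on PATH
have() { command -v "$1" >/dev/null 2>&1; }

if have ffmpeg; then
    echo "ffmpeg found: $(command -v ffmpeg)"
else
    echo "ffmpeg missing from PATH"
fi
```

If ffmpeg is installed but still reported missing, restart the terminal (or log out and back in) so the updated PATH is picked up.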

sirPhoebus commented 2 months ago

That did not help me... Anyone else having this issue?

```
f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\areddy\lib\site-packages\pyannote\audio\core\io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.3. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint C:\Users\pscho\.cache\torch\whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.3.0+cu118. Bad things might happen unless you revert torch to 1.x.
Using WhisperX model: tiny and device: cuda
Press 'ctrl + shift + space' to start recording, press again to stop and transcribe. Double tap to give the AI access to read your clipboard.
Press 'ctrl + alt + x' to cancel recording.
Press 'ctrl + alt + f12' to clear the chat history.
use_clipboard: False
Starting recording...
Recording started...
use_clipboard: False
Stopping recording...
Recording saved to audio_files\temp_recording.wav
Transcribing audio file: audio_files\temp_recording.wav
Transcription: Who are you today?
Running TTS: I am your mischievous assistant today. Adding to queue
Error reading file f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\audio_files\tmp0igjac9k.wav: Error opening 'f:\ZONE PROJ\CHAT-TTS\AlwaysReddy\audio_files\tmp0igjac9k.wav': Format not recognised.
```

kaminoer commented 2 months ago

@sirPhoebus Looks like you are facing a different issue. Your transcription is fine; the issue is with TTS. Assuming you are using piper, I'd check if everything is OK with it. Try running a command to check if it's generating audio files properly:

```shell
echo 'Hello. This is a test.' | \
  ./piper --model en_US-amy-medium.onnx --output_file test.wav
```

If it generates the output file and doesn't throw any error, open the wav file and see if it plays the audio.
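Relatedly, since the error above is "Format not recognised", it can also help to inspect the wav header directly. The stdlib `wave` module fails the same way on a truncated or non-RIFF file; a small sketch (not part of AlwaysReddy):

```python
import wave

def check_wav(path: str) -> str:
    """Report whether a file is a readable RIFF/WAVE file."""
    try:
        with wave.open(path, "rb") as w:
            return (f"ok: {w.getnchannels()} ch, "
                    f"{w.getframerate()} Hz, {w.getnframes()} frames")
    except (wave.Error, EOFError):
        return "not a valid wav (header missing, wrong format, or truncated)"
```

If piper's test output passes this check but AlwaysReddy's temp files do not, the temp files are probably being written incompletely or in a different format than the player expects.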

sirPhoebus commented 2 months ago

Thanks @kaminoer!! In the meantime I found the post on alltalk_tts. I tried it... working like a charm and I am very happy with it!! I would advise trying it too ;)

kaminoer commented 2 months ago

Happy to hear alltalk is working for you! I've been using it ever since I opened that issue and I'm also really satisfied with the results. Not as quick as piper, but the improved audio quality makes up for it, imo. If you are running an Nvidia GPU, I recommend enabling DeepSpeed in alltalk to shave a couple of seconds off generation time.

sirPhoebus commented 2 months ago

Oooh wow... thanks for the tip! Will look into it right away!