KoljaB / LocalAIVoiceChat

Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with Coqui XTTS for synthesis.

Coqui engine takes breaks mid-sentence to load. #7

Open tomwarias opened 4 months ago

tomwarias commented 4 months ago

The Coqui engine takes breaks mid-sentence to load. Sometimes it pauses between words or even in the middle of saying a word. I tried adjusting the settings but nothing works. I'm using a computer with an i7 10th gen CPU and an RTX 3060.

KoljaB commented 4 months ago

Your GPU should be fast enough for realtime. Is pytorch installed with CUDA?

tomwarias commented 4 months ago

Yes, I followed every step of the readme. I may have a problem with CUDA because my GPU isn't used by the LLM model either, but I don't know how to solve it. I'm on Windows.

KoljaB commented 4 months ago

I guess pytorch has no CUDA support. Please check with:

import torch
print(torch.cuda.is_available())
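
If that prints True, you can also confirm which CUDA build and which GPU torch actually sees (just an extra diagnostic, not required):

import torch
print(torch.version.cuda)  # CUDA version this torch build was compiled against (None for CPU-only builds)
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no CUDA device visible")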

If it's not available, please try installing the latest torch build with CUDA support:

pip install torch==2.2.0+cu118 torchaudio==2.2.0+cu118 --index-url https://download.pytorch.org/whl/cu118

(may need to adjust 118 to your CUDA version, this is for CUDA 11.8)
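
If you're not sure which CUDA version your driver supports, nvidia-smi prints it in its header line:

nvidia-smi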

To use the GPU for the LLM under Windows you need to compile llama-cpp-python with cuBLAS support:
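
Set the build flags in the same terminal session first (this is the set CMAKE_ARGS=-DLLAMA_CUBLAS=on step referred to below; FORCE_CMAKE=1 is usually set alongside it):

set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1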

After that, install and compile llama-cpp-python with:

pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose

After that you can set n_gpu_layers in the creation parameters of llama.cpp to define how many layers of the LLM neural network should be offloaded to the GPU.
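
A minimal sketch of that parameter (the model filename and layer count are placeholders, adjust them to your model file and VRAM):

from llama_cpp import Llama

# Placeholder model path; use your actual GGUF file.
llm = Llama(
    model_path="zephyr-7b-beta.Q5_K_M.gguf",
    n_gpu_layers=20,  # number of transformer layers to offload to the GPU; -1 offloads all of them
)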

tomwarias commented 4 months ago

I did it and it still does that, and I am also unable to install llama_cpp with those set CMAKE_ARGS=-DLLAMA_CUBLAS=on flags.

KoljaB commented 4 months ago

What's the result of print(torch.cuda.is_available())? Both torch and llama.cpp have to run with CUDA (GPU supported) to achieve realtime speed.

The above installation approach for llama.cpp works on my Windows 10 system; if it fails on yours, I'm not sure how much further support I can offer. llama.cpp is not my library and this can be a complex issue.