Mozer / talk-llama-fast

Port of OpenAI's Whisper model in C/C++ with xtts and wav2lip
MIT License
704 stars 62 forks source link

Is online AI models a possibility via OpenRouter ? #23

Open alex77pp77 opened 2 months ago

alex77pp77 commented 2 months ago

Your work is absolutely amazing, I love it. The problem is 8b models are too much limited. Would it be possible to include a possibility to connect with any online models with OpenRouter for example ? Llama 70B is very fast to answer and it's a lot better than the 8b model ...

Thanks for the great work, very appreciated.

Mozer commented 2 months ago

You always can code it yourself.

As a workaround - you can use openRouter in SillyTavern + my wav2lip. It won't have speech to text, but it will be showing video replies:

You need to run both xtts_wav2lip.bat and my modified silly_extras to make it work with Silly tavern. Turn off streaming for openrouter and for XTTS in SillyTavern. Streaming is not yet supported in ST for wav2lip.

alex77pp77 commented 2 months ago

You always can code it yourself.

As a workaround - you can use openRouter in SillyTavern + my wav2lip. It won't have speech to text, but it will be showing video replies:

You need to run both xtts_wav2lip.bat and my modified silly_extras to make it work with Silly tavern. Turn off streaming for openrouter and for XTTS in SillyTavern. Streaming is not yet supported in ST for wav2lip.

Thank's for the answer, I've been trying to code it with the help of chat GPT-4 since a few days but I can't get the speed you've got to generate the TTS. I've been using the xtts-api-server and I've tried with your SillyTavern + wavelip set up, but it's still slow. I just code with python, not with C. Also I've got trouble to interrupt the TTS when it's playing, I can't get it to stop immediately ...

Mozer commented 2 months ago

You need to programmatically put 0 into temp/xtts_play_allowed.txt to stop it. And then 1 when it is ok to speak again.

alex77pp77 commented 2 months ago

You need to programmatically put 0 into temp/xtts_play_allowed.txt to stop it. And then 1 when it is ok to speak again.

Ok, thank you