Almost there, just TTS with alltalk_tts left!

semperai / amica

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

MIT License


Hello guys! First of all, thank you for this integrated software, which has really been thought through and is so modular! GREAT JOB!!! :smiley:

Context

Ubuntu 24.04.1 LTS
Python 3.12.3

Goal

I am trying to achieve a complete local installation with:

OK ChatBot: llama.cpp (installed following the original GitHub instructions, because Amica's instructions are mostly outdated)
OK STT: whisper.cpp (installed following the original GitHub instructions, because Amica's instructions are mostly outdated)
WIP TTS: alltalk_tts

Symptoms

a) Basically, lipsync.ts has issues processing the stream properly:

2024-11-09_14-18 Amica error message when getting audio back

2024-11-09_16-35 Browser debugging tools

b) All of the user's text/voice input lands in a single chat bubble instead of separate bubbles, and Amica's answers are missing:

2024-11-09_14-17 Amica no messages written and played back

Debugging so far with alltalk_tts

KO: alltalk_tts installation according to Amica's instructions. Result: errors at launch, PortAudio not recognized although it is installed; Amica's input is sent to the TTS but not spoken.

2024-11-09_15-09 Amica-specific alltalk_tts conda installation

KO: alltalk_tts v1.9c (main) installation according to the alltalk_tts author's instructions has the same issues:

2024-11-09_15-38 alltalk_tts_v1 Receives input but no output sound

WIP: However, after I debugged directly with the alltalk_tts author, he updated alltalk_tts v2 beta and made a specific wiki page for Amica. As a result, no more errors appear after launching alltalk_tts v2 beta, and Amica's output is processed by the TTS in both text and audio form:

2024-11-09_13-57 alltalk_tts_v2_beta output

I tinkered a bit with alltalk_tts API global settings, but with no success:

2024-11-09_15-18 alltalk_tts API Settings tweaks

That's about how far I can go with my skills. Could you please have a look and tell me what I'm missing?

Hi @CodeShadower, please see this PR I made (for Amica; this is not an update to AllTalk): https://github.com/semperai/amica/pull/141

You can manually download the files if you wish and replace the existing ones; they are all under the src directory:

https://github.com/erew123/amica/blob/master/src/features/localXTTS/localXTTS.ts
https://github.com/erew123/amica/blob/master/src/components/settings/LocalXTTSSettingsPage.tsx
https://github.com/erew123/amica/blob/master/src/utils/config.ts

Make sure you set AllTalk V2 to use the AllTalk V2 API and select it in the Amica interface as AllTalk V2 API Protocol. You will now have a few more features available and audio is pulled to your browser, rather than playing in the console/terminal.
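For anyone curious what "audio is pulled to your browser" could look like on the client side, here is a minimal sketch. It is not the code from the PR. The base address, the `/api/tts-generate` path, the form field names, and the `output_file_url` response field are assumptions for illustration and should be checked against the AllTalk V2 API documentation; `TtsRequest`, `buildTtsForm`, `resolveAudioUrl`, and `fetchSpeech` are hypothetical names.

```typescript
// Sketch only: the endpoint path, field names, and response shape below are
// assumptions, not confirmed by this thread. Verify against the AllTalk V2 docs.

interface TtsRequest {
  text: string;
  voice: string; // hypothetical voice file name, e.g. "female_01.wav"
  language?: string;
}

interface TtsResponse {
  status: string; // assumed to be "generate-success" on success
  output_file_url: string; // assumed: relative or absolute URL of the audio
}

const ALLTALK_BASE_URL = "http://127.0.0.1:7851"; // assumed local address

// Build the form body. autoplay=false asks the server not to play the audio
// itself, so the client can fetch it and play it in the browser instead.
function buildTtsForm(req: TtsRequest): URLSearchParams {
  const form = new URLSearchParams();
  form.set("text_input", req.text);
  form.set("character_voice_gen", req.voice);
  form.set("language", req.language ?? "en");
  form.set("autoplay", "false");
  return form;
}

// Turn the server's answer into a URL the browser can fetch.
// new URL() leaves absolute URLs alone and resolves relative ones.
function resolveAudioUrl(baseUrl: string, res: TtsResponse): string {
  if (res.status !== "generate-success") {
    throw new Error(`TTS generation failed with status: ${res.status}`);
  }
  return new URL(res.output_file_url, baseUrl).toString();
}

// Request speech, then download the generated audio as raw bytes.
async function fetchSpeech(
  req: TtsRequest,
  baseUrl: string = ALLTALK_BASE_URL,
): Promise<ArrayBuffer> {
  const genRes = await fetch(`${baseUrl}/api/tts-generate`, {
    method: "POST",
    body: buildTtsForm(req),
  });
  if (!genRes.ok) {
    throw new Error(`TTS request failed: HTTP ${genRes.status}`);
  }
  const json = (await genRes.json()) as TtsResponse;
  const audioRes = await fetch(resolveAudioUrl(baseUrl, json));
  if (!audioRes.ok) {
    throw new Error(`Audio download failed: HTTP ${audioRes.status}`);
  }
  return audioRes.arrayBuffer();
}
```

The returned `ArrayBuffer` could then be handed to whatever decodes and plays audio on the client. Keeping the form building and URL resolution as separate pure functions makes them easy to check without a running TTS server.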

As long as you can generate TTS within AllTalk V2 on the Gradio generation page, things should work.

Thanks

Almost there, just TTS with alltalk_tts left! #140