Open ther3zz opened 7 months ago
@daswer123 Could you please estimate the complexity of implementing this endpoint in your product, and specify where to do it? I can try to implement it with DeepSeek Coder in PyCharm.
Regarding the last endpoint in this API: it is the only one needed by me and the Home Assistant community. Home Assistant is a smart home product with a handy assistant pipeline that requires STT, LLM, function calling, and TTS, preferably behind an OpenAI-compatible API. STT: whisper.cpp. LLM: llama.cpp. TTS: ??? (alltalk_tts has big overhead, LocalAI is buggy, Silero is deprecated).
`/process?INPUT_TEXT=..text..&INPUT_TYPE=TEXT&LOCALE=[locale]&VOICE=[name]&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE` - Processes the text and returns a WAV file. We can probably ignore INPUT_TYPE, OUTPUT_TYPE, and AUDIO, as I've never seen any program use a different setting.
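To make the shape of the request concrete, here is a minimal sketch of such a MaryTTS-style `/process` endpoint using only the Python standard library. The `synthesize()` function is a placeholder (it emits 0.1 s of silence as a valid WAV); a real server would call its TTS backend there. Port 59125 is MaryTTS's usual default; everything else is an illustrative assumption, not code from any of the projects mentioned.

```python
import io
import wave
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def synthesize(text: str, locale: str, voice: str) -> bytes:
    # Placeholder backend: returns 0.1 s of silent 16-bit mono WAV.
    # A real server would run its TTS engine here instead.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(22050)
        w.writeframes(b"\x00\x00" * 2205)
    return buf.getvalue()

class MaryCompatHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/process":
            self.send_error(404)
            return
        q = parse_qs(url.query)
        text = q.get("INPUT_TEXT", [""])[0]
        locale = q.get("LOCALE", ["en_US"])[0]
        voice = q.get("VOICE", ["default"])[0]
        # INPUT_TYPE, OUTPUT_TYPE and AUDIO are accepted but ignored,
        # since clients only ever send TEXT / AUDIO / WAVE_FILE.
        wav = synthesize(text, locale, voice)
        self.send_response(200)
        self.send_header("Content-Type", "audio/wav")
        self.send_header("Content-Length", str(len(wav)))
        self.end_headers()
        self.wfile.write(wav)

# To run: HTTPServer(("0.0.0.0", 59125), MaryCompatHandler).serve_forever()
```

A GET to `/process?INPUT_TEXT=hello&LOCALE=en_US&VOICE=default&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE` then returns `audio/wav` bytes, which is all the Home Assistant MaryTTS client expects.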
@neowisard So there's a different project (openedai-speech) which works well with this Home Assistant HACS integration, an openai_tts fork.
If you want to be able to type your own model/voice values, take a look at this openai_tts PR.
Thx !
Just to be clear, I have tested both APIs, and despite the fact that they use almost identical engines and models,
xtts-api-server with DeepSpeed (12-13 s) on my Tesla P40 is slightly faster than openedai-speech with DeepSpeed enabled (18-21 s).
Both also have some memory leak (on vGPU). I just forked and tweaked it for myself; it is in my repo.
Hello,
Would it be possible to add MaryTTS compatibility to this (similar to what coqui-tts has)?
The specific intent here is to provide compatibility with Home Assistant.
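For context on the Home Assistant side: once a server exposes a MaryTTS-compatible `/process` endpoint, Home Assistant's built-in `marytts` TTS platform can be pointed at it in `configuration.yaml`. The host, port, and voice values below are placeholders for your own setup:

```yaml
# configuration.yaml (values are examples, not defaults for any specific server)
tts:
  - platform: marytts
    host: 192.168.1.50   # machine running the MaryTTS-compatible server
    port: 59125          # MaryTTS's conventional default port
    voice: default
```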