Pedal-Intelligence / saypi-userscript

An independent voice interface for Inflection AI's conversational assistant, Pi
https://www.saypi.ai/

Multilingual Speech Output #60

Closed: rosscado closed this issue 3 days ago

rosscado commented 5 months ago

Pi is capable of generating text responses in non-English languages. However, when Pi reads those text responses aloud, it speaks with an English accent, whatever the language of the text.

This appears to be due to Inflection using one of ElevenLabs' English-only voice synthesis models.

[Image: ElevenLabs English TTS models]

Either of ElevenLabs' multilingual models gives far better spoken results on the same text input.

[Image: ElevenLabs multilingual TTS models]

Proposal: attempt to override Pi's voice synthesis for non-English languages only, and substitute our own speech generated with a multilingual TTS model.
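To make the "non-English only" rule concrete, here is a minimal sketch of the decision step. It assumes the language of the response is already known as a BCP 47 style tag (how that tag is obtained is not covered in this issue), and the names `chooseSpeechSource` and `SpeechSource` are hypothetical, not taken from the saypi-userscript code.

```typescript
// Hypothetical labels for the two ways a response could be spoken.
type SpeechSource = "pi" | "custom-multilingual";

// Decide which speech source to use for a response.
// Assumption: `languageTag` is a BCP 47 style tag such as "en-US" or "fr".
function chooseSpeechSource(languageTag: string): SpeechSource {
  // Compare only the primary language subtag, so "en-GB" and "en_US" count as English.
  const primary = languageTag.trim().toLowerCase().split(/[-_]/)[0];

  // Unknown or English text keeps Pi's own voice; anything else is overridden.
  if (primary === "" || primary === "en") {
    return "pi";
  }
  return "custom-multilingual";
}

console.log(chooseSpeechSource("en-US"));
console.log(chooseSpeechSource("fr"));
```

Falling back to Pi's own voice when the language is unknown keeps the override conservative, which matches the intent of only replacing speech for non-English text.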

rosscado commented 5 months ago

A voice synthesis request by Inflection looks like this, with an audio/mpeg response type:

GET https://pi.ai/api/chat/voice?mode=eager&voice=voice4&messageSid=PzSgpCg8qxYFcRZsVmw2X
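As a rough illustration of how such a request could be recognised before deciding whether to override it, the sketch below parses the example URL with the standard `URL` class. The host, path, and parameter names come from the request above; the `VoiceRequest` type and `parseVoiceRequest` function are hypothetical, and the meaning of each parameter is not documented in this issue.

```typescript
// Hypothetical container for the query parameters seen in the example request.
interface VoiceRequest {
  mode: string | null;
  voice: string | null;
  messageSid: string | null;
}

// Returns the parsed parameters, or null if the URL is not a Pi voice request.
function parseVoiceRequest(rawUrl: string): VoiceRequest | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return null; // not a valid absolute URL
  }
  // Match only the endpoint observed in this issue.
  if (url.hostname !== "pi.ai" || url.pathname !== "/api/chat/voice") {
    return null;
  }
  return {
    mode: url.searchParams.get("mode"),
    voice: url.searchParams.get("voice"),
    messageSid: url.searchParams.get("messageSid"),
  };
}

const example = parseVoiceRequest(
  "https://pi.ai/api/chat/voice?mode=eager&voice=voice4&messageSid=PzSgpCg8qxYFcRZsVmw2X"
);
console.log(example);
```

The `messageSid` value is presumably what ties the audio to a specific text message, so it would be the natural key for looking up the text to re-synthesise, though that is an assumption.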

rosscado commented 3 days ago

Closed with the release of multilingual voices in v1.6.0. 🎉