KoljaB / RealtimeTTS

Converts text to speech in realtime

stop automatic play option in elevenlabs #75

Open GayaaniD opened 2 months ago

GayaaniD commented 2 months ago

I am working on implementing a voice chatbot. For text-to-speech conversion I am using the RealtimeTTS library with the Elevenlabs engine. I don't want the audio to play automatically, so I set the muted parameter of the play function to True, but the audio still plays. This doesn't happen with OpenAIEngine, only with ElevenlabsEngine. Why? My code is below for reference:

from RealtimeTTS import TextToAudioStream, ElevenlabsEngine
from dotenv import load_dotenv
load_dotenv()
import time
import datetime
import os

text = "How can i help you?"
elevenlabs_api_key = os.getenv("ELEVENLABS_API_KEY")

def TTS_openai(text):
    engine = ElevenlabsEngine(elevenlabs_api_key, model='eleven_turbo_v2')
    voice = 'Liam'
    engine.set_voice(voice)
    stream = TextToAudioStream(engine)
    stream.feed(text)
    file_path = datetime.datetime.now().strftime("%Y%m%d%H%M%S") + "_speech.webm"
    # muted=True should write the audio to file_path without playing it
    stream.play(output_wavfile=file_path, muted=True, fast_sentence_fragment=True)
    with open(file_path, "rb") as audio_file:
        binary_data = audio_file.read()
    return binary_data

if __name__ == "__main__":
    TTS_openai(text)

KoljaB commented 2 months ago

Bug, thanks for reporting. I'm currently not updating the lib. Please replace

                    self.mpv_process.stdin.write(chunk)
                    self.mpv_process.stdin.flush()

with

                    if not self.muted:
                        self.mpv_process.stdin.write(chunk)
                        self.mpv_process.stdin.flush()

in elevenlabs_engine.py and it should work.
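
To illustrate the idea behind this guard, here is a minimal, self-contained sketch. The `ChunkPlayer` class, its `collected` buffer, and `handle_chunk` are hypothetical names made up for this example and are not RealtimeTTS code. It only assumes that audio arrives in chunks and that the player is something with `write` and `flush`, such as the stdin pipe of an mpv subprocess:

```python
import io


class ChunkPlayer:
    """Hypothetical stand-in for an engine's playback loop (not RealtimeTTS code)."""

    def __init__(self, player_stdin, muted=False):
        # player_stdin: any writable binary stream, e.g. an mpv process's stdin
        self.player_stdin = player_stdin
        self.muted = muted
        # Assumption for this sketch: chunks are also kept for saving to a file
        self.collected = bytearray()

    def set_muted(self, muted):
        self.muted = muted

    def handle_chunk(self, chunk):
        # Always keep the audio data so it can still be written out later
        self.collected.extend(chunk)
        # Only forward the chunk to the player when playback is not muted
        if not self.muted:
            self.player_stdin.write(chunk)
            self.player_stdin.flush()


# Usage: an in-memory stream plays the role of the player process
sink = io.BytesIO()
player = ChunkPlayer(sink, muted=True)
player.handle_chunk(b"abc")   # collected, but not sent to the player
player.set_muted(False)
player.handle_chunk(b"def")   # collected and sent to the player
```

With `muted=True` the chunk never reaches the player, which is the behavior the one-line guard is meant to produce.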

GayaaniD commented 2 months ago

Thank you for your response. I have created a branch, gayaani/realtimetts-refactor, and added the changes. Can you merge it?

GayaaniD commented 2 months ago

I think this change didn't work. After you merged it, the audio still plays.

KoljaB commented 1 month ago

Sorry, I missed the set_muted method in the engine. It should work with version 0.3.45 now.

GayaaniD commented 1 month ago

Sorry, after I installed version 0.3.45 it still plays the audio automatically. Do I need to set any parameters elsewhere?

KoljaB commented 1 month ago

Just tested it with 0.3.45.

engine = ElevenlabsEngine(os.environ.get("ELEVENLABS_API_KEY"))
TextToAudioStream(engine).feed(dummy_generator()).play()

Normal audio playout.

engine = ElevenlabsEngine(os.environ.get("ELEVENLABS_API_KEY"))
TextToAudioStream(engine).feed(dummy_generator()).play(muted=True, output_wavfile="test.wav")

With muted=True it plays out silently into test.wav here; nothing to hear.

So you did "pip install RealtimeTTS==0.3.45" and it still plays the audio over the stereo device?

GayaaniD commented 1 month ago

Sorry, it's working now. But why does the conversion take so long when I run it locally? Are there specific system specifications that would speed it up?

KoljaB commented 1 month ago

It's a realtime lib. Writing to a file uses the same playout logic, so this is basically 1:1 realtime too. output_wavfile is a special use case that I won't optimize for (it would make the code complicated and unclean). You could use each engine's Python library directly to do this as fast as the engine can synthesize.

GayaaniD commented 1 month ago

So you mean it is better to use the elevenlabs or openai library directly for this requirement, right?

KoljaB commented 1 month ago

Yes. For only synthesizing into a file I'd use the APIs directly, because it's faster. The output_wavfile parameter was only integrated to verify synthesis.
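
A rough sketch of what "using the API directly" could look like, using only the Python standard library. It assumes the ElevenLabs REST text-to-speech endpoint (`POST /v1/text-to-speech/{voice_id}` with an `xi-api-key` header) as I understand it, so please check the current ElevenLabs docs before relying on it. The helper names `build_tts_request` and `synthesize_to_file` are made up for this example, and `YOUR_VOICE_ID` is a placeholder: the endpoint expects a voice ID rather than a voice name like 'Liam'.

```python
import json
import urllib.request

# Assumed endpoint of the ElevenLabs REST API; verify against the official docs
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text, voice_id, api_key, model_id="eleven_turbo_v2"):
    """Build (but do not send) the HTTP request for one synthesis call."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL.format(voice_id=voice_id),
        data=body,
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize_to_file(text, voice_id, api_key, file_path):
    """Send the request and write the returned audio bytes to file_path."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request) as response, open(file_path, "wb") as out:
        out.write(response.read())
    return file_path


# Example call (needs network access and a valid key, so it is left commented out):
# import os
# synthesize_to_file("How can i help you?", "YOUR_VOICE_ID",
#                    os.environ["ELEVENLABS_API_KEY"], "speech.mp3")
```

Because nothing is played back, the call takes only as long as the service needs to synthesize and return the audio, instead of the 1:1 realtime duration of the playout path. The official elevenlabs and openai Python packages wrap the same kind of request.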

GayaaniD commented 1 month ago

Thank you.