nateshmbhat / pyttsx3

Offline Text To Speech synthesis for python
Mozilla Public License 2.0
2.12k stars 333 forks

How to output voice non-blocking without having to wait for runAndWait() #67

Closed ccarstens closed 5 years ago

ccarstens commented 5 years ago

pyttsx3 Version: 2.71 macOS: 10.14.6

I am writing a voice assistant that has to output voice and then immediately listen to the user. The assistant is based on Spade, which uses asyncio.

When I output voice using say("sth") and runAndWait(), there is quite a long wait for runAndWait() to finish (about 1100 ms). I would like to output speech and, as soon as the output is finished, immediately continue with other things, such as listening to what the user says.

When I use startLoop(), I can't continue with anything until I call endLoop(), which also delays everything.

I don't really understand how startLoop(False) would help me. Executing engine.iterate() myself works, and I receive speech output, but there too I don't know how to continue with other operations after the speech output has finished.

How would I set up pyttsx3 so that I can output speech at any time without a delay afterwards?

I tried placing endLoop() in a separate daemon thread so I don't have to wait for it to finish. But in that case say() never returns.

Thank you for your help! Here is my code:

output.py

import pyttsx3
from ava.utterance import Utterance
from log import log_output as log
import threading

class Output:
    def __init__(self):
        self.synthesizer = pyttsx3.init(debug=True)
        # Pick the first installed voice whose name contains 'Tracy'
        self.voice = next(v for v in self.synthesizer.getProperty('voices')
                          if 'Tracy' in v.name)
        self.synthesizer.setProperty('voice', self.voice.id)
        self.synthesizer.setProperty('rate', 175)

        self.setup_callbacks()

    def setup_callbacks(self):
        self.synthesizer.connect('finished-utterance', self.on_finished_utterance)
        self.synthesizer.connect('started-word', self.on_started_word)

    def speak(self, utterance: Utterance):
        self.synthesizer.say(utterance.body, utterance.name)
        self.synthesizer.startLoop()  # blocks until endLoop() is called

    def on_finished_utterance(self, name, completed):
        log.debug("END")
        # Attempt to end the loop from a daemon thread so speak() can return
        t = threading.Thread(name="ccpyttsx3", target=self.killme,
                             args=(self.synthesizer,), daemon=True)
        t.start()

    def on_started_word(self, name, location, length):
        pass

    def killme(self, synth):
        print("killme")
        synth.endLoop()
ccarstens commented 5 years ago

I solved my own problem by putting pyttsx3 and the speech recognition in their own process while using the 'finished-utterance' callback for initiating the speech recognition.
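For anyone who wants the shape of that approach without digging through the full project, here is a minimal sketch (not the actual Ava code; the function names, the None sentinel, and the make_speaker test hook are assumptions):

import multiprocessing as mp

def tts_worker(queue, make_speaker=None):
    """Consume utterances from `queue` until a None sentinel arrives.

    `make_speaker` is a hook so the loop can be exercised without audio;
    by default it builds a pyttsx3 engine inside the child process.
    """
    if make_speaker is None:
        import pyttsx3  # imported here so the engine lives in this process
        engine = pyttsx3.init()

        def speak(text):
            engine.say(text)
            engine.runAndWait()  # blocks only the worker process
    else:
        speak = make_speaker()

    while True:
        text = queue.get()       # block until the main process sends text
        if text is None:         # sentinel: shut the worker down
            break
        speak(text)

if __name__ == "__main__":
    queue = mp.Queue()
    worker = mp.Process(target=tts_worker, args=(queue,), daemon=True)
    worker.start()

    queue.put("Hello, speaking without blocking the main process.")
    # ...the main process is free to run speech recognition here...

    queue.put(None)              # tell the worker to exit
    worker.join()

Because runAndWait() only ever blocks the child process, the main process can keep listening while speech is playing, which is the core of the fix described above.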

raffals commented 4 years ago

Can you please post your code with pyttsx3 in its own process?

ccarstens commented 4 years ago

Hi @raffals you can check this out: https://github.com/ccarstens/Ava/blob/dev/ava/environment.py https://github.com/ccarstens/Ava/blob/dev/ava/iocontroller.py

In environment.py the process is set up, and in iocontroller.py the strings to be synthesised with pyttsx3 are received via the Queue.

raffals commented 4 years ago

Thanks, your offering this is much appreciated! Looks like some beautiful code and classes you’ve set up. As I’m new to Python, though, this currently looks over my head.

My issue was that after pyttsx3 said something via runAndWait(), it crashed the app with "Fatal Python error: PyEval_RestoreThread: NULL tstate".

As I thought the issue was a conflict with the wxPython GUI, I swapped it out for PySimpleGUI. But given that I still saw the same behavior, I’m now thinking it’s actually due to a bug in the macOS version of pyobjc/PyObjCTools.

So my thought was that I’d need to avoid runAndWait() to use pyttsx3 effectively. I accomplished that by replacing the runAndWait() call with a single startLoop(False) right after init, and then using say() and iterate() each time there’s something I want spoken. Much faster than runAndWait(), too!
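In case it helps anyone landing here, that externally driven pattern looks roughly like this (a minimal sketch, assuming a working TTS driver is installed; the speak() helper name is my own):

import pyttsx3

engine = pyttsx3.init()
engine.startLoop(False)      # start the event loop without blocking

def speak(text):
    # Queue the utterance, then pump the engine's event loop once
    engine.say(text)
    engine.iterate()

speak("First utterance")
# ...poll the microphone, update a GUI, etc., between calls...
speak("Second utterance")

engine.endLoop()             # stop the loop when the program shuts down

The trade-off versus runAndWait() is that you take responsibility for pumping the loop yourself, so nothing blocks between utterances.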

I may circle back around when I have more experience and feel ready to try to decipher your code, as the Queue/IOcontroller route is a much more elaborate process. In the meantime, I’ve also found gTTS sounds even nicer than pyttsx3 (with the drawbacks of requiring online access and writing to an .mp3 file).


ccarstens commented 4 years ago

> So, my thought was that I’d need to avoid runAndWait() to use pyttsx3 effectively, which was accomplished by just replacing the runAndWait() call with calls to startLoop(False) just after init, and then using say() and iterate() each time there’s something I want spoken. Much faster than runAndWait(), too!

I'd also be interested in finding a more performant way of using speech synthesis. If I remember correctly, I still had to account for delays. Feel free to share your code using gTTS!
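Not raffals's code, but for anyone curious, basic gTTS usage looks roughly like this (a sketch: it requires internet access, and the playback command is platform-specific and assumed here; afplay is the macOS player, mpg123 is one Linux option):

import os
import subprocess
import tempfile

from gtts import gTTS

def speak_online(text):
    # Synthesize speech via Google's TTS service (requires network access)
    tts = gTTS(text=text, lang="en")
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as f:
        path = f.name
    tts.save(path)                        # write the synthesized .mp3 to disk
    try:
        subprocess.run(["afplay", path])  # swap for your platform's player
    finally:
        os.remove(path)

speak_online() blocks while the file plays, so the drawbacks mentioned above (network round-trip plus writing an .mp3) apply; the same separate-process trick from earlier in this thread would make it non-blocking.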

AlluDaddy commented 3 years ago

Sir, I am facing a delayed reply from my assistant. Could you please give me a suggestion for a faster reply?

import pyttsx3
import speech_recognition as sr
import pyaudio
import pywhatkit
import datetime
import wikipedia

listener = sr.Recognizer()

engine = pyttsx3.init()
engine.say("Hello I am siri")
engine.runAndWait()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)
volume = engine.getProperty('volume')
engine.setProperty('volume', 10.0)
rate = engine.getProperty('rate')
engine.setProperty('rate', rate + 25)

def talk(command):
    engine.say(command)
    engine.runAndWait()

def taking_cmd():
    command = ""
    try:
        with sr.Microphone() as source:
            print('Listening......')
            voice = listener.listen(source)
            command = listener.recognize_google(voice)
            command = command.lower()
            print("ok")
            if "siri" in command:
                command = command.replace("siri", "")
                print(command)
                talk(command)
    except:
        pass
    return command

def run():
    command = taking_cmd()
    if 'play' in command:
        command = command.replace("play", "")
        talk("playing" + command)
        pywhatkit.playonyt(command)
    elif 'time' in command:
        time = datetime.datetime.now().strftime('%I:%M %p')
        talk("Now the time is" + time)
    elif 'who the heck ' in command:
        wiki = command.replace('who the heck is', "")
        info = wikipedia.summary(wiki, 1)
        print(info)
        talk(info)
    else:
        talk("sorry i cant understand please repeat it again")

run()
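One likely source of the delay in code like the above is the recognizer waiting for a long stretch of silence after each phrase, with no bound on how long listen() may run. A hedged sketch of settings that usually tighten response time (the exact values here are assumptions you would tune for your microphone and room):

import speech_recognition as sr

listener = sr.Recognizer()
listener.pause_threshold = 0.5             # end the phrase after 0.5 s of silence (default is 0.8)
listener.dynamic_energy_threshold = False  # skip per-call recalibration of mic sensitivity
listener.energy_threshold = 300            # fixed sensitivity; tune for your environment

def taking_cmd_fast():
    with sr.Microphone() as source:
        # Bound both the wait for speech to start and the phrase length,
        # so listen() returns promptly instead of running open-ended
        voice = listener.listen(source, timeout=5, phrase_time_limit=6)
    return listener.recognize_google(voice).lower()

Note that recognize_google() still makes a network round-trip, so some latency remains regardless of the listening settings.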