evancohen / smart-mirror

The fairest of them all. A DIY voice controlled smart mirror with IoT integration.
http://smart-mirror.io

Text-To-Speech #386

Open decentralgabe opened 7 years ago

decentralgabe commented 7 years ago

Continuing what I was working on in #280.

I have read what you (Evan) wrote about TTS in #78, and I agree that a spoken response to most commands would be useless. However, when the user asks "what time is it?", I think it would be a nice feature if the mirror spoke the time aloud. The "what time is it" command should also be fixed so that it displays the time on the mirror even when other services are open. There are other cases where TTS could be useful down the line: speaking results from the Wolfram Alpha API, announcing sports scores, timed reminders, and so on.

I had been looking into the SpeechSynthesis API; however, it appears that Google dropped support for the API within embedded/dev Chromium environments a month or so ago. The problem isn't with the SpeechSynthesis commands themselves (they are available within the mirror); rather, there are no voices available for the API to work with, as shown by a call to speechSynthesis.getVoices(). Here's a demo of the API in action using Angular.
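
To make the missing-voices check concrete, here is a minimal sketch that could be pasted into the mirror's developer console. `describeVoices` is a hypothetical helper name, not part of the smart-mirror code; it relies only on the standard `getVoices()` call mentioned above. Chromium may populate the voice list asynchronously, so an empty result immediately after page load is not conclusive on its own.

```javascript
// Hypothetical helper: summarises what a SpeechSynthesis-like object offers.
// `synth` is expected to be window.speechSynthesis when run in the browser.
function describeVoices(synth) {
  // The API object itself may be missing in some environments.
  if (!synth || typeof synth.getVoices !== 'function') {
    return { supported: false, voices: [] };
  }
  // Each voice exposes a `name` and a BCP 47 `lang` tag.
  const voices = synth.getVoices().map((v) => `${v.name} (${v.lang})`);
  return { supported: true, voices };
}

// Usage in the developer console:
//   describeVoices(window.speechSynthesis);
```

A result with `supported: true` and an empty `voices` array would match the behaviour described here: the commands exist, but there is nothing for them to speak with.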

I also looked into a few other users' projects that make use of the Speech API, including Artyom and Speech-Synthesis; they do not work in the mirror either.

What I will attempt next is to work with a few of the Linux-specific TTS solutions outlined here.

Edit: It might be worth noting that SpeechSynthesisUtterances work when I run the mirror code on my Mac, but do not work on the RPi.

decentralgabe commented 7 years ago

I've been working on getting Festival to speak commands. It works from the terminal and would work through Vocal; however, ALSA does not allow multiple sound sources (the mirror and Festival) to play at the same time.
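
For reference, the terminal invocation can be driven from Node roughly as follows. This is a sketch, assuming Festival is installed and on the PATH; `speakWithFestival` is a hypothetical helper, and it does nothing about the ALSA device-sharing problem described above.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: pipes `text` into `festival --tts`, the equivalent of
// running `echo "text" | festival --tts` in a terminal.
// `spawnFn` is injectable only so the function can be tried without Festival.
function speakWithFestival(text, spawnFn = spawn) {
  const child = spawnFn('festival', ['--tts']);
  // An unhandled 'error' event (e.g. binary not found) would throw.
  child.on('error', (err) => {
    console.error('Could not start festival:', err.message);
  });
  // Swallow pipe errors in case the child exits before reading its input.
  child.stdin.on('error', () => {});
  child.stdin.write(text);
  child.stdin.end();
  return child;
}

// Usage:
//   speakWithFestival('Hello from the mirror');
```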

I've been playing around with the .asoundrc file to see if I can get DMIX to work, but I don't understand this stuff too well yet.

By the way, here's the DMIX guide I've been trying to figure out.

evancohen commented 7 years ago

Sorry for the delay in responding here! You've answered your own question: DMIX is the way to go. It would be helpful to know what issues you've run into while trying to figure it out.

Just taking a stab in the dark here, but if you are using the provided ~/.asoundrc file then you should just be able to update the plugged pcm device, for instance:

pcm.pluged {
    type dmix
    ipc_key 1024
    # This is your output device
    slave.pcm "hw:0,1"
}

decentralgabe commented 7 years ago

Edit: here is the working asoundrc:

pcm.!default {
      type asym
      playback.pcm {
          type plug
          # This is your output device
          slave.pcm "hw:0,0"
      }
      capture.pcm {
          type plug
          # Input device
          slave.pcm "hw:1,0"
      }
}

evancohen commented 7 years ago

With the updated speech recognition work that I just completed, my recommendation would be to use Say: https://github.com/Marak/say.js

Troublesome audio configuration should be a thing of the past (I still have to verify this though).
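
As an illustration of how Say could cover the "what time is it?" case from the original post, here is a rough sketch. `timePhrase` and `speakTime` are hypothetical helpers, not part of smart-mirror; the only library call used is `say.speak(text, voice, speed, callback)` from say.js, and the `say` module is passed in as an argument so the formatting logic can be tried without audio hardware.

```javascript
// Hypothetical helper: turns a Date into a short phrase in 12-hour format,
// using the Date's local time.
function timePhrase(date) {
  const hours24 = date.getHours();
  const minutes = String(date.getMinutes()).padStart(2, '0');
  const period = hours24 < 12 ? 'AM' : 'PM';
  // Map 0 and 12 to 12, and 13..23 to 1..11.
  const hours12 = hours24 % 12 === 0 ? 12 : hours24 % 12;
  return `It is ${hours12}:${minutes} ${period}`;
}

// Hypothetical helper: speaks the time through a say.js-style object.
// Passing null for voice and speed leaves the choice to say.js defaults.
function speakTime(say, date = new Date()) {
  const phrase = timePhrase(date);
  say.speak(phrase, null, null, (err) => {
    if (err) console.error('TTS failed:', err);
  });
  return phrase;
}

// Usage (assumes `npm install say`):
//   const say = require('say');
//   speakTime(say);
```

Keeping the phrase formatting separate from the speech call means the same text could also be shown on the mirror, in line with the request that the time still be displayed.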

decentralgabe commented 7 years ago

I tried Say back when I started down this rabbit hole. I just tried it again with the updated mirror code, and it produces no sound and no error messages. I'm guessing there's still a dmix issue.

evancohen commented 7 years ago

@glcohen, have you followed the updated documentation? I tried Say with Sonus a few weeks ago and it worked without a hitch.

decentralgabe commented 7 years ago

Updated asoundrc worked!

Should I submit a pull request?

justbill2020 commented 7 years ago

What's the status on this one? @glcohen, were you able to submit a pull request to the dev branch? Can this and #280 be closed? Also, I'm concerned that the asoundrc info above could lead people down the wrong troubleshooting path. Can you please update it with the asoundrc file that is working for you?

decentralgabe commented 7 years ago

Just submitted a pull request here. #280 can be closed.