Feature: IBM Bluemix Text To Speech

thecodingwizard commented 8 years ago

Feature request: TTS service from IBM Bluemix.

Check out the demo. It has great features like SSML and different voices, expressions etc.

I got the basic prototype working. Check it out: https://github.com/thecodingwizard/mycroft-core/tree/features/ibm-tts.

There are a few basic steps needed to get it working:

Git clone my repository: https://github.com/thecodingwizard/mycroft-core/tree/features/ibm-tts
Install watson-developer-sdk. The following worked for me (Ubuntu 64 bit running in a VM):
- CD into the virtualenv directory: cd ~/.virtualenvs/mycroft/
- Pip install: bin/pip install watson-developer-cloud
Make some configuration changes. (EDIT: if you git cloned my repository, the changes should already be in place. However, make sure to edit the username and password!)
- CD into the configuration directory: cd /path/to/mycroft-core/mycroft/configuration/
- Edit the file with the following changes:

Before:

[tts]
module = "mimic"
mimic.voice = "mycroft/tts/mycroft_voice_4.0.flitevox"
espeak.lang = "english-us"
espeak.voice = "m1"

After:

[tts]
module = "ibmtts"
ibmtts.voice = "en-US_AllisonVoice"
ibmtts.username = "" # REPLACE #
ibmtts.password = "" # REPLACE #
ibmtts.timeout = 20

Get the credentials:
- Head over to bluemix.net, sign up for an account, go here, and create a TTS service. Then find the credentials (Dashboard -> TTS Service -> Service Credentials) and insert them into the ibmtts.username and ibmtts.password settings shown above.
Try it out! The simplest way to do so:
- Open a new terminal. cd into mycroft-core. Run ./start.sh service
- Open a new terminal. cd into mycroft-core. Run ./start.sh skills
- Open a new terminal. cd into mycroft-core. Run ./start.sh cli
- In the last terminal that you opened, type "hello" into the console. See the result.
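
Before starting the services, it may help to confirm that the two credential placeholders in the [tts] section were actually replaced. The snippet below is only a sketch written for Python 3: it uses the standard-library configparser as a stand-in (Mycroft's own configuration loader may well parse the file differently), and `missing_credentials` and `SAMPLE` are made-up names for illustration.

```python
import configparser

# The "After" section from the instructions, with the placeholders untouched.
SAMPLE = '''
[tts]
module = "ibmtts"
ibmtts.voice = "en-US_AllisonVoice"
ibmtts.username = "" # REPLACE #
ibmtts.password = "" # REPLACE #
ibmtts.timeout = 20
'''


def missing_credentials(ini_text):
    """Return the credential keys in [tts] that are still empty."""
    # Treat "# ..." after a value as a comment, as in the sample above.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(ini_text)
    tts = parser["tts"]
    keys = ("ibmtts.username", "ibmtts.password")
    # Values keep their surrounding double quotes, so strip them before testing.
    return [key for key in keys if not tts.get(key, "").strip('"')]


if __name__ == "__main__":
    print(missing_credentials(SAMPLE))
```

An empty list would mean both values are filled in; it says nothing about whether the credentials are actually valid.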

Note: I haven't actually tried the above steps and it's been a while since I first did them, so there may be a few issues in the instructions. If you find any, please leave a comment and I'll try to fix it as soon as possible.

Currently, the free tier of Watson TTS is 1 million characters per month (which is a lot). The only problem may be dictating books; lazy coders may choose to load the entire book at once into the TTS engine. Anything over 1 million characters is pay-per-use. If memory serves me right it's something like 2 cents for 1 thousand characters.
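
To get a feel for what that means in practice, here is a rough cost estimate in Python. The free allowance and the price per thousand characters are just the figures recalled above, not confirmed pricing, and `estimate_monthly_cost` is a made-up helper name.

```python
def estimate_monthly_cost(characters, free_characters=1000000, cents_per_thousand=2.0):
    """Return the estimated monthly cost in US dollars.

    Both defaults are assumptions taken from the figures quoted in this
    thread; check the provider's current pricing before relying on them.
    """
    # Only characters beyond the free allowance are billed.
    billable = max(0, characters - free_characters)
    return billable / 1000.0 * cents_per_thousand / 100.0


if __name__ == "__main__":
    for used in (500000, 1500000):
        print(used, estimate_monthly_cost(used))
```

Under those assumed rates, staying inside the free tier costs nothing, and each additional 500,000 characters adds about 10 dollars.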

It also takes some time to load the dictation for longer sentences. The script times out after 20 seconds but this is adjustable. After timing out (or running into an error, like no internet), it falls back to the fallback TTS engine which the user can specify. This will be implemented soon (hopefully).
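
The timeout-then-fallback idea can be sketched roughly as follows. This is not the code from my branch: `speak_with_fallback`, `primary` and `fallback` are hypothetical names, the engines are passed in as plain callables, and the sketch is written for Python 3.

```python
from concurrent.futures import ThreadPoolExecutor


def speak_with_fallback(sentence, primary, fallback, timeout=20):
    """Try primary(sentence); on timeout or any error, use fallback(sentence).

    Returns a tuple of (engine label, result).
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(primary, sentence)
    try:
        # Wait at most `timeout` seconds for the primary engine.
        return "primary", future.result(timeout=timeout)
    except Exception:
        # Covers both the timeout raised by future.result() and any error
        # raised inside primary() itself (for example, no network).
        return "fallback", fallback(sentence)
    finally:
        # Do not block on a primary call that is still running.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    print(speak_with_fallback("hello", lambda s: "cloud:" + s, lambda s: "local:" + s))
```

One caveat of this approach: a timed-out primary call keeps running in its worker thread until it finishes, since the sketch only stops waiting for it.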

This will be my first major pull request and I'm very new to mycroft, so there will probably be a lot of bugs :sweat_smile:

This feature will use another library, watson-developer-cloud. How do I tell mycroft to install a library?

The TTS system is not quite complete, so I won't make a PR yet. Some things that still need to be done, in no particular order:

[x] Clean up code
[ ] Add Voice Validation
[ ] Add Result Validation
[x] Add Backup Feature
[x] Add configuration for timeout time
[ ] Write tests (though this will probably never happen, I'm still going to include it :wink:)
Nathan

marksev1 commented 8 years ago

Is an account at bluemix also for free? :-D

thecodingwizard commented 8 years ago

@marksev1 Yup!

chrisvella commented 8 years ago

Thank you for developing this fantastic feature. I can't wait to get it working.

A couple of things to note:

To clone the branch, use: git clone -b features/ibm-tts https://github.com/thecodingwizard/mycroft-core.git
The Python module's name is watson-developer-cloud rather than watson-developer-sdk
The configuration file to edit is: /mycroft/configuration/mycroft.ini

I am hitting an issue with the last step:

Open a new terminal. cd into mycroft-core. Run ./start.sh cli

I get the following error:

Traceback (most recent call last):
  File "/opt/mycroft/ibm-tts/mycroft-core/mycroft/client/text/cli.py", line 24, in <module>
    from mycroft.tts import tts_factory
  File "/opt/mycroft/ibm-tts/mycroft-core/mycroft/tts/tts_factory.py", line 29, in <module>
    import ibm_tts 
  File "/opt/mycroft/ibm-tts/mycroft-core/mycroft/tts/ibm_tts.py", line 20, in <module>
    from watson_developer_cloud import TextToSpeechV1
ImportError: No module named watson_developer_cloud

The module is already installed with pip for the current user; watson_developer_cloud is under python2.7/dist-packages. Apologies if I am doing something really stupid here.

aatchison commented 8 years ago

I'm guessing that you need to either pip install while the virtualenv is activated, or add it to requirements.txt and run dev_setup.sh again (as it enters the virtualenv). The start scripts run inside the virtualenv as well.

The easy way:

$ workon mycroft
(mycroft)$ pip install -r requirements.txt
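
If it's unclear which environment a script is really running in, a small check like the one below can help. It is only a sketch: `describe_module` is a made-up helper, and it relies on importlib.util.find_spec, which needs Python 3.4 or later (the traceback above is from Python 2.7, so it would need adapting there).

```python
import importlib.util
import sys


def describe_module(name):
    """Report which interpreter is running and whether `name` is importable from it."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,           # points into the virtualenv when one is active
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_module("watson_developer_cloud"))
```

Running it once with the system Python and once from inside the virtualenv should show whether the package is visible to the interpreter that the start scripts use.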


chrisvella commented 8 years ago

add it to requirements.txt and run dev_setup.sh again

I did that and it solved the issue. Thank you @aatchison

I now have a sound problem to debug. I will update on how the voices sound once it's fixed.

chrisvella commented 8 years ago

This feature is a huge improvement to mycroft. When I demo mycroft to other people they always remark how robotic it sounds. No more! I would love to see this feature automatically configured for mycroft premium backers (at least till more voices are added at the end of this year).

thecodingwizard commented 8 years ago

@chrisvella Yup - typo on my part, it's not watson-developer-sdk, changing it now. And yes, you have to install watson-developer-cloud in the proper virtual environment. I didn't know about requirements.txt... I'll play around with that.

This can be automatically configured for premium backers, but this service is pay-as-you-go with a free tier (the first 1 million or so characters are free, everything else costs money). Therefore someone will have to volunteer their credentials and possibly be charged a lot of money if a user abuses it.

I've been quite busy recently and will only be more busy as school starts :cry: so I'll see if I can wrap this project up soon

thecodingwizard commented 8 years ago

Another feature of this speech library is that it allows you to customize the way certain text is spoken. See https://www.ibm.com/watson/developercloud/doc/text-to-speech/SSML.shtml

For example, if you ask mycroft "stock quote for MSFT", it'll say "MSFT" in a strange way (some characters are spoken faster than others). The speech library lets you overcome this by specifying that MSFT be spoken as individual characters. Just something I thought you should be aware of... if all skills returned properly annotated text (e.g. "pronounce this number as a phone number" or "read this string as individual characters"), it would make Mycroft sound even more human-like.
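
As a rough illustration of the MSFT case, a skill (or a filter in front of the TTS engine) could wrap ticker-like tokens in an SSML say-as element. This sketch assumes the engine honors interpret-as="letters", which should be checked against the SSML documentation linked above; the ticker pattern and the `spell_out_tickers` name are made up for the example.

```python
import re
from xml.sax.saxutils import escape

# Assumption: a "ticker" is a standalone run of 2 to 5 capital letters.
TICKER = re.compile(r"\b[A-Z]{2,5}\b")


def spell_out_tickers(text):
    """Wrap ticker-like tokens in <say-as> so they are read letter by letter."""
    # Escape &, < and > first so the input cannot break the markup.
    escaped = escape(text)
    marked = TICKER.sub(
        lambda m: '<say-as interpret-as="letters">%s</say-as>' % m.group(0),
        escaped,
    )
    return "<speak>%s</speak>" % marked


if __name__ == "__main__":
    print(spell_out_tickers("stock quote for MSFT"))
```

The pattern is deliberately crude and would also match ordinary acronyms, so a real skill would probably mark up only the values it knows are symbols.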

thecodingwizard commented 8 years ago

Anyone know where mycroft is defined?

in tts_factory.py, it imports tts classes like so:

from mycroft.tts import espeak_tts

How do I add my own tts class ibm_tts to mycroft.tts? Currently I'm importing ibm_tts like so:

import ibm_tts

thecodingwizard commented 8 years ago

I think that, except for the issue above regarding mycroft.tts, my code should be ready to merge. Tests and validation aren't set up yet, but I added watson-developer-cloud to requirements.txt and tested on Ubuntu 16.04.

jgbreezer commented 7 years ago

Espeak does support SSML, and I'd like to see that supported too; on that side it's just a case of enabling it with a flag. I wrote some filtering for my own mycroft-like system that automatically added SSML to some patterns, which did help a bit. Skills don't have to use it, but SSML-aware skills could pass an ssml flag as an "extra" in the message to indicate that the speaking service should process it if it can, or strip it out if it can't (which is relatively simple code). Alternatively, a skill could pass two versions in the message: the non-SSML speech systems would read the plain one and the SSML-supporting ones would read the SSML version, which should probably also say which version of SSML it uses.
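
To show roughly what I mean by stripping being simple, here is a sketch. The function names and the is_ssml flag are made up for illustration and are not part of any existing Mycroft message format.

```python
import re
from xml.sax.saxutils import unescape

# Matches any SSML/XML tag, opening or closing.
TAG = re.compile(r"<[^>]+>")


def strip_ssml(ssml):
    """Return plain text for engines that cannot process SSML."""
    # Replace tags with spaces; note that a tag in the middle of a word
    # will therefore split that word.
    text = TAG.sub(" ", ssml)
    # Turn &amp;, &lt; and &gt; back into literal characters.
    text = unescape(text)
    # Collapse the whitespace left behind by the removed tags.
    return " ".join(text.split())


def prepare_utterance(utterance, is_ssml, engine_supports_ssml):
    """Pass SSML through to capable engines and strip it for the others."""
    if is_ssml and not engine_supports_ssml:
        return strip_ssml(utterance)
    return utterance


if __name__ == "__main__":
    marked = '<speak>quote for <say-as interpret-as="letters">MSFT</say-as></speak>'
    print(prepare_utterance(marked, True, False))
```

A regex is enough for well-formed input like this; anything less predictable would be better handled with a real XML parser.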

forslund commented 7 years ago

@thecodingwizard (you have probably already figured this out, but in case you haven't): regarding the from mycroft.tts import xxx, this is a package path relative to the top of the mycroft-core directory. If you place your file in the mycroft/tts directory, it should be importable using

from mycroft.tts import ibm_tts

See https://github.com/MycroftAI/mycroft-core/pull/367 for an example of how I added a (semi)new tts.
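
Once the module is importable, the factory only needs a way to map the configured module name to a class. The following is a simplified sketch of that idea, not the actual contents of tts_factory.py: the class names, the `ENGINES` table and the `create` function are all hypothetical, and the real base class likely has a different constructor.

```python
class TTS(object):
    """Hypothetical base class; the real one in mycroft.tts may differ."""

    def __init__(self, voice):
        self.voice = voice


class ESpeak(TTS):
    pass


class IBMTTS(TTS):
    pass


# Maps the `module` value from the [tts] config section to an engine class.
ENGINES = {"espeak": ESpeak, "ibmtts": IBMTTS}


def create(tts_config):
    """Build the engine named by tts_config["module"]."""
    module = tts_config["module"]
    try:
        engine_class = ENGINES[module]
    except KeyError:
        raise ValueError("Unknown TTS module: %r" % module)
    # Voice settings are assumed to be keyed as "<module>.voice".
    return engine_class(tts_config.get(module + ".voice"))


if __name__ == "__main__":
    engine = create({"module": "ibmtts", "ibmtts.voice": "en-US_AllisonVoice"})
    print(type(engine).__name__, engine.voice)
```

With a table like this, adding a new engine means one import and one dictionary entry.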

Regarding SSML, it seems to be quite common, but support for individual tags varies between engines.

aatchison commented 7 years ago

:bump:

forslund commented 6 years ago

I'm closing this issue since it's been inactive for ~1 year. For information, we've recently merged a PR for IBM Watson tts (#1261).

MycroftAI / mycroft-core

Feature: IBM Bluemix Text To Speech #268