MycroftAI / mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
https://mimic.mycroft.ai

Mimic - The Mycroft TTS Engine

Mimic is a fast, lightweight text-to-speech engine developed by Mycroft A.I. and VocaliD, based on Carnegie Mellon University’s Flite (Festival-Lite) software. Mimic takes in text and reads it out loud in a high-quality voice.

Official project site: mimic.mycroft.ai

Supported platforms

Untested

Future

Requirements

This is the list of requirements. Below are the commands needed to install them on the most popular distributions and supported operating systems.

Linux

On Debian/Ubuntu
$ sudo apt-get install gcc make pkg-config automake libtool libasound2-dev
On Fedora
$ sudo dnf install gcc make pkgconfig automake libtool alsa-lib-devel
On Arch
$ sudo pacman -S --needed gcc make pkg-config automake libtool alsa-lib
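
After installing the packages above, it can be useful to confirm that the build tools are actually on the PATH before starting a build. The following is a minimal sketch, not part of mimic: `missing_tools` is a hypothetical helper name, and the command names passed to it are assumed to match the package names listed above, which is not always the case on every distribution.

```shell
# missing_tools NAME...: print each command name that is not found on PATH.
# Returns success only when every named command is available.
# (Hypothetical helper for illustration; mimic does not ship it.)
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_tools gcc make pkg-config automake` prints nothing and succeeds when all four commands are installed. Libraries such as the ALSA development files cannot be checked this way, because they do not install a command.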

Mac OSX

Windows

Cross compiling:

The fastest and most straightforward way to build mimic for Windows is by cross-compilation from Linux. This requires some additional packages to be installed.

On Ubuntu 18.04 (bionic):

sudo apt-get install gcc make pkg-config automake libtool libpcre2-dev wine-stable binutils-mingw-w64-i686 mingw-w64-i686-dev gcc-mingw-w64-i686

On Ubuntu 16.04 (xenial):

sudo apt-get install gcc make pkg-config automake libtool libpcre2-dev wine binutils-mingw-w64-i686 mingw-w64-i686-dev gcc-mingw-w64-i686

On Ubuntu 14.04 (trusty):

sudo apt-get install gcc make pkg-config automake libtool mingw32 mingw32-runtime wine

Native Windows building

Build

On a native build (not cross-compilation)

Cross compilation:

./run_testsuite.sh winbuild
wine ./mimic.exe -t "hello world" 

You can distribute the compiled mimic by adding everything in the install/winbuild/bin directory to a zip file.

Usage

By default, mimic plays the text through an audio device. Alternatively, it can write the audio to a wave file in RIFF format (often called .wav).
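
A RIFF wave file can be recognized by its header: bytes 0-3 hold the chunk ID "RIFF" and bytes 8-11 hold the form type "WAVE". The sketch below checks only those two fields, which is enough for a quick sanity check of an output file but does not validate the rest of the file. `is_riff_wave` is a hypothetical helper name and is not part of mimic.

```shell
# is_riff_wave FILE: succeed if FILE starts with a RIFF/WAVE header.
# Bytes 0-3 must be "RIFF" and bytes 8-11 must be "WAVE"; the four
# bytes in between (the chunk size) are not inspected.
# (Hypothetical helper for illustration; mimic does not ship it.)
is_riff_wave() {
    [ -r "$1" ] || return 1
    magic=$(dd if="$1" bs=1 count=4 2>/dev/null)
    form=$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)
    [ "$magic" = "RIFF" ] && [ "$form" = "WAVE" ]
}
```

For example, `is_riff_wave hello.wav && echo "looks like a wave file"`, where `hello.wav` is an assumed name for a file previously written by mimic.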

Read text

Read text from file

Change voice

Notes

Other options

Voices accept additional debug options, specified as --setf feature=value on the command line. Wrong values can prevent mimic from working. Some speech modelling techniques may not support changing these features, so some voices may not honor these options. Here are some examples:

See lang/cmu_us_kal/cmu_us_kal.c for other features and values.
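
Since a malformed option can stop mimic from working, one possible safeguard is to check the feature=value form before passing anything on. The sketch below is an illustration only: `setf_args` is a hypothetical helper, and the feature names used in the usage note (`duration_stretch`, `int_f0_target_mean`) are assumptions that should be checked against the voice definition file. It verifies only the shape of each argument, not whether the voice knows the feature.

```shell
# setf_args FEATURE=VALUE...: print "--setf FEATURE=VALUE" for each argument,
# separated by spaces, on a single line.
# Fails without printing anything to stdout if any argument is not in
# feature=value form (non-empty text on both sides of the "=").
# (Hypothetical helper for illustration; mimic does not ship it.)
setf_args() {
    out=""
    for pair in "$@"; do
        case "$pair" in
            ?*=?*) out="${out:+$out }--setf $pair" ;;
            *) echo "setf_args: expected feature=value, got '$pair'" >&2
               return 1 ;;
        esac
    done
    printf '%s\n' "$out"
}
```

For example, `setf_args duration_stretch=1.5 int_f0_target_mean=237` prints `--setf duration_stretch=1.5 --setf int_f0_target_mean=237`. The result can then be passed unquoted, as in `mimic -t "hello world" $(setf_args duration_stretch=1.5)`, which relies on word splitting and therefore does not work for values containing spaces.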

Say the hour

Benchmarking

How to Contribute

If you wish to contribute to the development of mimic, there are a few things to keep in mind.

Git branching structure

We will be using a branching structure similar to the one described in this article.

In short

Coding Style Requirements

To keep the code in mimic coherent, a simple coding style guide is used. Note that the current codebase as a whole does not meet some of these guidelines, a result of its origin in the Flite codebase. The hope is that these inconsistencies will diminish over time as different parts of the codebase are touched.

Vimrc

For those of you who use vim, add this to your vimrc to ensure proper indenting.

"####Indentation settings
:filetype plugin indent on
" show existing tab with 4 spaces width
:set tabstop=4
" when indenting with '>', use 4 spaces width
:set shiftwidth=4
" On pressing tab, insert 4 spaces
:set expandtab
" fix indentation problem with types above function name
:set cinoptions+=t0
" fix indentation of { after case
:set cinoptions+==0
" fix indentation of multiline if
:set cinoptions+=(0   "closing ) to let vimrc highlighting work after this line

"see http://vimdoc.sourceforge.net/htmldoc/indent.html#cinoptions-values
"for more indent options

Indent command (currently does not indent switch/case statements properly)

indent [FILE] -npcs -i4 -bl -Tcst_wave -Tcst_wave_header -Tcst_rateconv \
      -Tcst_voice -Tcst_item -Tcst_features -Tcst_val -Tcst_va -Tcst_viterbi \
      -Tcst_utterance -Tcst_vit_cand_f_t -Tcst_vit_path_f_t -Tcst_vit_path \
      -Tcst_vit_point -Tcst_string -Tcst_lexicon -Tcst_relation \
      -Tcst_voice_struct -Tcst_track -Tcst_viterbi_struct -Tcst_vit_cand \
      -Tcst_tokenstream -Tcst_tokenstream_struct -Tcst_synth_module \
      -Tcst_sts_list -Tcst_lpcres -Tcst_ss -Tcst_regex -Tcst_regstate \
      -Twchar_t -Tcst_phoneset -Tcst_lts_rewrites -Tlexicon_struct \
      -Tcst_filemap -Tcst_lts_rules -Tcst_clunit_db -Tcst_cg_db \
      -Tcst_audio_streaming_info -Tcst_audio_streaming_info_struct -Tcst_cart \
      -Tcst_audiodev -TVocoderSetup -npsl -brs -bli0 -nut

Acknowledgements

See ACKNOWLEDGEMENTS

License

See COPYING