gromnitsky / read-aloud.el

A simple Emacs interface to TTS (text-to-speech) engines
MIT License


A simple interface to TTS engines.

The package uses an external text-to-speech engine (like flite) to pronounce the word at or near point, the selected region or a whole buffer.

A screenshot of running read-aloud.el



  1. Set up at least one of the supported TTS engines (see below).

  2. Add to ~/.emacs:

    (load-file "/the/repo/dir/read-aloud.el")


To stop reading at any time, either run any of the commands above again or do M-x read-aloud-stop.

Supported TTS Engines


speech-dispatcher is the default engine in read-aloud.el. It contains a daemon that hides all the details of the chosen TTS engine from the user. To communicate with the daemon, read-aloud.el uses the spd-say command-line utility.

On Fedora 24:

# dnf install speech-dispatcher-flite speech-dispatcher-utils

$ mkdir -p ~/.config/speech-dispatcher
$ cp /etc/speech-dispatcher/speechd.conf ~/.config/speech-dispatcher/
$ $EDITOR ~/.config/speech-dispatcher/speechd.conf

For example:

$ grep '^[^#]' ~/.config/speech-dispatcher/speechd.conf
LogLevel  3
LogDir  "default"
DefaultRate  40
DefaultVolume 100
AudioOutputMethod "alsa"
DefaultModule flite

Test it:

$ spd-say hello


flite is the easiest engine to install and use. For example, on Fedora 24:

# dnf install flite

Test it:

$ echo hello | flite

Add to ~/.emacs:

(setq read-aloud-engine "flite")

Microsoft Speech API

Jampal provides a command-line interface to SAPI. Install it, then test via:

> echo hello | cscript "C:\Program Files\Jampal\ptts.vbs"

Add to ~/.emacs:

(setq read-aloud-engine "jampal")

macOS Speech Synthesis

Add to ~/.emacs:

(setq read-aloud-engine "say")

By default it uses the settings from System Preferences.


To add or modify a TTS engine, edit the read-aloud-engines plist. Here is an example for Windows:

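;; Store the result back: like plist-put, lax-plist-put returns the updated plist.
(setq read-aloud-engines
      (lax-plist-put read-aloud-engines "jampal.en"
        '(cmd "cscript"
              args ("C:\\Program Files\\Jampal\\ptts.vbs" "-r" "8"))))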

args should be a list or nil. To select the new entry:

(setq read-aloud-engine "jampal.en")
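If the same ~/.emacs is shared between machines, the engine can be chosen by checking which command-line tool is actually installed. The following is only a sketch: the function name my/read-aloud-pick-engine is hypothetical, and it covers just the "flite" and "say" entries mentioned above. When neither tool is found, read-aloud-engine is left untouched, so the package default stays in effect.

```elisp
;; Hypothetical helper, not part of read-aloud.el.
(defun my/read-aloud-pick-engine ()
  "Return an engine name for the first TTS tool found on PATH, or nil."
  (cond ((executable-find "flite") "flite")
        ((executable-find "say") "say")
        (t nil)))

;; Override the engine only when one of the tools above was found.
(let ((engine (my/read-aloud-pick-engine)))
  (when engine
    (setq read-aloud-engine engine)))
```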

The command-line utility that communicates with the engine must block until the text has been fully pronounced (i.e. it must not exit immediately); otherwise (read-aloud-buf) cannot tell when it is time to feed the engine another chunk of text. This is why spd-say is run with the -w command-line option.
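To see why the blocking behaviour matters, here is a minimal sketch of chunk-by-chunk feeding driven by a process sentinel. It is not the actual read-aloud.el implementation: the names my/speak-chunks and my/speak-chunks--sentinel are hypothetical, and the sketch assumes that the engine reads its text from stdin, as in the `echo hello | flite` test above. The next chunk is sent only once the previous process has exited, so an engine that exits before it has finished speaking would cause chunks to overlap.

```elisp
;; Hypothetical sketch, not the code used by read-aloud.el.
(defun my/speak-chunks (cmd args chunks)
  "Feed CHUNKS (a list of strings) to CMD with ARGS, one process per chunk."
  (when chunks
    (let ((proc (make-process :name "my-tts"
                              :command (cons cmd args)
                              :connection-type 'pipe
                              :sentinel #'my/speak-chunks--sentinel)))
      ;; Remember what is left to read, so the sentinel can continue.
      (process-put proc 'my-cmd cmd)
      (process-put proc 'my-args args)
      (process-put proc 'my-rest (cdr chunks))
      ;; Send the current chunk on stdin, then close stdin.
      (process-send-string proc (concat (car chunks) "\n"))
      (process-send-eof proc))))

(defun my/speak-chunks--sentinel (proc _event)
  "Start the next chunk once PROC has exited."
  ;; The sentinel runs on every status change; continue only after exit.
  (unless (process-live-p proc)
    (my/speak-chunks (process-get proc 'my-cmd)
                     (process-get proc 'my-args)
                     (process-get proc 'my-rest))))
```

For example, assuming flite is installed, (my/speak-chunks "flite" nil '("First sentence." "Second sentence.")) should read the two sentences one after the other.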

You can edit the face that (read-aloud-buf) uses with the usual

M-x customize-face RET read-aloud-text-face

A Smoke Test

After you have configured your system's TTS engine, run

M-x eval-expression RET (read-aloud-test) RET

It should open 2 temporary windows: 1 log window and 1 with a sample text, and then start reading automatically. After it finishes, you may safely kill those 2 buffers.

