gromnitsky / read-aloud.el

A simple Emacs interface to TTS (text-to-speech) engines
MIT License

[Feature] Configuration of the speech engine (rate + language) #4

Open liar666 opened 1 year ago

liar666 commented 1 year ago

Hi

Thanks for the tool, it's really nice.

Combined with Whisper for speech-to-text and emacs-aichat/BingChat, I have almost set up a spoken dialog with ChatGPT, that's so neeeaaaat.

One thing I lack is the ability to change the rate and the language of the output dynamically, so that I can work in multiple languages in different buffers. I've already set up Whisper for that, and GPT can work in any language, so only the TTS part lacks support now.

Since most speech engines support such options, offering these options as configurable parameters would be great.

benthamite commented 6 months ago

@liar666, you can find an attempt to do this here. It only works with the macOS say engine. The relevant commands are read-aloud-extras-set-rate and read-aloud-extras-set-voice.

Unfortunately, for some reason read-aloud does not work when a new voice is set, although the value of read-aloud-engines appears to be set correctly.

E.g., this value of read-aloud-engines works:

("speech-dispatcher"
 (cmd "spd-say" args
      ("-e" "-w")
      kill "spd-say -S")
 "flite"
 (cmd "flite" args nil)
 "jampal"
 (cmd "cscript" args
      ("C:\\Program Files\\Jampal\\ptts.vbs" "-r" "5"))
 "say"
 (cmd "say" args
      ("-r 250")))

whereas this one throws an error:

("speech-dispatcher"
 (cmd "spd-say" args
      ("-e" "-w")
      kill "spd-say -S")
 "flite"
 (cmd "flite" args nil)
 "jampal"
 (cmd "cscript" args
      ("C:\\Program Files\\Jampal\\ptts.vbs" "-r" "5"))
 "say"
 (cmd "say" args
      ("-v Albert" "-r 250")))

"error in process sentinel: read-aloud ended w/ the event: exited abnormally with code 1"
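
One possible explanation, not confirmed anywhere in this thread: if read-aloud hands each element of args to the process as a single argument, with no shell word-splitting (an assumption about its internals), then "-v Albert" reaches say as one token, and the voice name would be read with a leading space and fail to match. "-r 250" may get through only because a number still parses after a leading space. A sketch of the same value with every flag and its value as separate list elements:

```elisp
;; Untested sketch: identical to the failing value above, except that
;; the say flags and their values are separate strings.
;; Assumes read-aloud passes each element of `args' as one argv entry.
(setq read-aloud-engines
      '("speech-dispatcher"
        (cmd "spd-say" args ("-e" "-w") kill "spd-say -S")
        "flite"
        (cmd "flite" args nil)
        "jampal"
        (cmd "cscript" args
             ("C:\\Program Files\\Jampal\\ptts.vbs" "-r" "5"))
        "say"
        ;; "-v" "Albert" instead of "-v Albert"; likewise for the rate.
        (cmd "say" args ("-v" "Albert" "-r" "250"))))
```

If this is the cause, any code that builds the args list (such as the set-voice command mentioned above) would need to add the flag and the voice name as two elements rather than one string.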

Perhaps @gromnitsky knows what’s going on?