olofson / audiality2

A realtime scripted modular audio engine for video games and musical applications.
http://audiality.org/
zlib License

Speech synthesis #67

Open olofson opened 10 years ago

olofson commented 10 years ago

While this should just be a typical application for Audiality 2, it would make a nice and useful example to include with the engine, and also a nice pilot project for the scripting engine and the API. Lip sync and related needs might also call for features that would be useful elsewhere in the engine.

olofson commented 9 years ago

Design idea... Three layers:

  1. Vocal tract synthesizer program. A program that starts in a silent state, optionally with a number of init arguments, and then responds to a standardized set of messages that control pitch and formants, trigger plosives etc. (see the first sketch after this list). These programs would essentially be designed like (and also be usable as) musical instruments. There is not necessarily much that is speech synthesis specific about them, apart from the timbres (typically) being more or less humanoid.
    • We could even split this up further, allowing a vocal tract synth to be constructed from separate programs for vocal cords, different resonances, different plosive generators etc.
  2. Speech modulator program. An intermediate level that defines how phonemes are actually pronounced. Basically an interactive sequencer that is driven by phoneme messages and sends control messages to vocal tract synthesizers (see the second sketch below). We'll probably need some new datatypes and A2S constructs to implement this sensibly, if we are to do it in A2S at all.
  3. Dictionary, phrasing etc. This is probably best left to external libs, or even left out altogether, if we only need to deal with pre-composed words and messages.
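
To make layer 1 concrete, here is a minimal C sketch of what the "standardized set of messages" for a vocal tract synthesizer could look like. All names and fields (`VT_Control`, `VT_Message`, etc.) are hypothetical illustrations, not part of the Audiality 2 API; in practice the message set would be defined by the A2S programs themselves and delivered through the engine's normal message mechanism.

```c
/* Hypothetical sketch of the "standardized set of messages" a layer-1
 * vocal tract synthesizer program could respond to. Names are
 * illustrative only; the real set would be defined by the A2S programs
 * and sent through Audiality 2's normal message mechanism. */

typedef enum {
	VT_PITCH,	/* vocal cord fundamental (linear pitch) */
	VT_FORMANT1,	/* first resonance center frequency (Hz) */
	VT_FORMANT2,	/* second resonance center frequency (Hz) */
	VT_FORMANT3,	/* third resonance center frequency (Hz) */
	VT_BREATH,	/* noise/breathiness mix (0..1) */
	VT_PLOSIVE	/* trigger a plosive burst (p, t, k, ...) */
} VT_Control;

typedef struct {
	unsigned	when;	/* timestamp (audio frames) */
	VT_Control	ctl;	/* which control/message */
	float		value;	/* target value, or burst level for VT_PLOSIVE */
	float		ramp;	/* ramp time for continuous controls (ms) */
} VT_Message;
```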
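And a correspondingly hedged sketch of layer 2: a phoneme table plus a function that translates one phoneme into the control messages a layer-1 synth would receive, reusing the `VT_Message` type from the sketch above. The formant values and names are rough placeholders; a real implementation would likely live in A2S and use the engine's own message timing.

```c
#include <stddef.h>

/* Illustrative phoneme description; real data would be far richer. */
typedef struct {
	const char *symbol;	/* phoneme symbol, e.g. "a", "i", "u" */
	float f1, f2, f3;	/* formant targets (Hz) */
	float breath;		/* 0 = fully voiced, 1 = fully unvoiced */
	float duration_ms;	/* nominal steady-state length */
} Phoneme;

/* A few rough vowel targets, purely as placeholder data. */
static const Phoneme phonemes[] = {
	{ "a", 800.0f, 1200.0f, 2500.0f, 0.05f, 120.0f },
	{ "i", 290.0f, 2250.0f, 2900.0f, 0.05f, 100.0f },
	{ "u", 320.0f,  870.0f, 2250.0f, 0.05f, 100.0f }
};

/* Turn one phoneme into control messages for a layer-1 synth, ramping
 * the formants over 'transition_ms' so adjacent phonemes glide into
 * each other. Returns the number of messages written to 'out'. */
static size_t phoneme_to_messages(const Phoneme *p, unsigned when,
		float transition_ms, VT_Message *out, size_t max)
{
	size_t n = 0;
	if(max < 4)
		return 0;
	out[n++] = (VT_Message){ when, VT_FORMANT1, p->f1, transition_ms };
	out[n++] = (VT_Message){ when, VT_FORMANT2, p->f2, transition_ms };
	out[n++] = (VT_Message){ when, VT_FORMANT3, p->f3, transition_ms };
	out[n++] = (VT_Message){ when, VT_BREATH, p->breath, transition_ms };
	return n;
}
```

The `transition_ms` ramp is a crude stand-in for coarticulation: adjacent phonemes glide into each other instead of jumping, and the same ramped control data is roughly what lip sync information could be derived from.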