going-digital / Talkie

Speech library for Arduino

any information on the data format? #11

Open ghost opened 9 years ago

ghost commented 9 years ago

i'm trying to write a dynamic synthesis and have enough linguistic knowledge about phonemes, transcription, lpc, etc. and would either use a little neural net to translate the text directly into the target coefficients or do it by rules. before doing so, i need to understand what i would output (in terms of coefficients, energy and repetitions and possibly more?). however, i tried to understand the file format from the code. it seems to be compressed somehow, since voiced (ten) and unvoiced (four) sounds get different numbers of coefficients etc. i also don't quite get how to encode repetitions and energy.

would be great to have more info. thanks!

going-digital commented 9 years ago

The format is identical to the Texas Instruments format, as used on their speech chips.

Compression basically works like this:

Keep an eye out for the bit order within a byte - the format is bit oriented, and byte based formats use inconsistent ordering.
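As a rough illustration of what "bit oriented" means in practice, here is a minimal Python sketch of a frame reader. It is not taken from the Talkie source. The field widths (4-bit energy, 1-bit repeat, 6-bit pitch, then K1..K10 at 5, 5, 4, 4, 4, 4, 4, 3, 3, 3 bits) and the bit order (least significant bit of each byte first, with the first bit read becoming the top bit of the field) are assumptions based on the TMS5220-style layout, so check them against the library code before relying on them. The names `BitReader` and `decode_frames` are made up for this example. All decoded values are table indices, not the final energy, pitch or coefficient values.

```python
from typing import Dict, List

# Assumed field widths in bits (TMS5220-style layout, not confirmed in this thread).
ENERGY_BITS, REPEAT_BITS, PITCH_BITS = 4, 1, 6
K_BITS_UNVOICED = [5, 5, 4, 4]                        # K1..K4
K_BITS_VOICED = K_BITS_UNVOICED + [4, 4, 4, 3, 3, 3]  # K1..K10


class BitReader:
    """Hypothetical helper: reads bit fields across byte boundaries."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # absolute bit position in the stream

    def get_bits(self, n: int) -> int:
        if self.pos + n > len(self.data) * 8:
            raise EOFError("ran out of data inside a frame")
        value = 0
        for _ in range(n):
            # Assumption: bits are consumed LSB-first within each byte,
            # and the first bit consumed is the field's most significant bit.
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def decode_frames(data: bytes) -> List[Dict]:
    """Split a bitstream into frames of table indices until a stop frame."""
    reader = BitReader(data)
    frames: List[Dict] = []
    while True:
        energy = reader.get_bits(ENERGY_BITS)
        if energy == 0:        # energy index 0: silent frame, no further fields
            frames.append({"type": "silence"})
            continue
        if energy == 0xF:      # energy index 15: end of the utterance
            frames.append({"type": "stop"})
            return frames
        repeat = reader.get_bits(REPEAT_BITS) == 1
        pitch = reader.get_bits(PITCH_BITS)
        voiced = pitch != 0    # pitch index 0 marks an unvoiced frame
        k: List[int] = []
        if not repeat:         # a repeat frame reuses the previous coefficients
            widths = K_BITS_VOICED if voiced else K_BITS_UNVOICED
            k = [reader.get_bits(w) for w in widths]
        frames.append({
            "type": "voiced" if voiced else "unvoiced",
            "energy": energy,
            "repeat": repeat,
            "pitch": pitch,
            "k": k,
        })


if __name__ == "__main__":
    # A repeat frame (energy index 5, pitch index 33) followed by a stop frame.
    for frame in decode_frames(bytes([0x3A, 0x7C])):
        print(frame)
```

Under these assumptions the compression comes from three things: silent and stop frames carry only the 4-bit energy field, repeat frames skip all coefficients, and unvoiced frames carry four coefficients instead of ten. An encoder would write the same fields in the same order.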

The only easily available software that already encodes recorded speech is QBox Pro - it's old, and I've never got it running properly. It's floating around the web in various places. A modern open source way of generating this format would be awesome, and would be welcomed by many communities.

I identified two shortcuts you might want to investigate if you're interested in rule based text to speech:

Bear in mind that English to phoneme to coefficient mapping is likely to take at least 8K of code and data - quite a chunk of Arduino code space.

jscuster commented 2 years ago

Just curious if anything has changed on on-the-fly tts?

I'm blind, I grew up with an Apple II E computer. The computer had an Echo II from Street Electronics installed. This is an expansion card based on the same chip.

A program, Textalker, read changes on the screen. There is an emulator of this in action on several disk images at https://bluegrasspals.com/blindapple/.

The point is that Textalker generated rule-based speech on the fly. I'm still learning how things work, but it may also be helpful.

As I said, I'm blind, so I'm very interested in finding a tts library like this.

Thanks for reading.

radiohound commented 1 year ago

@jscuster You might be interested in the work that has been done to port espeak-ng to Arduino. It is located here: https://github.com/pschatzmann/arduino-espeak-ng. He has it working on an ESP32.

Walter