going-digital / Talkie

Speech library for Arduino

TI-99 ROM / talkie.cpp coefficients mismatch? #6

Open jremington opened 10 years ago

jremington commented 10 years ago

Hi, Peter:

I really appreciate the work you've done on the Talkie library for AVR processors! I've tried out several of the libraries and the TI-99 set seems most useful, but it sounds pretty bad compared to some of the other sets.

I'm wondering if the coefficients built into talkie.cpp are correct or best for the TI-99 ROM. If not, do you have a better set?

BTW Tom's Diner sounds TERRIFIC, even compressed. I think there would be a lot of interest in open source C or C++ code for the PC to do what the MATLAB routines do.

Best regards, Jim Remington

going-digital commented 10 years ago

I've just converted the TMS5200 coefficients. These should make the TI-99 set sound better. They'll make the UK male set sound worse (that's definitely for the TMS5220). Not sure what they'll do to the other sets - they have unknown heritage. I can't test things at this end at the moment - so could you try this and report back?

// TMS5200 coefficients
// These values are derived from the MAME project.
// See http://mamedev.org/source/src/emu/sound/tms5110r.c.html
//
uint8_t tmsEnergy[0x10] = {
    0x00,0x01,0x02,0x03,0x04,0x06,0x08,0x0b,
    0x10,0x17,0x21,0x2f,0x3f,0x55,0x72,0x00
};
uint8_t tmsPeriod[0x40] = {
    0x00,0x0e,0x0f,0x10,0x11,0x12,0x13,0x14,
    0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,
    0x1d,0x1e,0x1f,0x20,0x22,0x24,0x26,0x28,
    0x29,0x2b,0x2d,0x30,0x31,0x33,0x36,0x37,
    0x39,0x3c,0x3e,0x40,0x44,0x48,0x4a,0x4c,
    0x51,0x55,0x57,0x5a,0x60,0x63,0x67,0x6b,
    0x70,0x75,0x7a,0x7f,0x85,0x8b,0x91,0x97,
    0x9d,0xa4,0xab,0xb2,0xba,0xc2,0xca,0xd3
};
// Coefficients below are shifted left 6 bits and in 2's complement form.
int16_t tmsK1[0x20] = {
    0x82c0,0x8380,0x8440,0x8580,0x86c0,0x8880,0x8ac0,0x8d40,
    0x9080,0x9440,0x9900,0x9ec0,0xa580,0xad40,0xb640,0xc0c0,
    0xcc40,0xd900,0xe680,0xf4c0,0x0340,0x1180,0x1f80,0x2cc0,
    0x3900,0x4400,0x4dc0,0x5640,0x5d80,0x63c0,0x6900,0x6d40
};
int16_t tmsK2[0x20] = {
    0xa200,0xa6c0,0xac40,0xb200,0xb880,0xbf80,0xc740,0xcf40,
    0xd7c0,0xe100,0xea40,0xf3c0,0xfd80,0x0740,0x1100,0x1a80,
    0x23c0,0x2c80,0x3500,0x3cc0,0x4400,0x4ac0,0x5100,0x5680,
    0x5b80,0x6000,0x6400,0x6780,0x6ac0,0x6d80,0x7000,0x7e80
};
// Coefficients below are shifted right 2 bits and in 2's complement form.
// Note that 2 bits of precision are lost, but 8 bit multiplicands are
// faster to compute.
int8_t tmsK3[0x10] = {
    0x9a,0xa1,0xa9,0xb2,0xbd,0xca,0xd8,0xe7,
    0xf6,0x06,0x16,0x25,0x34,0x40,0x4c,0x55
};
int8_t tmsK4[0x10] = {
    0xb8,0xc1,0xcc,0xd7,0xe4,0xf1,0xfe,0x0b,
    0x18,0x25,0x31,0x3c,0x46,0x4e,0x56,0x5d
};
int8_t tmsK5[0x10] = {
    0xb1,0xb9,0xc3,0xce,0xd9,0xe5,0xf2,0xff,
    0x0c,0x19,0x26,0x31,0x3c,0x46,0x4e,0x56
};
int8_t tmsK6[0x10] = {
    0xd0,0xda,0xe5,0xf0,0xfb,0x07,0x12,0x1d,
    0x28,0x32,0x3b,0x44,0x4b,0x53,0x59,0x5e
};
int8_t tmsK7[0x10] = {
    0xc1,0xca,0xd3,0xdd,0xe8,0xf3,0xfe,0x09,
    0x14,0x1f,0x29,0x33,0x3c,0x45,0x4c,0x53
};
int8_t tmsK8[0x08] = {
    0xcd,0xe4,0xfe,0x17,0x2f,0x43,0x54,0x61
};
int8_t tmsK9[0x08] = {
    0xc2,0xd2,0xe5,0xf8,0x0c,0x20,0x32,0x41
};
int8_t tmsK10[0x08] = {
    0xd1,0xdf,0xee,0xfe,0x0d,0x1d,0x2b,0x39
};

As for the compressor - I'll probably have a go at that in the near future. Stay tuned to the project.

jremington commented 10 years ago

Hi, Peter:

The new set really helps for the TI-99 ROM (I had to add a few missing semicolons after the braces). Thanks! However, it does make Vocab_US_Large sound worse. I'll try some of the others and report back.

I'm confused by your seemingly contradictory comments about the TMS5220 above. Which set of coefficients is for which synthesizer chip?

I spent some time looking through the MAME code but couldn't quite see the relationship between your coefficients and what is in there. Could you offer a bit of explanation?

Again, I REALLY appreciate your efforts -- this is fun!

Cheers, Jim

going-digital commented 10 years ago

Ah. Semicolons - apologies for that. Comment above edited to add them.

Condensed summary of the Texas Instruments speech chips: the Speak and Spell chip was bug-fixed to make the TMS5200, as used in the TI-99/4A speech synth. Up until then, the speech chips were for use in TI products only. The TMS5200 was in turn bug-fixed to make the TMS5220, which was made commercially available and used in lots of non-TI products, including the Acorn speech synth.

The main difference between the TMS5200 and the TMS5220 are the coefficient lookup tables. Talkie ships with TMS5220 tables, but using the coefficients above will emulate the TMS5200.

Vocab_US_TI99 was designed to be played on a TMS5200 (the TI-99/4A CP1500 speech synth module).
Vocab_UK_Acorn was designed to be played on the TMS5220 (the Acorn Speech synthesiser add-on).
Vocab_Soundbites was designed to be played on the TMS5220 (probably encoded with QBOX Pro, which ships with TMS5220 coefficients only).
The other vocabularies are random speech ROMs off the 'net, with unknown heritage. I speculate they were also encoded for the TMS5220.

As for the coefficient encoding, here are some worked examples.

The TMS5200 coefficient block starts at line 331 of http://mamedev.org/source/src/emu/sound/tms5110r.c.html. Energy and pitch are hex versions of the arrays at lines 340 and 343. K1 coefficients start at line 353.
The first number is -501.
Convert to binary: -1 1111 0101
Shift left 6 bits: -111 1101 0100 0000
For negative numbers, convert to 2's complement form: 1 0000 0000 0000 0000 - 111 1101 0100 0000 = 1000 0010 1100 0000
Convert to hex: 0x82c0

K2 coefficients are done the same way.
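
The K1/K2 steps above can be sketched in C++ roughly as follows. This only illustrates the arithmetic and is not code from Talkie or MAME: `encodeK16`, `asHex16` and `checkK16Example` are hypothetical names, and the sketch assumes the MAME values are signed 10-bit integers, so the scaled result always fits in 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Talkie or MAME): scale a signed
// 10-bit coefficient by 2^6 so it fills a 16-bit two's complement word.
// Multiplying by 64 is equivalent to shifting left 6 bits, but stays
// well defined for negative inputs on older C++ standards.
constexpr int16_t encodeK16(int16_t mameValue) {
    return static_cast<int16_t>(mameValue * 64);
}

// Reinterpret the signed result as the raw bit pattern written in the tables.
constexpr uint16_t asHex16(int16_t value) {
    return static_cast<uint16_t>(value);
}

// Checks the worked example from the post: -501 encodes to 0x82c0.
void checkK16Example() {
    assert(asHex16(encodeK16(-501)) == 0x82c0);
}
```

Positive values need no two's complement step; for instance, 13 encodes directly to 0x0340.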

K3-K10 coefficients are done slightly differently as they are encoded in 8 bits. Taking the first K3 coefficient (look at line 363 of the MAME source):
The first number is -407.
Convert to binary: -1 1001 0111
Shift right 2 bits: -110 0101.11 (note the binary point), then round to the nearest whole number: -110 0110
For negative numbers, convert to 2's complement form: 1 0000 0000 - 110 0110 = 1001 1010
Convert to hex: 0x9a
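
The 8-bit case can be sketched the same way. Again, this is an illustration rather than Talkie or MAME code: `encodeK8`, `asHex8` and `checkK8Example` are hypothetical names. The post only says "round to nearest", so rounding exact halves away from zero is an assumption here, as is the input staying small enough for the result to fit in 8 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Talkie or MAME): divide a signed
// 10-bit coefficient by 4 (a right shift by 2 bits) and round to the
// nearest integer. Exact halves are rounded away from zero, which is an
// assumption. The caller must ensure the result fits in -128..127.
constexpr int8_t encodeK8(int16_t mameValue) {
    return static_cast<int8_t>(
        (mameValue >= 0 ? mameValue + 2 : mameValue - 2) / 4);
}

// Reinterpret the signed result as the raw byte written in the tables.
constexpr uint8_t asHex8(int8_t value) {
    return static_cast<uint8_t>(value);
}

// Checks the worked example from the post: -407 encodes to 0x9a.
void checkK8Example() {
    assert(asHex8(encodeK8(-407)) == 0x9a);
}
```

Integer division in C++ truncates toward zero, which is why the bias of 2 is added for positive inputs and subtracted for negative ones before dividing.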

That's a bit laborious, so I used a spreadsheet to do the maths. The nice thing about using coefficients encoded in this way is you can easily simulate multiplying by a fractional number by an integer multiply followed by a binary shift. That, plus the hardware multiplier built into the AVR chips, is why Talkie is fast enough to work in real time.
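
As a rough sketch of that multiply-then-shift idea (not Talkie's actual filter code): `mulQ15`, `mulQ7` and `checkFixedPointMultiply` are hypothetical names, and the scale factors assume the MAME values represent fractions of 512, so the 16-bit tables are scaled by 2^15 and the 8-bit tables by 2^7.

```cpp
#include <cassert>
#include <cstdint>

// Multiply a sample by a 16-bit coefficient scaled by 2^15 (K1/K2 style).
// The widening cast keeps the intermediate product from overflowing.
// Note: >> on a negative value is an arithmetic shift on AVR GCC and
// other mainstream compilers, but was implementation-defined before C++20.
inline int16_t mulQ15(int16_t coeff, int16_t sample) {
    return static_cast<int16_t>((static_cast<int32_t>(coeff) * sample) >> 15);
}

// Same idea for an 8-bit coefficient scaled by 2^7 (K3..K10 style).
inline int16_t mulQ7(int8_t coeff, int16_t sample) {
    return static_cast<int16_t>((static_cast<int32_t>(coeff) * sample) >> 7);
}

// A few exact cases: +/-0.5 in each format times a round sample value.
void checkFixedPointMultiply() {
    assert(mulQ15(0x4000, 1000) == 500);   //  0.5 * 1000
    assert(mulQ15(-0x4000, 1000) == -500); // -0.5 * 1000
    assert(mulQ7(0x40, 100) == 50);        //  0.5 * 100
}
```

The 8-bit form trades two bits of coefficient precision for a cheaper multiply, which matches the comment in the table listing above.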

Does that clarify matters?

(And thanks for the appreciation - it's very satisfying to know others are having as much fun as I am from projects like this!)

jremington commented 10 years ago

Thanks for the explanation on the coefficients. The effect of the different coefficients makes perfect sense now! Of all the examples, the TI-99 vocabulary seems to be the only one that needs the TMS5200 coefficient set.

Finally, for a 20 MHz clock speed, I had to change the delay constant in subroutine Say in "talkie.cpp" from 25 to 10 to get the best voice cadence. I think most of the other 15 ms (out of the 25 ms frame period) is spent in the interrupt routine, calculating the new PWM value.

going-digital commented 10 years ago

Yes. The ROMs are designed for an 8kHz audio sample rate, and for that time constant I've assumed a 16MHz clock in the code. By the way, if you're wanting speech on an 8MHz clock device (or you need more free CPU time during speech), have a look at my st2_talkie project. I've trimmed back some of the calculation precision, but it runs much faster.
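
To make the clock dependence concrete, here is a small sketch of the per-sample cycle budget at the clock speeds mentioned in this thread. It is not how Talkie configures its timers: all names are hypothetical, a timer prescaler of 1 is assumed, and treating the 25 ms delay constant as the frame length is also an assumption.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kSampleRateHz = 8000; // sample rate the ROMs are designed for
constexpr uint32_t kFrameMs = 25;        // assumed frame length (the delay constant at 16 MHz)

// CPU clock cycles available per audio sample (prescaler of 1 assumed).
constexpr uint32_t cyclesPerSample(uint32_t cpuHz) {
    return cpuHz / kSampleRateHz;
}

// Audio samples played during one frame.
constexpr uint32_t samplesPerFrame() {
    return kSampleRateHz * kFrameMs / 1000;
}

// Cycle budgets for the three clock speeds discussed in the thread.
void checkTimingBudget() {
    assert(cyclesPerSample(16000000UL) == 2000);
    assert(cyclesPerSample(20000000UL) == 2500);
    assert(cyclesPerSample(8000000UL) == 1000);
    assert(samplesPerFrame() == 200);
}
```

Halving the clock halves the cycles available for each sample, which is why an 8MHz device benefits from cheaper per-sample arithmetic.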