festvox / festival

Festival Speech Synthesis System

Feature hts performance #43

Closed zeehio closed 4 years ago

zeehio commented 4 years ago

This PR is based on #35. Please review that one first.

This patch improves performance with HTS voices in two ways:

a) Instead of loading all HTS voice models on each utterance, it only reloads the models if the voice name has changed (i.e. a different voice is being used).
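The idea in (a) can be sketched roughly as follows. This is a minimal illustration only, not the code in this patch: `VoiceModels`, `ModelCache`, and `load_count` are hypothetical names, and the real model loading is replaced by a counter so the caching behaviour is visible.

```cpp
#include <string>

// Hypothetical stand-in for the loaded HTS voice models (not Festival's real type).
struct VoiceModels {
    std::string voice_name;
    int load_count = 0;  // counts how many times the expensive load ran
};

// Hypothetical cache: reloads only when the requested voice name changes.
class ModelCache {
public:
    VoiceModels& get(const std::string& voice_name) {
        if (!loaded_ || models_.voice_name != voice_name) {
            load(voice_name);
        }
        return models_;
    }

private:
    void load(const std::string& voice_name) {
        // A real implementation would read the voice model files here.
        models_.voice_name = voice_name;
        ++models_.load_count;
        loaded_ = true;
    }

    VoiceModels models_;
    bool loaded_ = false;
};
```

Under this assumption, repeated utterances with the same voice reuse the loaded models, and only a change of voice name triggers a reload.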

b) Disk usage is reduced: Before, speech parameters (mel cepstrum and logf0 coefficients) were written to a temporary file and then deleted. Also, HTS text features were passed to the HTS_Synthesize_Utt function through a text file, and, most importantly, the generated waveform was saved to a temporary file and then reloaded from it, an unnecessary intermediate step.

Now, by default, speech parameters are no longer written to disk, HTS text features are passed as a string, and the generated wave data is passed to the utterance directly.
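As a rough sketch of the in-memory approach described above (assumptions only: `read_labels`, `synthesize`, and `samples_per_label` are hypothetical names, and the synthesizer is a stub that emits silence), the label text is read from a string rather than a file, and the samples are returned in a buffer rather than written to a wave file:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits label text held in memory into non-empty lines,
// the way a reader would otherwise consume a label file line by line.
std::vector<std::string> read_labels(const std::string& label_text) {
    std::istringstream in(label_text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            lines.push_back(line);
        }
    }
    return lines;
}

// Hypothetical synthesizer stub: returns samples in a buffer instead of
// writing a temporary wave file. It emits a fixed number of silent samples
// per label purely for illustration.
std::vector<short> synthesize(const std::string& label_text,
                              std::size_t samples_per_label) {
    std::vector<short> samples;
    for (const std::string& label : read_labels(label_text)) {
        (void)label;  // a real synthesizer would use the label contents
        samples.insert(samples.end(), samples_per_label, static_cast<short>(0));
    }
    return samples;
}
```

The caller can then copy the returned buffer into the utterance's waveform without touching the file system.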

This patch is based on: