Open pauljskim opened 6 years ago
For the front-end: you can use any front-end that can produce HTK-style phone- or state-level labels. Just use that front-end instead of Festival to generate them. For the vocoder: change the Merlin configuration file for the acoustic model so that it outputs the appropriate streams for your vocoder of choice, and update the parameter dimensions to match. Then extract the parameters using that vocoder.
The language processor is separate from Merlin, so you can use any TTS system to produce your HTS-style label files.
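If you switch front-ends, it may help to sanity-check the label files it writes before training. Here is a minimal sketch in Python, not part of Merlin. It assumes each aligned line has the form `start end label`, with times in HTK's 100 ns units, and that state-level labels carry a trailing `[k]` state index. The names `LabelLine` and `parse_label_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class LabelLine(NamedTuple):
    start: int            # start time in 100 ns units
    end: int              # end time in 100 ns units
    label: str            # context label with any state suffix removed
    state: Optional[int]  # state index for state-level labels, else None


# Assumption: state-level labels end in "[k]", e.g. "...[2]".
_STATE_SUFFIX = re.compile(r"\[(\d+)\]$")


def parse_label_line(line: str) -> LabelLine:
    """Parse one aligned label line of the form 'start end label'."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'start end label', got: {line!r}")
    start, end, label = int(parts[0]), int(parts[1]), parts[2]
    if end < start:
        raise ValueError(f"end time precedes start time: {line!r}")
    match = _STATE_SUFFIX.search(label)
    state = int(match.group(1)) if match else None
    if match:
        label = label[:match.start()]
    return LabelLine(start, end, label, state)
```

Running this over every line of a label file from the new front-end is a cheap way to catch malformed timestamps or a phone-level file where a state-level one was expected.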
To use a different vocoder, you can add it to merlin/tools, add a feature extraction script to merlin/misc/scripts/vocoder, and add a section in configuration/configuration.py to define the features and dimensions. For instance, @bajibabu has added Helsinki's GlottHMM vocoder, and I have tested it in the past. See his repo: https://github.com/bajibabu/merlin
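To make the "features and dimensions" part concrete, here is a rough Python sketch that generates an INI-style fragment listing each stream with its static dimension and a delta dimension. Everything in it is an assumption for illustration: the `[Outputs]` section name, the `d`-prefixed delta entries, the 3x factor (static + delta + delta-delta), and the stream names `spec` and `f0` of a hypothetical vocoder. Check the existing entries in configuration/configuration.py and your acoustic model's config file for the names Merlin actually expects.

```python
import configparser
import io


def build_outputs_section(static_dims: dict) -> str:
    """Return an INI fragment with static and delta dimensions per stream."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep stream names exactly as given
    config["Outputs"] = {}    # assumed section name
    for stream, dim in static_dims.items():
        if dim <= 0:
            raise ValueError(f"dimension for {stream!r} must be positive")
        config["Outputs"][stream] = str(dim)
        # Assumption: delta entry is named "d<stream>" and is 3x the static size.
        config["Outputs"]["d" + stream] = str(3 * dim)
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical vocoder with a 40-dim spectral stream and 1-dim f0.
    print(build_outputs_section({"spec": 40, "f0": 1}))
```

The dimensions you put here have to match what your feature extraction script actually writes, otherwise the acoustic model's input/output sizes will not line up.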
https://github.com/bajibabu/merlin is a bit detached from current master, but @bajibabu just added a pull request with the same functionality. Check https://github.com/CSTR-Edinburgh/merlin/pull/233. There's support for GlottHMM and also for @gillesdegottex's excellent https://github.com/gillesdegottex/pulsemodel
Also, I'd like to make the distinction that the GlottHMM vocoder is mostly Aalto University ;) (same city, different Uni)
You are right, I didn't say University of Helsinki, just Helsinki, but I did mean to say Aalto University. Keep up the good work!
I would like to change the language processor and vocoder in Merlin.
However, I do not know which part of the code to change.
I would appreciate your advice.
Thank you.