Open theabc123 opened 7 years ago
See https://github.com/CSTR-Edinburgh/merlin/blob/a5c0cd9baef50447188b59ffeda9f374678144e6/tools/WORLD/test/analysis.cpp#L311. It seems that the frame period is hard-coded (though it should be easy to make configurable).
@r9y9 Thank you for your answer! I just updated my issue, please take a look.
I think there are a few places that need to be changed, e.g. https://github.com/CSTR-Edinburgh/merlin/blob/b74abe4b54a1c34f6c8cdf4464b159b867affa50/tools/WORLD/test/synth.cpp#L293. I haven't looked into config files yet, though.
I changed the frame rate in the two files analysis.cpp and synth.cpp and recompiled. After that I used copy_synthesis.sh to generate acoustic features and regenerate my wav files successfully (the number of generated frames is also correct). I think the problem is not the vocoder, but a parameter given to the models. I am looking at src/configuration/configuration.py, line 363. I changed the frameshift parameter in my acoustic config file, with no success.
If you change the frameshift for acoustic features, you need to change it for linguistic features as well. Check the label normalisation script.
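To illustrate why the two frame shifts have to agree, here is a rough sketch (not Merlin's actual code). It assumes HTS-style labels whose times are in 100 ns units, and that the number of frames per segment is the segment duration divided by the frame shift. The function names are made up for the example.

```python
# Hypothetical helpers, for illustration only; not part of Merlin.
# Assumption: label times are in 100 ns units (HTS-style), so 1 ms = 10,000 units.
UNITS_PER_MS = 10_000


def frames_in_segment(start, end, frameshift_ms):
    """Number of whole frames a label segment spans at the given frame shift."""
    return (end - start) // (frameshift_ms * UNITS_PER_MS)


def synthesized_duration_ms(n_frames, vocoder_frameshift_ms):
    """Duration of audio the vocoder produces for n_frames frames."""
    return n_frames * vocoder_frameshift_ms


def speed_factor(label_frameshift_ms, vocoder_frameshift_ms):
    """How much faster than intended the speech plays when the shifts differ."""
    return label_frameshift_ms / vocoder_frameshift_ms


# A 1-second segment (10,000,000 units of 100 ns):
label_frames = frames_in_segment(0, 10_000_000, 5)       # labels still at 5 ms
vocoder_frames = frames_in_segment(0, 10_000_000, 4)     # acoustic features at 4 ms
played_ms = synthesized_duration_ms(label_frames, 4)     # vocoder runs at 4 ms
print(label_frames, vocoder_frames, played_ms, speed_factor(5, 4))
```

Under these assumptions, labels processed at 5 ms give fewer frames than acoustic features extracted at 4 ms, and synthesizing those frames at 4 ms per frame yields shorter, faster audio than the original segment.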
Hi ronanki, I changed the frame rate in:
I changed the WORLD vocoder frame rate to work with 4 msec frame intervals, and the resynthesized audio files play at the correct speed. I copied the acoustic features into the models' data directories and retrained the models, but the result is not correct: the synthesized speech is much too fast. I tried setting frameshift : 4 in the config file, but it doesn't work. Any help would be appreciated. Thanks!