andabi / deep-voice-conversion

Deep neural networks for voice conversion (voice style transfer) in Tensorflow
MIT License

May I ask you a question? #6

TomGledhill closed this issue 6 years ago

TomGledhill commented 6 years ago

Hello. Are intonation, loudness of speech, etc. taken into account during conversion? If you turn the input statement into a question by changing only the intonation, will the output change as well?

Thank you.

andabi commented 6 years ago

@day18s Hi. The source speaker's intonation and loudness will be lost, because at present the only link between the source and target speech is the phonemes. I consider capturing the source speaker's intonation when synthesizing the target speaker's speech to be an advanced topic. Any ideas?
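
To make the point above concrete, here is a minimal toy sketch, not code from this repository. It assumes each frame of speech can be described by a phoneme label, a pitch value, and an energy value. `Frame`, `to_phoneme_sequence`, and all the numbers are hypothetical names and values chosen for illustration. If only the phoneme labels are passed on to the synthesis stage, a statement and a question made of the same words become indistinguishable.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Hypothetical per-frame description of speech (illustration only)."""
    phoneme: str   # phoneme label for this frame
    f0_hz: float   # fundamental frequency, i.e. what carries intonation
    energy: float  # rough stand-in for loudness


def to_phoneme_sequence(frames: List[Frame]) -> List[str]:
    """Assumed bottleneck: keep the phoneme labels, drop pitch and energy."""
    return [frame.phoneme for frame in frames]


# The same words spoken twice: once with falling pitch (statement),
# once with rising pitch and more energy (question). Values are made up.
phonemes = ["y", "uw", "k", "ey", "m"]
statement = [Frame(p, f0, 0.5) for p, f0 in zip(phonemes, [180, 170, 160, 150, 140])]
question = [Frame(p, f0, 0.8) for p, f0 in zip(phonemes, [140, 150, 160, 180, 220])]

# The two utterances differ at the frame level...
assert statement != question
# ...but are identical once only the phonemes are kept.
assert to_phoneme_sequence(statement) == to_phoneme_sequence(question)
```

Under this assumption, anything downstream of the phoneme sequence has no way to recover the rising pitch of the question, which is why the output would not change.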

TomGledhill commented 6 years ago

Hello @andabi. Thanks for your reply! As far as I understand, the intonation information is still available at the MFCC stage, isn't it? I'm not sure whether it is possible, or whether it would make any sense, to convert the MFCCs.