barronalex / Tacotron

Implementation of Google's Tacotron in TensorFlow
Another reason to go multi speaker now #18

Open GunpowderGuy opened 7 years ago

GunpowderGuy commented 7 years ago

https://voice.mozilla.org/

matanox commented 7 years ago

A contrarian view:

I humbly think that just improving the audio quality, as Baidu reports in their Deep Voice 2 article, would be more useful than adding multiple speakers. The voice quality reported for their multi-speaker architecture is very low (a mean opinion score of around 2.5 out of 5, aggregated from Amazon Mechanical Turk raters), whereas the improvement they report over the original Tacotron voice quality is substantial!

You can see this starting from Table 1 in https://arxiv.org/pdf/1705.08947.pdf

The article describes their modification to the original Tacotron, which reportedly makes the difference.