why can use Phoneme as input for the pretrain model?

NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

BSD 3-Clause "New" or "Revised" License


why can use Phoneme as input for the pretrain model? #329

Closed ArtemisZGL closed 4 years ago

ArtemisZGL commented 4 years ago

I noticed that the symbol set in the original code includes phonemes. When I use phonemes as input for inference with the published checkpoint, it works well. But when I train the model myself, phoneme input does not seem to work. I also can't find an English grapheme-to-phoneme conversion in the code. So I guess the training data may contain both phonemes and graphemes as text? Did I miss something?

AppalachianWine commented 4 years ago

This model uses character embeddings.
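To illustrate how a single embedding table can serve both kinds of input, here is a minimal sketch of a symbol table that holds characters and phoneme tokens side by side. It is modeled on the convention, as far as I understand the repo's text front end, of writing ARPAbet phonemes inside curly braces and storing them with an `@` prefix. The symbol inventory below is a deliberately tiny, hypothetical subset, and the helper names are illustrative rather than the repo's actual code.

```python
import re

# Hypothetical, abbreviated inventory; the real symbol list is much longer.
_pad = "_"
_punctuation = list(" !',.?")
_letters = list("abcdefghijklmnopqrstuvwxyz")
# Phoneme tokens carry an "@" prefix so "@S" never collides with the letter "s".
_arpabet = ["@" + p for p in ["HH", "AW1", "S", "T", "AH0", "N"]]

symbols = [_pad] + _punctuation + _letters + _arpabet
symbol_to_id = {s: i for i, s in enumerate(symbols)}

# Matches: text before a brace, the brace contents, and the remaining text.
_curly_re = re.compile(r"(.*?)\{(.+?)\}(.*)")


def _chars_to_ids(text):
    # Lowercase and silently drop characters missing from the toy inventory.
    return [symbol_to_id[c] for c in text.lower() if c in symbol_to_id]


def _arpabet_to_ids(phones):
    # Phonemes are whitespace-separated inside the braces.
    return [symbol_to_id["@" + p] for p in phones.split() if "@" + p in symbol_to_id]


def text_to_sequence(text):
    """Convert text with optional {ARPAbet} spans into a list of symbol IDs."""
    sequence = []
    while text:
        m = _curly_re.match(text)
        if not m:
            sequence += _chars_to_ids(text)
            break
        sequence += _chars_to_ids(m.group(1))
        sequence += _arpabet_to_ids(m.group(2))
        text = m.group(3)
    return sequence


ids = text_to_sequence("left on {HH AW1 S T AH0 N}")
```

Both kinds of token end up as integer IDs into the same table, so the embedding layer itself does not distinguish them. Whether the phoneme rows produce useful output depends on whether they were ever seen during training.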

rafaelvalle commented 4 years ago

@ArtemisZGL You can use the grapheme-to-phoneme pipeline in Flowtron's repo. The Tacotron 2 we shared was trained on both graphemes and phonemes, hence the ability to perform inference with phonemes.
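One way to get training text that contains both graphemes and phonemes is to replace each word with its pronunciation at random. The sketch below assumes that approach; it is not the code used to train the published checkpoint. `TOY_LEXICON`, `mix_graphemes_and_phonemes`, and `p_phoneme` are hypothetical names, and a real setup would load a full pronouncing dictionary such as CMUdict instead of three hand-written entries.

```python
import random
import re

# Hypothetical three-word lexicon standing in for a full pronouncing dictionary.
TOY_LEXICON = {
    "turn": "T ER1 N",
    "left": "L EH1 F T",
    "street": "S T R IY1 T",
}

_word_re = re.compile(r"[A-Za-z']+")


def mix_graphemes_and_phonemes(text, lexicon, p_phoneme, rng):
    """Replace each known word with a {ARPAbet} span with probability p_phoneme.

    Words missing from the lexicon, and all punctuation, are left as graphemes.
    """
    def replace(match):
        word = match.group(0)
        phones = lexicon.get(word.lower())
        if phones is not None and rng.random() < p_phoneme:
            return "{" + phones + "}"
        return word

    return _word_re.sub(replace, text)


rng = random.Random(0)
sentence = "Turn left on Main Street."
all_graphemes = mix_graphemes_and_phonemes(sentence, TOY_LEXICON, 0.0, rng)
all_phonemes = mix_graphemes_and_phonemes(sentence, TOY_LEXICON, 1.0, rng)
mixed = mix_graphemes_and_phonemes(sentence, TOY_LEXICON, 0.5, rng)
```

With `p_phoneme=0.0` the text is unchanged, and with `p_phoneme=1.0` every word found in the lexicon is converted. Values in between give a model exposure to both representations, which would be consistent with the phoneme-only training failing to generalize in the way described above.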