viveksck / simplicity

Code and Data for Simple Models for Word Formation in English Slang

About IPA alphabet #1

Open loretoparisi opened 6 years ago

loretoparisi commented 6 years ago

Thanks a lot for this amazing and inspiring work. I'm currently working on a Tensor2Tensor-like LSTM encoder/decoder G2P model, but using a CMU-to-IPA dictionary/alphabet. In your model you use the standard CMU/ARPAbet transcriptions — what about using IPA instead? See https://github.com/loretoparisi/docker/tree/master/g2p-seq2seq
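For anyone following along: the difference between the two alphabets is mostly a symbol mapping, plus the fact that CMUdict encodes stress as digits (0/1/2) on vowels while IPA uses the marks ˈ and ˌ. A minimal sketch of that conversion — the phone table here is an illustrative subset, not a complete CMU-to-IPA dictionary, and real IPA places stress marks at syllable onsets rather than directly before the vowel:

```python
# Illustrative ARPAbet-to-IPA conversion (subset of the phone inventory only).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "EH": "ɛ",
    "ER": "ɝ", "IH": "ɪ", "IY": "i", "UW": "u",
    "B": "b", "D": "d", "K": "k", "L": "l", "M": "m",
    "N": "n", "R": "ɹ", "S": "s", "T": "t", "CH": "tʃ",
}

def arpabet_to_ipa(phones):
    """Convert a CMUdict pronunciation (list of ARPAbet symbols, with
    vowels carrying stress digits 0/1/2) into an IPA string.

    Simplification: the stress mark is placed directly before the
    stressed vowel, not at the syllable onset as proper IPA would.
    """
    out = []
    for p in phones:
        stress = ""
        if p and p[-1].isdigit():
            if p[-1] == "1":      # primary stress
                stress = "ˈ"
            elif p[-1] == "2":    # secondary stress
                stress = "ˌ"
            p = p[:-1]            # strip the stress digit
        out.append(stress + ARPABET_TO_IPA.get(p, p))
    return "".join(out)

print(arpabet_to_ipa(["K", "AE1", "T"]))  # "cat" -> kˈæt
```

This is also why IPA could help the models: the stress information survives in the target alphabet instead of being dropped when the digits are stripped.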

viveksck commented 6 years ago

Thanks for letting me know! When we were writing the paper, we used a model pre-trained on the standard CMU dataset, mainly due to time constraints. Indeed, we note in the paper that using IPA might improve the models, since stress is accounted for. We will look at the IPA-based models you pointed out as well. Thanks!

loretoparisi commented 6 years ago

@viveksck you're welcome! I'll let you know the results of the model training. We are now working with the Tensor2Tensor Seq2Seq architecture by CMU and the CMU-to-IPA dictionary.