Grapheme-to-Phoneme (G2P) conversion using attention-based encoder-decoder models
We used the following datasets provided by Stanley Chen (stanchen@us.ibm.com):
Note: For CMUDict, consider using the newer version available at https://raw.githubusercontent.com/cmusphinx/cmudict/master/cmudict.dict
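If you use the newer `cmudict.dict` file, each line holds a word followed by its phonemes, alternate pronunciations carry a suffix such as `(2)`, and some lines end with a `#` comment. The sketch below shows one way to turn such a line into a grapheme sequence and a phoneme sequence. The function name `parse_cmudict_line` is hypothetical and not part of this repository, and the input format that `data_utils.py` expects is not described here, so treat this as a starting point rather than a drop-in preprocessing step.

```python
import re

# Matches a trailing variant marker such as "(2)" on alternate pronunciations.
_VARIANT = re.compile(r"\(\d+\)$")


def parse_cmudict_line(line):
    """Split one cmudict.dict line into (graphemes, phonemes).

    Hypothetical helper, not part of this repository.
    Returns None for lines that are empty once comments are removed.
    """
    # Assumption: trailing comments are introduced by " #".
    line = line.split(" #", 1)[0].strip()
    if not line:
        return None
    word, *phonemes = line.split()
    # Drop the variant marker so "a(2)" and "a" share the same spelling.
    word = _VARIANT.sub("", word)
    return list(word), phonemes


if __name__ == "__main__":
    print(parse_cmudict_line("abalone AE2 B AH0 L OW1 N IY0"))
```

Stress digits (the `0`, `1`, `2` on vowels) are kept as they appear in the file. Whether to strip them depends on the phoneme inventory you want the model to predict.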
python data_utils.py -data_dir DATA_DIR [-{train,dev,test}_file] {TRAIN,DEV,TEST}_FILE
python g2p.py -data_dir DATA_DIR -tb_dir BASE_MODEL_DIR [-eval]
Jointly learning to align and convert graphemes to phonemes with neural attention models by Shubham Toshniwal and Karen Livescu.
Here's the [BIBTEX] entry for ease of citation.