cmusphinx / g2p-seq2seq

G2P with Tensorflow

Unexpectedly long outputs #167

Open joshhansen opened 5 years ago

joshhansen commented 5 years ago

I keep finding that g2p-seq2seq generates strangely long pronunciations when run with the included model. Among all letter sequences up to three letters long, the following produce strange outputs:

ysl output: IY EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH

ybr output: IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B

xsn output: EH K S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH

wsq output: D AH B AH L Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S

wsk output: D AH B AH L Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K Y UW EH S K EY

wjr output: W JH UW JH AH B AH L Y UW Y UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW JH UW

vsl output: V IY EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S

ssc output: EH S EH S S S S S S S S S S S S S S S S S S S S S S IY

qsn output: K Y UW EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH

qrk output: K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R K Y T IH K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R K Y UW AA R K Y UW AA R K Y UW AA R K Y UW EH R K Y UW AA R K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R K Y UW EH R

nqn output: EH N D IY EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N D IY EH N Y UW EH N D IY EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N Y UW EH N D IY EH N D IY EH N Y UW EH N D IY EH N Y UW EH N Y UW EH N D IY EH N Y UW EH N D IY

lqr output: EH L K Y UW EH L Y UW EH L K Y UW EH L K Y UW EH L K Y UW EH L K Y UW EH L K Y UW EH L K Y UW EH L K Y UW EH L K AA R

These inputs are all fairly arbitrary, but actual words get such results too:

uncleanness output: AH N K L IY N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N K L IY S

micrometeorological output: M AY K R OW M IY T AO R AA L AO JH IH K AH L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA JH IH K AH L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AA L AH L AA L AA JH IH K AH L AH JH IH K AH L AA L AA L AA L AA L AA L

quadrituberculate output: K W AA D R AH T UW B ER K Y UW B ER K Y UW B ER K Y UW B ER K Y UW B ER K Y UW B ER K Y UW B ER K Y UW AE T

unexceptionableness output: AH N IH K S EH P SH AH N AH B AH L IY N AH L N AH L N AH L N AH L AH L AH L N AH L N AH B AH L AH L AH L AH L AH S

The recurring theme seems to be that, for whatever reason, decoding gets stuck in a loop for a long time on these words.

These cases are pretty rare, but they are so egregiously bad that they make me wonder whether there is a bug somewhere. If not, guidance on how to train a model that avoids these issues would be appreciated.
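One way to flag outputs like the ones above is to search the phoneme sequence for a short unit that repeats many times in a row. The following is a minimal sketch, not part of g2p-seq2seq: the function name `find_loop` and the thresholds `max_period` and `min_repeats` are hypothetical choices made for illustration.

```python
def find_loop(phonemes, max_period=6, min_repeats=4):
    """Find the longest run of a consecutively repeated phoneme n-gram.

    Returns (start, period, repeats), or None when no unit of length
    1..max_period repeats at least min_repeats times in a row.
    The thresholds are illustrative assumptions, not tuned values.
    """
    best = None
    n = len(phonemes)
    for period in range(1, max_period + 1):
        # Only starts that leave room for min_repeats copies are useful.
        for start in range(n - period * min_repeats + 1):
            unit = phonemes[start:start + period]
            repeats = 1
            pos = start + period
            # Count how many times the unit repeats back to back.
            while phonemes[pos:pos + period] == unit:
                repeats += 1
                pos += period
            covered = repeats * period
            if repeats >= min_repeats and (
                best is None or covered > best[1] * best[2]
            ):
                best = (start, period, repeats)
    return best


if __name__ == "__main__":
    looping = ["IY"] + ["EH", "S"] * 10  # shortened version of the "ysl" case
    print(find_loop(looping))            # a loop is reported
    print(find_loop("K AE T".split()))   # no loop is reported
```

A check like this only detects the symptom after decoding; it does not explain why the model loops.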

nshmyrev commented 5 years ago

It depends on the tensor2tensor version; they break it every month.

vijay120 commented 4 years ago

I am facing a similar issue:

> kittipeumpoonwong
S IH T IY P IY AH M P UW N W AO N W AO N W AO N W AO N W AO N W AO N W AO N W AO N W AO N W AO N W AO NG

Is this a model issue or a bug in the decoder code?

I followed up on the suggestion that this might be due to the tensor2tensor library, but I get the same results with both tensor2tensor==1.6.6 and tensor2tensor==1.7.0.

vijay120 commented 4 years ago

@joshhansen I solved this issue by increasing the decoding beam size from 1 to 5:

g2p-seq2seq --decode wordlist.txt --model_dir g2p-seq2seq-model-6.2-cmudict-nostress --return_beams --beam_size 5

ysl IY EH S EH S EH L

ysl IH S AH L

ysl IY EH S EH S EH S EH S EH S EH L

ysl IY EH S EH S EH S EH S EH L

ysl IY EH S EH S EH S EH L

ybr W AY B ER

ybr IH B ER

ybr IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY

ybr IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY

ybr IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY B IY

xsn EH K S EH S EH N

xsn EH K S EH S EH S EH S EH N

xsn EH K S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH N

xsn EH K S EH S EH S EH N

xsn EH K S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH S EH N

wsq D AH B AH L Y UW EH S K Y UW EH S K Y UW

wsq D AH B AH L Y UW EH S K Y UW

wsq D AH B AH L Y UW EH S K Y UW EH S K EY

wsq D AH B AH L Y UW EH S IY

wsq D AH B AH L Y UW EH S K Y UW EH S K

wsk D AH B AH L Y UW EH S K Y UW EH S K EY

wsk D AH B AH Y UW EH S K Y UW EH S K EY

wsk W EH S K

wsk D AH B AH L Y UW EH S K Y UW EH S K Y UW EH S K EY

wsk D AH B AH L Y UW EH S K Y UW EH S K Y UW EH S K Y

wjr W ER

wjr W AA R

wjr W AY R

wjr W JH UW N Y ER

wjr D AH B AH L Y UW JH UW JH IY AA R

lqr EH L K Y UW EH S AA R

lqr EH L K Y UW EH R

lqr EH L K Y UW EH L AA R

lqr EH L K Y UW EH L Y ER

lqr EH L K Y UW EH L Y UW AA R
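Several of the beams above still contain long loops, so the caller has to choose among them. Below is a rough sketch of one possible post-processing step. It assumes the output format shown in this thread (one line per beam, the word followed by space-separated phonemes) and assumes the beams are listed in the decoder's preferred order. The helper names `group_beams` and `pick_pronunciation` and the `max_ratio` threshold are hypothetical, not part of g2p-seq2seq.

```python
from collections import defaultdict


def group_beams(lines):
    """Group beam lines of the form 'word PH1 PH2 ...' by word."""
    beams = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines without phonemes
        beams[parts[0]].append(parts[1:])
    return dict(beams)


def pick_pronunciation(word, candidates, max_ratio=3.0):
    """Pick the first candidate that is not implausibly long.

    A candidate is accepted when it has at most max_ratio phonemes per
    input letter (an arbitrary heuristic). If every candidate exceeds
    the limit, fall back to the shortest one.
    """
    limit = max_ratio * len(word)
    for candidate in candidates:
        if len(candidate) <= limit:
            return candidate
    return min(candidates, key=len)


if __name__ == "__main__":
    # The first three "wsk" beams from the output above.
    output = [
        "wsk D AH B AH L Y UW EH S K Y UW EH S K EY",
        "wsk D AH B AH Y UW EH S K Y UW EH S K EY",
        "wsk W EH S K",
    ]
    for word, candidates in group_beams(output).items():
        print(word, " ".join(pick_pronunciation(word, candidates)))
```

A length limit is a blunt filter: it would also reject legitimate spelled-out pronunciations of acronyms, so `max_ratio` would need adjusting for that kind of input.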