it-muslim / kaldi-helpers

Helper scripts to work with Kaldi
MIT License

Phoneme from forced alignment is not in the gmm likes file #1

Closed rguliev closed 5 years ago

rguliev commented 5 years ago

Reported by Yondu Tsai on the kaldi-help Google group.

If a phoneme in the forced-alignment file is missing from the gmm likes file, the following lookup fails with an error:

'ali_prob': prob.lookup(np.arange(prob.shape[0]), ali),
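To illustrate the failure mode, here is a minimal sketch. It assumes `prob` is a pandas DataFrame with one row per frame and one column per phoneme, and that `ali` holds one phoneme label per frame. The helper name `lookup_alignment_probs` and the toy labels are hypothetical and not part of kaldi-helpers. Because `DataFrame.lookup` was removed in pandas 2.0, the sketch uses `Index.get_indexer` instead and reports exactly which labels are missing.

```python
import numpy as np
import pandas as pd


def lookup_alignment_probs(prob: pd.DataFrame, ali) -> np.ndarray:
    """Return prob[i, ali[i]] for every frame i (hypothetical helper).

    Raises KeyError naming every aligned phoneme that has no column in prob.
    """
    ali = pd.Index(ali)
    if len(ali) != len(prob):
        raise ValueError("alignment and likelihood matrix differ in frame count")
    # get_indexer returns -1 for labels that are not among the columns
    col_idx = prob.columns.get_indexer(ali)
    missing = sorted(set(ali[col_idx == -1]))
    if missing:
        raise KeyError(f"phonemes in alignment but not in likes columns: {missing}")
    return prob.to_numpy()[np.arange(len(prob)), col_idx]


# Toy data: 3 frames, 3 phoneme columns (labels are made up for illustration)
prob = pd.DataFrame(
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],
    columns=["h", "E", "l"],
)
ali_prob = lookup_alignment_probs(prob, ["h", "E", "l"])
```

Calling the helper with an alignment that contains a label absent from the columns (for example `"aU"` in the toy data) raises a `KeyError` that lists the missing phoneme, which is easier to debug than a bare lookup error.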

rguliev commented 5 years ago

Hi Yondu,

How did it happen that gmm likes is missing a phoneme? As far as I know, gmm-compute-likes provides likelihoods for each pdf_id, which are then mapped to phonemes. Could you provide more information?

dpny518 commented 5 years ago

Attached are the ali-to-phones and gmm-compute-likes outputs for the audio and the text "hello one two three".

words.txt ali-to-phones.txt gmm-compute-likes.txt

phones.txt

rguliev commented 5 years ago

Thanks. Could you also provide the pdf_id -> phoneme mapping, i.e. the output of the pdf2phone function? It plays an important part in this case.

For convenience, a summary of the files:

dpny518 commented 5 years ago

pdf2symb.txt

rguliev commented 5 years ago

Thanks. So the problem is that a pdf_id does not map to a single phoneme symbol, which makes pdf2symb ambiguous. For example, pdf 1999 is shared by eight symbols:

Transition-state 1212: phone = 'aU_B hmm-state = 0 pdf = 1999
Transition-state 1234: phone = 'aU_E hmm-state = 0 pdf = 1999
Transition-state 1256: phone = 'aU_I hmm-state = 0 pdf = 1999
Transition-state 1278: phone = 'aU_S hmm-state = 0 pdf = 1999
Transition-state 1300: phone = aU_B hmm-state = 0 pdf = 1999
Transition-state 1322: phone = aU_E hmm-state = 0 pdf = 1999
Transition-state 1344: phone = aU_I hmm-state = 0 pdf = 1999
Transition-state 1366: phone = aU_S hmm-state = 0 pdf = 1999
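The ambiguity in the listing above can be checked mechanically. The following is a minimal sketch that assumes every line has exactly the format shown in this thread. The regular expression and the function name `phones_per_pdf` are illustrative and not part of kaldi-helpers.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Transition-state 1212: phone = 'aU_B hmm-state = 0 pdf = 1999
LINE_RE = re.compile(
    r"^Transition-state \d+: phone = (\S+) hmm-state = \d+ pdf = (\d+)$"
)


def phones_per_pdf(lines):
    """Group phone symbols by pdf id; lines in other formats are skipped."""
    mapping = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            mapping[int(match.group(2))].add(match.group(1))
    return dict(mapping)


sample = """\
Transition-state 1212: phone = 'aU_B hmm-state = 0 pdf = 1999
Transition-state 1234: phone = 'aU_E hmm-state = 0 pdf = 1999
Transition-state 1256: phone = 'aU_I hmm-state = 0 pdf = 1999
Transition-state 1278: phone = 'aU_S hmm-state = 0 pdf = 1999
Transition-state 1300: phone = aU_B hmm-state = 0 pdf = 1999
Transition-state 1322: phone = aU_E hmm-state = 0 pdf = 1999
Transition-state 1344: phone = aU_I hmm-state = 0 pdf = 1999
Transition-state 1366: phone = aU_S hmm-state = 0 pdf = 1999
"""

# Keep only pdf ids that are shared by more than one phone symbol
ambiguous = {
    pdf: sorted(phones)
    for pdf, phones in phones_per_pdf(sample.splitlines()).items()
    if len(phones) > 1
}
```

For the sample, `ambiguous` contains the single key 1999 with all eight symbols, so a plain pdf_id -> symbol dictionary cannot represent it without dropping seven of them.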

As a workaround, you can strip the _B, _E, _I, _S suffixes and the ' prefix. Then each pdf_id can be mapped to one phoneme. Here is example code:

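import pandas as pd

# Each line looks like: Transition-state 1212: phone = 'aU_B hmm-state = 0 pdf = 1999
# Split on spaces, field 4 is the phone symbol and field 10 is the pdf id.
df = pd.read_csv("pdf2symb.txt", header=None, sep=" ", usecols=[4, 10])
df.columns = ["phone", "pdf"]  # header=None yields integer column names otherwise
df["parsed_phone"] = (df["phone"]
                      .apply(lambda x: x.split("_")[0])  # Remove suffixes
                      .apply(lambda x: x[1:] if x.startswith("'") else x)  # Remove `'`
                      )
df = df[["pdf", "parsed_phone"]].drop_duplicates()
pdf2symb = dict(zip(df["pdf"].astype(int), df["parsed_phone"]))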
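A slightly more defensive variant of the same workaround is sketched below, without pandas. It assumes the word-position suffixes are exactly _B, _E, _I and _S as listed above, and it strips only a trailing suffix, so a phone name that itself contains an underscore is not truncated. The names `normalize_phone` and `build_pdf2symb` are hypothetical. The builder raises if a pdf id still maps to two different phonemes after normalization, which can happen when pdfs are tied across different phones.

```python
import re

# Only a trailing word-position marker is removed (assumed set: B, E, I, S)
_POSITION_SUFFIX = re.compile(r"_[BEIS]$")


def normalize_phone(symbol: str) -> str:
    """Strip the word-position suffix and a leading apostrophe, if present."""
    symbol = _POSITION_SUFFIX.sub("", symbol)
    return symbol[1:] if symbol.startswith("'") else symbol


def build_pdf2symb(pairs):
    """Build {pdf_id: phoneme} from (pdf_id, phone_symbol) pairs.

    Raises ValueError when one pdf id maps to two different phonemes
    even after normalization.
    """
    result = {}
    for pdf, symbol in pairs:
        phone = normalize_phone(symbol)
        previous = result.setdefault(pdf, phone)
        if previous != phone:
            raise ValueError(f"pdf {pdf} maps to both {previous!r} and {phone!r}")
    return result


pdf2symb_checked = build_pdf2symb([(1999, "'aU_B"), (1999, "aU_E"), (1999, "aU_S")])
```

With the three sample pairs, the result is a single entry mapping 1999 to aU. If the builder raises on real data, the collision has to be resolved some other way, for example by training without position-dependent phones.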
rguliev commented 5 years ago

@yondu22 Any updates here?)

dpny518 commented 5 years ago

I just switched to the same model as yours instead of using mine.

rguliev commented 5 years ago

Which model do you mean? Can I close the issue then? :)

dpny518 commented 5 years ago

Yes, you can close. The fix is to train the model with position-dependent phones set to false.