zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
MIT License
2.84k stars 538 forks

which language is it for? #9

Open LiYijin opened 7 years ago

LiYijin commented 7 years ago

Sorry, I am new to speech recognition. I saw the note "timit phonemes, it is 62; if timit characters, it is 29". Is that for English? For Chinese, how many phonemes and characters should there be? Thanks a lot!

brianlan commented 7 years ago

@LiYijin It's English.

brianlan commented 7 years ago

@zzw922cn Regarding this, I also have a question. The num_classes for 'cha' is 29, but why is it 29 and not 28? As I understand it, we only have 26 letters + 2 special characters (space and single quote), right? I can see from this line of code that when ind == 28, it is simply skipped.

zzw922cn commented 7 years ago

@LiYijin @brianlan Yes, my code is currently for English, but I plan to support Chinese in the future. For Chinese, you can also get the phonemes of a word. For example, the phoneme sequence of '安宁' is 'aa an1 n ing2'.

LiYijin commented 7 years ago

@brianlan @zzw922cn Thanks to you both. I know we can get the phonemes of a Chinese word, but I am still confused: I assume there is a standard for the number of phonemes and characters. For Chinese, what is the standard number of phonemes, and where can I find it?

mnottheone commented 7 years ago

@brianlan Regarding the 29th character: decoding generally uses one extra character to handle repeated characters. In most places the label set is the character set plus '_'. I suggest reading the decoding section and the pseudocode given here. Hope it helps with your doubt.
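
To make the 28 vs. 29 count concrete, here is a minimal sketch of how the extra '_' class could work during decoding. It assumes a CTC-style setup, which this thread does not state explicitly, and the index order (space, single quote, then a-z, with the extra class last at index 28) is hypothetical and may not match the repository's mapping. The names `CHARS`, `BLANK`, `frames_to_indices`, and `collapse` are made up for illustration.

```python
import string

# Assumed label set: space, single quote, and 26 lowercase letters.
# The ordering is hypothetical; the repository may index them differently.
CHARS = [" ", "'"] + list(string.ascii_lowercase)  # 28 real characters
BLANK = len(CHARS)             # index 28: the extra '_' class
NUM_CLASSES = len(CHARS) + 1   # 28 characters + 1 extra class = 29


def frames_to_indices(frames):
    """Map a per-frame string to class indices, with '_' as the extra class."""
    return [BLANK if c == "_" else CHARS.index(c) for c in frames]


def collapse(indices):
    """Greedy collapse: merge consecutive repeats, then drop the extra class."""
    out = []
    prev = None
    for ind in indices:
        # Emit only when the class changes and it is a real character.
        if ind != prev and ind != BLANK:
            out.append(CHARS[ind])
        prev = ind
    return "".join(out)


if __name__ == "__main__":
    # With '_' between the two l's, the double letter survives.
    print(collapse(frames_to_indices("hhel_lo")))
    # Without it, the repeated l's are merged into one.
    print(collapse(frames_to_indices("hhello")))
```

Under these assumptions, the extra class is what lets a decoder tell a genuinely doubled letter apart from one letter that spans several frames, which would explain why index 28 produces no output character.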