Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
MIT License

Encoding Target #18

Open MishaimMalik opened 5 years ago

MishaimMalik commented 5 years ago

In Step 0, you mentioned that we can use one of the following options: phoneme/char/subword/word. But when I choose "word" instead of "subword", the encoding isn't recognized. The error is at line 134 of preprocess_librispeech.py, in the function read_text().

Can we apply the same encoding as "subword" to "word"?

Also, for the subword option, the bpe.vocab file is missing (in the case of LibriSpeech). Do we have to generate it ourselves? If yes, how?

jybaek commented 5 years ago

I think you need to modify the read_text function. I modified some of the code to fit the format of my data, and then it worked. For example, I don't have .trans.txt files; every transcript exists as an individual file.
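For illustration, here is a minimal sketch of that kind of modification. It is not the repository's read_text: the helper names read_transcripts, build_word_vocab and encode_words are hypothetical, and the per-utterance layout (one `<utterance_id>.txt` file per utterance) is only an assumption about how such data might be stored. It reads both the LibriSpeech `*.trans.txt` layout and individual transcript files, then applies a simple word-level encoding.

```python
from collections import Counter
from pathlib import Path


def read_transcripts(root):
    """Return {utterance_id: transcript} for every transcript under root.

    Two layouts are handled:
      * LibriSpeech style: `*.trans.txt` files whose lines look like
        `<utterance_id> <transcript>`.
      * Assumed alternative: one `<utterance_id>.txt` file per utterance.
    """
    transcripts = {}
    for path in sorted(Path(root).rglob("*.txt")):
        if path.name.endswith(".trans.txt"):
            for line in path.read_text(encoding="utf-8").splitlines():
                line = line.strip()
                if not line:
                    continue
                # The first token is the utterance id, the rest is the text.
                utt_id, _, text = line.partition(" ")
                transcripts[utt_id] = text
        else:
            # The file name (without extension) serves as the utterance id.
            transcripts[path.stem] = path.read_text(encoding="utf-8").strip()
    return transcripts


def build_word_vocab(texts, specials=("<pad>", "<unk>")):
    """Map each word to an integer id, most frequent words first."""
    counts = Counter(word for text in texts for word in text.split())
    # Ties are broken alphabetically so the mapping is deterministic.
    words = sorted(counts, key=lambda w: (-counts[w], w))
    return {token: idx for idx, token in enumerate(list(specials) + words)}


def encode_words(text, vocab):
    """Encode a transcript as word ids; unseen words map to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(word, unk) for word in text.split()]
```

A typical use would be to call read_transcripts on the dataset directory, build the vocabulary from the training transcripts only, and then encode each transcript with encode_words. The special tokens and their ordering are illustrative choices, not what the project itself uses.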