Thanks @patrickvonplaten for this repo, it really helped a lot!
Just a question: what is the best kind of language model for CTC decoding, character-level or word-level? I assumed a character-level model would be the natural choice, since wav2vec decodes characters. However, the common practice seems to be to use a word-level one; I have noticed this in many repos and posts. Please correct me if I am wrong. If that is right, could you please elaborate on why word-level language models are preferred over character-level ones?