PyThaiNLP / pythaiasr

Python Thai Automatic Speech Recognition
Apache License 2.0

[TODO] Make LM for ASR #1

Closed wannaphong closed 1 year ago

wannaphong commented 3 years ago

Add a language model (LM) for ASR.

tann9949 commented 2 years ago

Just some useful information here.

@tanamettpk from techcast suggested that pyctcdecode can run a beam search given a sequence of model logits and an n-gram LM (.arpa format).

In my opinion, pyctcdecode should work just fine, but I have seen many official ASR libraries, such as NVIDIA's NeMo, deepspeech2, and others, use Baidu's ctc-warp. So I'm not sure which library should be used for the LM rescoring: pyctcdecode might be easier to integrate, but it is likely to be slower than ctc-warp, which is implemented in C++.
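To make the idea concrete, here is a rough, self-contained sketch of CTC prefix beam search with an optional LM bonus. It is not how pyctcdecode or ctc-warp implement it. The names `ctc_beam_search`, `lm_score`, and `alpha` are made up for illustration, and the LM here scores single characters, whereas a real setup would more likely query an n-gram model loaded from the .arpa file.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, vocab, blank=0, beam_width=8,
                    lm_score=None, alpha=0.5):
    """Decode per-frame log-probabilities with CTC prefix beam search.

    log_probs: list of T frames, each a list of V log-probabilities.
    vocab:     list of V symbols; vocab[blank] is the CTC blank.
    lm_score:  hypothetical callable (prefix_tuple, next_symbol) -> log-prob.
    alpha:     weight of the LM bonus (an assumed knob, not a library default).
    """
    # prefix -> (log P of paths ending in blank, log P ending in non-blank)
    beam = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beam.items():
            for idx, lp in enumerate(frame):
                if idx == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                sym = vocab[idx]
                new_prefix = prefix + (sym,)
                bonus = alpha * lm_score(prefix, sym) if lm_score else 0.0
                if prefix and prefix[-1] == sym:
                    # A repeated symbol only extends the prefix after a blank...
                    b, nb = nxt[new_prefix]
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp + bonus))
                    # ...otherwise it collapses into the same prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logsumexp(nb, p_nb + lp))
                else:
                    b, nb = nxt[new_prefix]
                    nxt[new_prefix] = (
                        b, logsumexp(nb, p_b + lp + bonus, p_nb + lp + bonus))
        # Keep only the beam_width most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: logsumexp(*kv[1]),
                        reverse=True)
        beam = dict(ranked[:beam_width])
    best_prefix, _ = max(beam.items(), key=lambda kv: logsumexp(*kv[1]))
    return "".join(best_prefix)
```

For example, with `vocab = ["_", "a", "b"]` and a single frame of probabilities (0.1, 0.5, 0.4), decoding without an LM picks "a", while passing an `lm_score` that strongly prefers "b" with `alpha=1.0` flips the result to "b". That is the effect the LM is supposed to have on acoustically ambiguous frames.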

All in all, I just want to give you a quick guide to the libraries to look at:

P.S. If you start working with one of these libraries, please reply to this comment. I would like to implement the one you do not choose, so that we can compare whether there is any significant difference.